The host code, ‘host.cpp’, runs on the host processor and contains the code to initialize and run the datamovers and the ADF graph. XRT APIs are used to create the required buffers in device memory.
First, a golden reference image is generated using OpenCV:
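int run_opencv_ref(cv::Mat& srcImageR, cv::Mat& dstRefImage, float coeff[9]) {
    // Wrap the nine coefficients as a 3x3 single-precision kernel (no copy is made)
    cv::Mat kernel = cv::Mat(3, 3, CV_32F, coeff);
    // Same depth as the source (-1), anchor at the kernel center, delta 0, replicated border
    cv::filter2D(srcImageR, dstRefImage, -1, kernel, cv::Point(-1, -1), 0, cv::BORDER_REPLICATE);
    return 0;
}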
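To make the reference computation concrete, the following is a minimal sketch of what the `cv::filter2D` call above computes for a single-channel 16-bit image, written without OpenCV. The function name `filter3x3_replicate` and the flat row-major `std::vector<int16_t>` layout are assumptions made for illustration; they are not part of the host code.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the OpenCV reference: 3x3 correlation with the
// anchor at the kernel center, delta 0, and a replicated border.
std::vector<int16_t> filter3x3_replicate(const std::vector<int16_t>& src,
                                         int rows, int cols,
                                         const float coeff[9]) {
    std::vector<int16_t> dst(src.size());
    for (int r = 0; r < rows; ++r) {
        for (int c = 0; c < cols; ++c) {
            float acc = 0.0f;
            for (int kr = -1; kr <= 1; ++kr) {
                for (int kc = -1; kc <= 1; ++kc) {
                    // Clamping the coordinates repeats the edge pixels.
                    int rr = std::min(std::max(r + kr, 0), rows - 1);
                    int cc = std::min(std::max(c + kc, 0), cols - 1);
                    acc += coeff[(kr + 1) * 3 + (kc + 1)] * src[rr * cols + cc];
                }
            }
            // Round to nearest and saturate to the int16_t range.
            float rounded = std::nearbyint(acc);
            rounded = std::min(std::max(rounded, -32768.0f), 32767.0f);
            dst[r * cols + c] = static_cast<int16_t>(rounded);
        }
    }
    return dst;
}
```

With an identity kernel (only the center coefficient set to 1) the output equals the input, which is a quick sanity check before comparing against the AIE result.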
Then, the xclbin is loaded on the device and the device handles are created:
xF::deviceInit(xclBinName);
Buffers for the input and output data are created using the XRT APIs, and the data from the input cv::Mat is copied to the XRT buffer:
void* srcData = nullptr;
xrtBufferHandle src_hndl = xrtBOAlloc(xF::gpDhdl, (srcImageR.total() * srcImageR.elemSize()), 0, 0);
srcData = xrtBOMap(src_hndl);
memcpy(srcData, srcImageR.data, (srcImageR.total() * srcImageR.elemSize()));
// Allocate output buffer
void* dstData = nullptr;
xrtBufferHandle dst_hndl = xrtBOAlloc(xF::gpDhdl, (op_height * op_width * srcImageR.elemSize()), 0, 0);
dstData = xrtBOMap(dst_hndl);
cv::Mat dst(op_height, op_width, srcImageR.type(), dstData);
The xfcvDataMovers objects tiler and stitcher are created. For more details, refer to the xfcvDataMovers section.
xF::xfcvDataMovers<xF::TILER, int16_t, MAX_TILE_HEIGHT, MAX_TILE_WIDTH, VECTORIZATION_FACTOR> tiler(1, 1);
xF::xfcvDataMovers<xF::STITCHER, int16_t, MAX_TILE_HEIGHT, MAX_TILE_WIDTH, VECTORIZATION_FACTOR> stitcher;
The ADF graph is initialized and the filter coefficients are updated:
filter_graph.init();
filter_graph.update(filter_graph.kernelCoefficients, float2fixed_coeff<10, 16>(kData).data(), 16);
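The coefficient update above passes 16 fixed-point values derived from the nine floating-point coefficients. As a rough illustration of that kind of conversion, the sketch below scales each coefficient by a power of two, rounds it, and zero-pads the result. The helper name `to_fixed_padded`, the reading of the template arguments as "fractional bits" and "output length", and the contiguous layout are all assumptions; the library's `float2fixed_coeff` may arrange the values differently.

```cpp
#include <array>
#include <cmath>
#include <cstddef>
#include <cstdint>

// Hypothetical helper, not the library's float2fixed_coeff: scales each
// coefficient by 2^FRAC_BITS, rounds to nearest, and zero-pads to N entries.
template <int FRAC_BITS, std::size_t N>
std::array<int16_t, N> to_fixed_padded(const float (&coeff)[9]) {
    static_assert(N >= 9, "output must hold all nine coefficients");
    std::array<int16_t, N> out{}; // value-initialized, so the padding is zero
    for (std::size_t i = 0; i < 9; ++i) {
        out[i] = static_cast<int16_t>(std::lround(coeff[i] * (1 << FRAC_BITS)));
    }
    return out;
}
```

Under these assumptions, `to_fixed_padded<10, 16>(k)` maps a coefficient of 0.25 to 256 and leaves entries 9 through 15 at zero.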
Metadata containing the tile information is generated from the image size:
tiler.compute_metadata(srcImageR.size());
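To give a feel for what tile metadata involves, here is a simplified sketch that only counts how many tiles are needed to cover an image. The function `tile_grid` and its parameters are hypothetical; the actual metadata computed by the tiler also describes per-tile positions and any overlap between neighboring tiles, which this sketch ignores.

```cpp
#include <array>

// Hypothetical tile-grid calculation: number of tiles along each dimension
// when the image is covered by non-overlapping tiles of a fixed maximum size.
std::array<int, 2> tile_grid(int img_rows, int img_cols,
                             int tile_rows, int tile_cols) {
    int grid_rows = (img_rows + tile_rows - 1) / tile_rows; // ceiling division
    int grid_cols = (img_cols + tile_cols - 1) / tile_cols; // ceiling division
    return {{grid_rows, grid_cols}};
}
```

For example, a 1080 x 1920 image covered by 16 x 256 tiles gives a 68 x 8 grid under this simplification; the product of the two entries is the total tile count.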
The data transfers to and from the AIE are started with the non-blocking datamover calls, and the graph is run for as many iterations as there are tiles. The host then waits for the graph and both data transfers to complete:
auto tiles_sz = tiler.host2aie_nb(src_hndl, srcImageR.size());
stitcher.aie2host_nb(dst_hndl, dst.size(), tiles_sz);
std::cout << "Graph run(" << (tiles_sz[0] * tiles_sz[1]) << ")\n";
filter_graph.run(tiles_sz[0] * tiles_sz[1]);
filter_graph.wait();
tiler.wait();
stitcher.wait();