Now, build the kernel using the Vitis compiler. The Vitis compiler calls the Vitis HLS tool to synthesize the C++ kernel code into an RTL kernel. You will also review the reports to confirm that the kernel meets the latency and throughput requirements of your performance goals.
Use the following command to build the kernel.
cd $LAB_WORK_DIR/makefile; make build STEP=single_buffer PF=4 TARGET=hw_emu
This command calls the v++ compiler, which in turn calls the Vitis HLS tool to translate the C++ code into RTL code that can be used to run hardware emulation.
NOTE: For the purposes of this tutorial, only 100 input words are used, because hardware emulation takes a long time to run.
Then, use the following command to open the HLS Synthesis Report in the Vitis analyzer.
vitis_analyzer ../build/single_buffer/kernel_4/hw_emu/runOnfpga_hw_emu.xclbin.link_summary
The compute_hash_flags latency reported is 875,011 cycles. This is based on a total of 3,500,000 words, computed with 4 words in parallel. The loop has 875,000 iterations and, including the MurmurHash2 latency, the total latency of 875,011 cycles is optimal.
The compute_hash_flags_dataflow function has dataflow enabled in the Pipeline column. This is important to check: it indicates that task-level parallelism is enabled and that the sub-functions of compute_hash_flags_dataflow are expected to overlap in execution.
The latency reported for the read_bloom_filter function is 16,385 cycles for reading the Bloom filter coefficients from DDR using the bloom_filter maxi port. This loop runs for about 16,000 iterations, each reading 32 bits of Bloom filter coefficient data.
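As a rough cross-check of the report figures, the latency of a loop that is pipelined to start one iteration per cycle can be approximated as its iteration count plus a fixed overhead. The sketch below is only an estimate under that assumption. The helper names loop_iterations and pipelined_latency are hypothetical and not part of the tutorial sources, and the 11-cycle overhead is simply the difference between the reported latency and the iteration count.

```cpp
#include <cstdint>

// Number of loop iterations needed to process `total_words` when
// `parallel_factor` words are handled per iteration (rounded up so that
// a partial final group still costs one iteration).
constexpr std::uint64_t loop_iterations(std::uint64_t total_words,
                                        std::uint64_t parallel_factor) {
    return (total_words + parallel_factor - 1) / parallel_factor;
}

// Estimated cycle count for a loop assumed to start one iteration per cycle:
// one cycle per iteration plus a fixed overhead. `overhead_cycles` is an
// assumption here; take the real value from the HLS report.
constexpr std::uint64_t pipelined_latency(std::uint64_t iterations,
                                          std::uint64_t overhead_cycles) {
    return iterations + overhead_cycles;
}

// Report figures: 875,000 iterations plus 11 cycles of overhead.
static_assert(pipelined_latency(875000, 11) == 875011,
              "estimate matches the reported compute_hash_flags latency");

// Hardware emulation input: 100 words at PF=4 gives 25 iterations.
static_assert(loop_iterations(100, 4) == 25,
              "100 words processed 4 at a time");
```

The same helpers can be used to estimate how the iteration count changes with a different parallelization factor, but the actual cycle count should always be read from the synthesis report.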
The HLS reports confirm that the latency of the function meets your target. You still need to ensure the functionality is correct when communicating with the host. In the next section, you will walk through the initial host code and run the software and hardware emulation.