We’ve already run C Synthesis for Version3, so let’s first investigate and understand the results in the Synthesis Report.
Under C Synthesis, expand Reports and select Synthesis.
This view shows the synthesis results. The first section shows the Timing Estimate: the conservative clock target of 100 MHz (a 10 ns period) was easily met. The second section shows the Performance and Resource Estimates, where we can see the performance and utilization of each function, sub-function, and loop in the code. Two key performance metrics are latency and interval. Latency is the time, reported in clock cycles, from the start of a function or loop until it has finished. Interval (the initiation interval, II) is the minimum number of cycles between successive calls to the module, which makes it the better measure of throughput. The report also provides several utilization metrics, such as BRAM, DSP, FF, and LUT.
During this tutorial, we optimized the function ntt, and the performance and utilization for that function are provided in the fourth row. We can focus on the cycle performance:
The II is 770, which matches the longest II among its sub-functions/processes, namely ntt_stage6(). Its latency is 2114, which is the sum of the latencies of all its sub-functions/processes.
For polyvec_ntt_loop, the loop tripcount is 128 as expected, and each iteration calls the ntt() function. The loop latency is 99905, which comes from the computation 127*770+2114+1. Because the tripcount is 128, the first 127 iterations each start a new call at II=770, and for the last iteration we have to wait the full latency of 2114 cycles to get all the outputs.
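The arithmetic above can be checked with a short sketch. The function names and the `overhead` parameter are hypothetical and not part of any tool API. The single extra cycle is taken directly from the formula in the text, and treating it as fixed loop overhead is an assumption. The conversion to time assumes the design runs exactly at the 100 MHz target.

```python
def pipelined_loop_latency(trip_count: int, ii: int, iteration_latency: int,
                           overhead: int = 1) -> int:
    """Estimate the cycle latency of a loop whose iterations restart every `ii` cycles.

    The first (trip_count - 1) iterations each cost one initiation interval;
    the last iteration costs its full latency. `overhead` is the extra cycle
    from the tutorial's formula (assumed here to be fixed loop overhead).
    """
    if trip_count < 1:
        raise ValueError("trip_count must be at least 1")
    return (trip_count - 1) * ii + iteration_latency + overhead


def cycles_to_microseconds(cycles: int, clock_mhz: float = 100.0) -> float:
    """Convert a cycle count to microseconds, assuming the given clock frequency."""
    return cycles / clock_mhz


# Values reported for polyvec_ntt_loop and ntt() in the synthesis report.
total_cycles = pipelined_loop_latency(trip_count=128, ii=770, iteration_latency=2114)
print(total_cycles)  # 99905
print(cycles_to_microseconds(total_cycles))
```

Writing the estimate this way makes it clear that, for large tripcounts, the II term dominates: reducing the II of ntt() by one cycle saves 127 cycles in the loop, while reducing its latency by one cycle saves only one.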