Select the Graph view to examine the design. Select `p_d` to identify the tile as (25,0).

Adjust the trace view to a suitable size with the zoom in or zoom out icons, and move the marker to the end of `peak_detect` or the beginning of `_main`. This point is treated as the beginning of an iteration. A period of lock stall indicates that data is being sent from the PL to the AI Engine tile.

Observe the end of the `peak_detect` kernel on core(25,0) and the start of activity on core(24,0) and core(25,1). In the graph view, you can see that the `peak_detect` kernel sends data to both the `upscale` and `data_shuffle` kernels. The same behavior is visible in the trace view.

You can calculate the execution time of one iteration as follows. Place one marker at the start of the iteration and one at the end. The difference between the two marker readings, (1) - (2), is 262.2 ns, which is approximately 329 cycles. This matches the Function time in the profile data from both the AI Engine simulation and hardware emulation.
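The conversion from marker time to cycles can be sketched in a few lines of Python. This is only an illustration: the helper names are hypothetical, and the 1.25 GHz AI Engine clock is an assumption, since the text above does not state the clock frequency. Under that assumption, 262.2 ns works out to roughly 328 cycles, close to the 329 cycles reported; the small gap may come from marker placement or from a clock that differs from the assumed value.

```python
# Hypothetical helpers for converting trace-view marker deltas to cycles.
# The clock frequency below is an ASSUMPTION; substitute the actual
# AI Engine clock of your platform.

ASSUMED_CLOCK_GHZ = 1.25  # assumed, not taken from the tutorial text


def ns_to_cycles(duration_ns: float, clock_ghz: float) -> float:
    """Convert a duration in nanoseconds to clock cycles.

    A clock of N GHz completes N cycles per nanosecond.
    """
    return duration_ns * clock_ghz


def implied_clock_ghz(duration_ns: float, cycles: float) -> float:
    """Return the clock frequency (GHz) implied by a duration and a cycle count."""
    return cycles / duration_ns


if __name__ == "__main__":
    iteration_ns = 262.2   # marker delta read from the trace view
    reported_cycles = 329  # Function time reported in the profile data

    estimated = ns_to_cycles(iteration_ns, ASSUMED_CLOCK_GHZ)
    print(f"Estimated cycles at {ASSUMED_CLOCK_GHZ} GHz: {estimated:.2f}")
    print(f"Clock implied by the reported numbers: "
          f"{implied_clock_ghz(iteration_ns, reported_cycles):.4f} GHz")
```

Running the reverse calculation is a quick sanity check: if the implied clock is far from the platform's configured AI Engine clock, the markers are probably not placed on the iteration boundaries.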