The peak detector consists of three kernels: a peak_detect kernel and two postprocessing kernels, data_shuffle and upscale. On every iteration, the peak_detect kernel takes a 16-lane vector of int32 input data and uses APIs to compute (1) the maximum of the 16 lanes and (2) an expression on the minimum value. It sends these results over a stream and a buffer, respectively. The stream output is broadcast to the two postprocessing kernels, data_shuffle and upscale.
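As a rough illustration of what peak_detect computes per iteration, the following is a minimal scalar reference sketch in plain C++. It is not the kernel source and does not use the vector APIs. The names `kLanes`, `PeakResult`, and `peak_detect_reference` are hypothetical. The text does not specify the expression applied to the minimum, so the sketch returns only the raw minimum.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <cstdint>

// Vector width stated in the text: 16 lanes of int32 per iteration.
constexpr std::size_t kLanes = 16;

// Hypothetical container for the two per-iteration results.
struct PeakResult {
    std::int32_t max_value;  // the value the kernel would send over the stream
    std::int32_t min_value;  // raw minimum; the kernel applies an unspecified
                             // expression to it before writing the buffer
};

// Scalar reference for one iteration (assumption: a plain reduction
// over the 16 lanes matches the intent of the vectorized kernel).
PeakResult peak_detect_reference(const std::array<std::int32_t, kLanes>& in) {
    // minmax_element returns {iterator to smallest, iterator to largest}.
    const auto mm = std::minmax_element(in.begin(), in.end());
    return PeakResult{*mm.second, *mm.first};
}
```

A scalar model like this can serve as a golden reference when checking the vectorized kernel's outputs in simulation. In the actual graph, the stream value would be delivered to both data_shuffle and upscale.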
The complete design can be viewed in the Vitis Analyzer.