The TP_NUM_FRAMES template parameter can be used to make the kernel operate on multiple frames in a single iteration. When TP_NUM_FRAMES is set to 1, the kernel operates on a single frame of FRAME_SIZE samples (TP_POINT_SIZE zero-padded for alignment). When TP_NUM_FRAMES is greater than 1, the input buffer contains TP_NUM_FRAMES batches of FRAME_SIZE samples for the kernel to operate on. Processing larger buffers reduces the kernel execution overhead per frame and can therefore increase the throughput of the design. On the other hand, processing more data in a single kernel iteration increases latency.
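
The relationship between TP_POINT_SIZE, FRAME_SIZE, and the buffer size can be sketched as follows. This is a minimal illustration, not the library's implementation: the helper names (frameSize, bufferSamples), the cint16_sketch type, and in particular the 32-byte alignment are assumptions chosen for the example; the actual padding rule should be taken from the library documentation.

```cpp
#include <cstddef>
#include <cstdint>

// ASSUMPTION: frames are padded to a 32-byte (256-bit) boundary.
// This value is illustrative only and is not taken from the library.
constexpr std::size_t kAssumedAlignBytes = 32;

// Hypothetical helper: FRAME_SIZE is the point size rounded up to the next
// multiple of the number of samples that fit in one aligned vector.
template <typename SampleT>
constexpr std::size_t frameSize(std::size_t pointSize) {
    constexpr std::size_t samplesPerVector = kAssumedAlignBytes / sizeof(SampleT);
    return ((pointSize + samplesPerVector - 1) / samplesPerVector) * samplesPerVector;
}

// Hypothetical helper: the input buffer holds numFrames batches of
// FRAME_SIZE samples each.
template <typename SampleT>
constexpr std::size_t bufferSamples(std::size_t pointSize, std::size_t numFrames) {
    return numFrames * frameSize<SampleT>(pointSize);
}

// Stand-in for a complex 16-bit sample type (4 bytes per sample).
struct cint16_sketch {
    std::int16_t real;
    std::int16_t imag;
};

// With 4-byte samples, 8 samples fit in 32 bytes, so a 60-point frame is
// zero-padded to 64 samples.
static_assert(frameSize<cint16_sketch>(60) == 64, "60 points pad to 64 samples");
// One frame per iteration: the buffer is a single padded frame.
static_assert(bufferSamples<cint16_sketch>(60, 1) == 64, "single frame");
// Four frames per iteration: the buffer is four padded frames back to back.
static_assert(bufferSamples<cint16_sketch>(60, 4) == 256, "four frames");
```

Under these assumptions, raising the number of frames from 1 to 4 quadruples the data consumed per kernel iteration, which is the source of both the lower per-frame overhead and the higher latency described above.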