If a kernel can accept more data while it is still operating on data from previous
transactions, XRT can send the next batch of data. The kernel then works on multiple
data sets in parallel, each at a different stage of the algorithm, which improves
performance. To support host-to-kernel dataflow, the kernel must implement the
ap_ctrl_chain protocol, using the HLS INTERFACE pragma on the function return:
void kernel_name( int *inputs,
                  ... )          // Other input or output ports
{
#pragma HLS INTERFACE .....      // Other interface pragmas
#pragma HLS INTERFACE ap_ctrl_chain port=return bundle=control
    ...                          // Kernel body
}
Important: To take advantage of host-to-kernel dataflow, the kernel must also be
written to process data in stages, for example pipelined at the loop level as discussed
in Loop Pipelining, or pipelined at the task level as discussed in Dataflow Optimization.
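The two requirements above (the ap_ctrl_chain protocol on the function return, plus a kernel body that works in stages) can be combined as in the following minimal sketch. It is an illustration, not a reference design: the fixed batch size N, the helper stages load, compute, and store, the outputs port, and the arithmetic in compute are all hypothetical, and the port-level interface pragmas that a real kernel would need are left out. A standard C++ compiler ignores the HLS pragmas, so the function can also be checked in plain software.

```cpp
// Hypothetical batch size, chosen only for this sketch.
constexpr int N = 16;

// Stage 1 (hypothetical helper): copy one batch from the input port into a local buffer.
static void load(const int *in, int buf[N]) {
    for (int i = 0; i < N; ++i) {
#pragma HLS PIPELINE II=1
        buf[i] = in[i];
    }
}

// Stage 2 (hypothetical helper): placeholder computation, y = 2*x + 1.
static void compute(const int in[N], int out[N]) {
    for (int i = 0; i < N; ++i) {
#pragma HLS PIPELINE II=1
        out[i] = 2 * in[i] + 1;
    }
}

// Stage 3 (hypothetical helper): copy the results to the output port.
static void store(const int buf[N], int *out) {
    for (int i = 0; i < N; ++i) {
#pragma HLS PIPELINE II=1
        out[i] = buf[i];
    }
}

extern "C" void kernel_name(const int *inputs, int *outputs) {
    // Other interface pragmas for the ports are omitted in this sketch.
#pragma HLS INTERFACE ap_ctrl_chain port=return bundle=control
    // Task-level pipelining: the three stages can overlap across batches.
#pragma HLS DATAFLOW
    int stage_a[N]; // intermediate buffer between load and compute
    int stage_b[N]; // intermediate buffer between compute and store

    load(inputs, stage_a);
    compute(stage_a, stage_b);
    store(stage_b, outputs);
}
```

Each stage reads from one buffer and writes to the next, so once load has finished a batch it is free to start on the following one while compute and store are still busy with earlier data. This is the staged structure that lets the kernel accept new data before the previous transaction has completed.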