Description
The OpenCL 2.0 specification introduces a new memory object called a pipe. A pipe stores data organized as a FIFO. Pipes can be used to stream data from one kernel to another inside the FPGA without using external memory, which improves overall system latency.
Note: printf() is not supported with variables used in pipes.
The depth of a pipe is specified with the xcl_reqd_pipe_depth attribute in the pipe declaration, for example:
pipe int p0 __attribute__((xcl_reqd_pipe_depth(512)));
Pipes can only be accessed using the standard OpenCL read_pipe() and write_pipe() built-in functions in non-blocking mode, or using the Xilinx-extended read_pipe_block() and write_pipe_block() functions in blocking mode.
Pipe objects are not accessible from the host CPU. The status of pipes can be queried using the OpenCL get_pipe_num_packets() and get_pipe_max_packets() built-in functions. For more details on these built-in functions, see The OpenCL C Specification from the Khronos OpenCL Working Group.
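The FIFO behavior, non-blocking access, and status queries described above can be sketched as a small host-side model in plain C. This is only an illustration under stated assumptions: the names model_pipe, model_write_pipe, model_read_pipe, model_num_packets, model_max_packets, and MODEL_DEPTH are hypothetical and are not part of OpenCL or any Xilinx API. The return convention (0 on success, a negative value on failure) follows the one the OpenCL C Specification gives for read_pipe() and write_pipe().

```c
#include <stddef.h>

/* Hypothetical depth, standing in for xcl_reqd_pipe_depth(16). */
#define MODEL_DEPTH 16u

/* Hypothetical host-side model of a pipe: a fixed-depth ring buffer. */
typedef struct {
    int data[MODEL_DEPTH];
    unsigned head;   /* index of the oldest packet */
    unsigned count;  /* number of packets currently stored */
} model_pipe;

/* Non-blocking write: returns 0 on success, -1 if the pipe is full. */
int model_write_pipe(model_pipe *p, const int *src) {
    if (p->count == MODEL_DEPTH) {
        return -1;
    }
    p->data[(p->head + p->count) % MODEL_DEPTH] = *src;
    p->count++;
    return 0;
}

/* Non-blocking read: returns 0 on success, -1 if the pipe is empty. */
int model_read_pipe(model_pipe *p, int *dst) {
    if (p->count == 0) {
        return -1;
    }
    *dst = p->data[p->head];
    p->head = (p->head + 1) % MODEL_DEPTH;
    p->count--;
    return 0;
}

/* Status queries, modeled after get_pipe_num_packets() and
   get_pipe_max_packets(). */
unsigned model_num_packets(const model_pipe *p) { return p->count; }
unsigned model_max_packets(const model_pipe *p) { (void)p; return MODEL_DEPTH; }
```

In this model, a blocking call such as write_pipe_block() corresponds to retrying the non-blocking call until it returns 0; on the FPGA the kernel stalls instead of polling.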
Syntax
This attribute must be assigned at the declaration of the pipe object:
pipe int <id> __attribute__((xcl_reqd_pipe_depth(<n>)));
Where:
- <id>: Specifies an identifier for the pipe, which must consist of lowercase alphanumeric characters. For example, <infifo1> not <inFifo1>.
- <n>: Specifies the depth of the pipe. Valid depth values are 16, 32, 64, 128, 256, 512, 1024, 2048, 4096, 8192, 16384, and 32768.
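The two constraints above can be checked mechanically. The following is a minimal sketch in plain C, assuming hypothetical helper names is_valid_pipe_depth and is_valid_pipe_id (neither is part of any Xilinx tool). It relies on the observation that every listed depth is a power of two between 16 and 32768, and it assumes the default "C" locale for the character classification.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: true if depth is one of the listed values,
   that is, a power of two in the range [16, 32768]. */
bool is_valid_pipe_depth(unsigned depth) {
    return depth >= 16u && depth <= 32768u && (depth & (depth - 1u)) == 0u;
}

/* Hypothetical helper: true if id is non-empty and contains only
   lowercase letters and digits. */
bool is_valid_pipe_id(const char *id) {
    if (id == NULL || *id == '\0') {
        return false;
    }
    for (const char *c = id; *c != '\0'; c++) {
        unsigned char ch = (unsigned char)*c;
        if (!(islower(ch) || isdigit(ch))) {
            return false;
        }
    }
    return true;
}
```

For example, is_valid_pipe_id("infifo1") holds while is_valid_pipe_id("inFifo1") does not, and a depth of 100 is rejected because it is not a power of two.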
Examples
The following is the dataflow_pipes_ocl example from Xilinx GitHub, which uses pipes to pass data from one processing stage to the next using the blocking read_pipe_block() and write_pipe_block() functions:
pipe int p0 __attribute__((xcl_reqd_pipe_depth(32)));
pipe int p1 __attribute__((xcl_reqd_pipe_depth(32)));

// Input Stage Kernel: Read data from global memory and write it into pipe P0
kernel __attribute__ ((reqd_work_group_size(1, 1, 1)))
void input_stage(__global int *input, int size)
{
    __attribute__((xcl_pipeline_loop))
    mem_rd: for (int i = 0; i < size; i++)
    {
        // Blocking write command to pipe P0
        write_pipe_block(p0, &input[i]);
    }
}

// Adder Stage Kernel: Read input data from pipe P0 and write the result
// into pipe P1
kernel __attribute__ ((reqd_work_group_size(1, 1, 1)))
void adder_stage(int inc, int size)
{
    __attribute__((xcl_pipeline_loop))
    execute: for (int i = 0; i < size; i++)
    {
        int input_data, output_data;
        // Blocking read command from pipe P0
        read_pipe_block(p0, &input_data);
        output_data = input_data + inc;
        // Blocking write command to pipe P1
        write_pipe_block(p1, &output_data);
    }
}

// Output Stage Kernel: Read the result from pipe P1 and write it to
// global memory
kernel __attribute__ ((reqd_work_group_size(1, 1, 1)))
void output_stage(__global int *output, int size)
{
    __attribute__((xcl_pipeline_loop))
    mem_wr: for (int i = 0; i < size; i++)
    {
        // Blocking read command from pipe P1
        read_pipe_block(p1, &output[i]);
    }
}
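
Taken together, the three kernels above compute output[i] = input[i] + inc for each of the size elements. A host-side reference for checking results could look like the following sketch in plain C. The function name pipeline_reference is hypothetical and is not part of the dataflow_pipes_ocl example; the sketch collapses the two pipes into local variables and does not model their timing or depth.

```c
#include <stddef.h>

/* Hypothetical host-side reference for the combined effect of
   input_stage, adder_stage, and output_stage. */
void pipeline_reference(const int *input, int *output, int inc, size_t size) {
    for (size_t i = 0; i < size; i++) {
        int from_p0 = input[i];      /* input_stage: global memory -> P0 */
        int to_p1 = from_p0 + inc;   /* adder_stage: P0 -> add inc -> P1 */
        output[i] = to_p1;           /* output_stage: P1 -> global memory */
    }
}
```

Comparing the buffer read back from the device against the output of this function is one way to confirm that the pipes delivered every element in order.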