Accessing the global memory bank interface from the kernel incurs high latency, so global memory transfers should be performed in bursts. For more information on burst transfers, refer to .
To infer burst transfers, the following pipelined loop coding style is recommended:
hls::stream<datatype_t> str;

INPUT_READ: for (int i = 0; i < INPUT_SIZE; i++) {
#pragma HLS PIPELINE
    str.write(inp[i]); // Read from the input memory interface
}
In the code example, a pipelined for loop reads data from the input memory interface and writes it to an internal hls::stream variable. This coding style reads from the global memory bank in bursts.
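As a minimal sketch, the read loop above can be wrapped in a function that builds with an ordinary host compiler. The `sketch::stream` class is a hypothetical stand-in for `hls::stream` (in an HLS project, include `<hls_stream.h>` and use the real class), and `read_input`, the `int` element type, and the value of `INPUT_SIZE` are assumptions made only for illustration. A host compiler ignores the HLS pragma, so this checks the data movement only, not whether a burst is actually inferred.

```cpp
#include <cassert>
#include <cstddef>
#include <queue>

// Hypothetical stand-in for hls::stream, for host-side builds only.
namespace sketch {
template <typename T>
class stream {
public:
    void write(const T& v) { q_.push(v); }
    T read() {
        assert(!q_.empty());  // reading an empty stream is an error here
        T v = q_.front();
        q_.pop();
        return v;
    }
    bool empty() const { return q_.empty(); }
    std::size_t size() const { return q_.size(); }
private:
    std::queue<T> q_;
};
}  // namespace sketch

typedef int datatype_t;     // assumed element type
const int INPUT_SIZE = 16;  // assumed transfer length

// Sequential, pipelined accesses to inp[i]: the pattern described above.
void read_input(const datatype_t* inp, sketch::stream<datatype_t>& str) {
    for (int i = 0; i < INPUT_SIZE; i++) {
#pragma HLS PIPELINE
        str.write(inp[i]);  // one element per iteration, consecutive addresses
    }
}
```

The loop indexes the input with consecutive addresses and has no conditional memory access in its body, which keeps it close to the recommended style.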
It is recommended to place the for loop from the example above inside a separate function and apply the dataflow optimization, as discussed in Dataflow Optimization. The code example below shows how this would look, letting the compiler establish dataflow between the read, execute, and write functions:
void top_function(datatype_t *m_in,   // Memory data input
                  datatype_t *m_out,  // Memory data output
                  int inp1,           // Other input
                  int inp2) {         // Other input
#pragma HLS DATAFLOW
    hls::stream<datatype_t> in_var1;   // Internal streams that transfer data
    hls::stream<datatype_t> out_var1;  // through the dataflow region

    // Read function contains a pipelined for loop to infer burst
    read_function(m_in, inp1, in_var1);
    // Core compute function
    execute_function(in_var1, out_var1, inp1, inp2);
    // Write function contains a pipelined for loop to infer burst
    write_function(out_var1, m_out);
}
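The original does not show the bodies of the three functions, so the following is only a sketch of one way they might look. It makes several assumptions: `inp1` is treated as the element count, `inp2` as a scalar offset added by the compute stage, `write_function` takes an extra size argument that the call above does not have, and `sketch::stream` is a hypothetical host-side stand-in for `hls::stream`. On a host compiler the pragmas are ignored and the stages simply run one after another.

```cpp
#include <cassert>
#include <queue>

// Hypothetical stand-in for hls::stream, for host-side builds only.
namespace sketch {
template <typename T>
class stream {
public:
    void write(const T& v) { q_.push(v); }
    T read() {
        assert(!q_.empty());  // reading an empty stream is an error here
        T v = q_.front();
        q_.pop();
        return v;
    }
private:
    std::queue<T> q_;
};
}  // namespace sketch

typedef int datatype_t;  // assumed element type

// Pipelined loop over consecutive input addresses (burst read candidate).
void read_function(const datatype_t* m_in, int size,
                   sketch::stream<datatype_t>& out) {
    for (int i = 0; i < size; i++) {
#pragma HLS PIPELINE
        out.write(m_in[i]);
    }
}

// Placeholder compute stage: adds an assumed scalar offset to each element.
void execute_function(sketch::stream<datatype_t>& in,
                      sketch::stream<datatype_t>& out, int size, int offset) {
    for (int i = 0; i < size; i++) {
#pragma HLS PIPELINE
        out.write(in.read() + offset);
    }
}

// Pipelined loop over consecutive output addresses (burst write candidate).
void write_function(sketch::stream<datatype_t>& in, datatype_t* m_out,
                    int size) {
    for (int i = 0; i < size; i++) {
#pragma HLS PIPELINE
        m_out[i] = in.read();
    }
}

void top_function(const datatype_t* m_in, datatype_t* m_out,
                  int inp1, int inp2) {
#pragma HLS DATAFLOW
    sketch::stream<datatype_t> in_var1;
    sketch::stream<datatype_t> out_var1;
    read_function(m_in, inp1, in_var1);
    execute_function(in_var1, out_var1, inp1, inp2);
    write_function(out_var1, m_out, inp1);
}
```

Keeping all global memory accesses inside the read and write stages means the compute stage touches only streams, which is the separation the dataflow region relies on.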