WYSIWYG means “what you see is what you get.”
The dataflow pragma can be applied in two different contexts:
- Inside a function (also known as function dataflow region).
- Inside a special kind of for loop (also known as loop dataflow region) such
that:
- The loop is the only statement inside the body of a function.
- Variable declarations outside the loop are forbidden.
- The loop counter is of int type.
- The initial value is set to any non-negative integer constant in the loop header.
- The exit condition is an "<" comparison with a non-negative numerical constant or a scalar argument of the function that encloses the loop.
- The loop variable is incremented by any positive integer constant.
The WYSIWYG coding style requires that the body of that function or for loop contain only a sequence of:
- Local variable declarations without initialization, including those performed
automatically by constructors.
- These variables, also called “channels”, must be initialized in the process that produces them.
- For standard data types that have default constructors that cannot be redefined (for example, std::complex), initialization can be avoided by using the no_ctor attribute. For example:
std::complex<float> arr[SIZE] __attribute__((no_ctor));
- Sub-function calls (that use the ap_ctrl_chain protocol), and/or
- hls::task instantiations (that use the ap_ctrl_none protocol).
For example, there cannot be any control inside a “canonical” dataflow region (if-then-elses and loops would be automatically converted into processes, which results in the unpredictability of dataflow structure discussed above).
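As an illustration of the restriction on control, the following minimal sketch keeps the dataflow region a plain sequence of a declaration and two calls by placing the if-then-else inside a process. All names (N, produce, scale_or_copy, top_select) are hypothetical and not taken from the tool documentation; a standard C++ compiler ignores the pragma.

```cpp
// Hypothetical names throughout; a sketch under stated assumptions.
const int N = 8;

// Process 1: writes the channel array tmp.
void produce(int in[N], int tmp[N]) {
    for (int i = 0; i < N; i++) tmp[i] = in[i] + 1;
}

// Process 2: the if-then-else lives inside the process,
// not in the dataflow region that calls it.
void scale_or_copy(int tmp[N], bool scale, int out[N]) {
    for (int i = 0; i < N; i++) {
        if (scale) out[i] = tmp[i] * 2;
        else       out[i] = tmp[i];
    }
}

void top_select(int in[N], bool scale, int out[N]) {
#pragma HLS dataflow
    int tmp[N];                      // channel, no initialization
    produce(in, tmp);                // read in, write tmp
    scale_or_copy(tmp, scale, out);  // read tmp, write out
}
```

The region body of top_select contains no conditionals or loops of its own, so its process structure can be read directly from the source.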
These called sub-functions and hls::tasks, also called processes:
- Can be:
- Sequential functions or pipelined functions, or
- Function dataflow regions, or
- Loop dataflow regions.
- Can take only variables as arguments, without involving automatic type conversions.
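The rule that only variables may be passed to processes can be sketched as follows. The example assumes hypothetical processes read_in, add_offset, and write_out; the expression that might otherwise appear in the argument list is computed inside a process instead.

```cpp
// Hypothetical names throughout; a sketch under stated assumptions.
const int N = 8;

void read_in(int in[N], int a[N]) {
    for (int i = 0; i < N; i++) a[i] = in[i];
}

// The "+ 1" is applied inside the process rather than in the call.
void add_offset(int a[N], int base, int b[N]) {
    for (int i = 0; i < N; i++) b[i] = a[i] + base + 1;
}

void write_out(int b[N], int out[N]) {
    for (int i = 0; i < N; i++) out[i] = b[i];
}

void top_args(int in[N], int base, int out[N]) {
#pragma HLS dataflow
    int a[N], b[N];          // channels, no initialization
    read_in(in, a);
    // Not in canonical form: add_offset(a, base + 1, b);
    // the second argument would be an expression, not a variable.
    add_offset(a, base, b);  // every argument is a variable of the parameter's type
    write_out(b, out);
}
```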
Example of recommended style for a dataflow function:
void dataflow(int Input0, int Input1[], int &Output0, int Output1[]) {
#pragma HLS dataflow
int C1[N], C2; // no initialization
UserDataType C0 __attribute__((no_ctor)); // no_ctor must be used if the default constructor is not empty
func1(Input0, Input1, C0, C1); // read Input0, read Input1, write C0, write C1
func2(C0, C1, C2); // read C0, read C1, write C2
func3(C2, Output0, Output1); // read C2, write Output0, write Output1
}
Example of recommended style for dataflow in loop:
void dataflow(int Input0, int Input1[], int &Output0, int Output1[]) {
for (int i = 2; i < N; i+=2) {
#pragma HLS dataflow
int C1[N], C2; // no initialization
UserDataType C0 __attribute__((no_ctor)); // no_ctor must be used if the default constructor is not empty
func1(Input0, Input1, C0, C1, i); // read Input0, read Input1, write C0, write C1
func2(C0, C1, C2, i); // read C0, read C1, write C2
func3(C2, Output0, Output1, i); // read C2, write Output0, write Output1
}
}
Note: The function where the loop occurs does not require the dataflow
pragma, but it must contain only the loop.
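A self-contained variant of the loop form, in which the exit condition compares against a scalar argument of the enclosing function, might look like the sketch below. The block size N and the processes load_block, square_block, and store_block are hypothetical names chosen for illustration.

```cpp
// Hypothetical names throughout; a sketch under stated assumptions.
const int N = 4;  // elements handled per loop iteration

void load_block(int in[], int blk[N], int i) {
    for (int k = 0; k < N; k++) blk[k] = in[i * N + k];
}

void square_block(int blk[N], int sq[N]) {
    for (int k = 0; k < N; k++) sq[k] = blk[k] * blk[k];
}

void store_block(int sq[N], int out[], int i) {
    for (int k = 0; k < N; k++) out[i * N + k] = sq[k];
}

// The loop is the only statement in the function body. The counter is
// an int, starts at a constant, is compared with "<" against the scalar
// argument nblocks, and is incremented by a positive constant.
void top_loop(int in[], int out[], int nblocks) {
    for (int i = 0; i < nblocks; i++) {
#pragma HLS dataflow
        int blk[N], sq[N];       // channels, no initialization
        load_block(in, blk, i);  // read in, write blk
        square_block(blk, sq);   // read blk, write sq
        store_block(sq, out, i); // read sq, write out
    }
}
```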
Further semantic restrictions about the body of dataflow functions or regions:
- Local variables must be non-static scalars or arrays (static variables are allowed only inside called processes).
- Instances of hls::tasks in a canonical region must be declared as hls_thread_local. For example:
hls_thread_local hls::task t1(proc, arg1, arg2, arg3);
- Instances of hls::stream and hls::stream_of_blocks inside an hls::task must be declared as hls_thread_local.
Note: hls_thread_local is static, but it is more appropriate in this context, because it does not imply a shared single variable instance among multiple hls::tasks with the same function body.
- The processes must transfer data among them using local variables, also called
“channels”, belonging to these categories:
- Arrays:
- Must have only one writer process.
- If they have multiple reader processes, they must be marked with #pragma HLS bind_storage variable=... type=1wnr.
- The writer must be lexically before the reader(s).
- Cannot have loop-carried dependences (exceptions discussed below).
- Streams (hls::stream and hls::stream_of_blocks):
- Must have only one reading process and one writing process.
- Cannot have loop-carried dependences (exceptions discussed below).
- Scalars:
- Can have multiple writer and reader processes.
- Are automatically converted into FIFO channels, each with one
producer and one consumer. These channels can:
- Either have explicit, automatically generated, write and read operations inside the function bodies (these are called “scalar propagation FIFOs”).
- Or be automatically written by the ap_done of the producer and read by the ap_ready of the consumer (these are called “task level FIFOs”).
- Cannot have loop-carried dependences (exceptions discussed below).
- Top kernel arguments can also be passed to processes, but have different restrictions:
- No communication among processes can occur using a top kernel argument.
- For top array arguments mapped to m_axi interfaces, the writer process (if any) must be after the reader (if any). Vitis HLS assumes that top m_axi arrays do not have carried dependences, that is, each execution of the kernel receives a new m_axi mapped buffer in DRAM.
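The ordering rule for top array arguments can be sketched as follows. The example assumes that the top argument mem is mapped to an m_axi interface (the interface pragma is omitted here) and uses hypothetical process names read_mem, increment, and write_mem. Data moves between processes only through the local channels buf and res, never through mem itself.

```cpp
// Hypothetical names throughout; a sketch under stated assumptions.
const int N = 8;

void read_mem(int mem[], int buf[N]) {
    for (int i = 0; i < N; i++) buf[i] = mem[i];
}

void increment(int buf[N], int res[N]) {
    for (int i = 0; i < N; i++) res[i] = buf[i] + 1;
}

void write_mem(int res[N], int mem[]) {
    for (int i = 0; i < N; i++) mem[i] = res[i];
}

// mem is assumed to be mapped to an m_axi interface.
void kernel(int mem[]) {
#pragma HLS dataflow
    int buf[N], res[N];   // local channels, one writer each
    read_mem(mem, buf);   // reader of the top argument comes first
    increment(buf, res);  // communicates only through buf and res
    write_mem(res, mem);  // writer of the top argument comes last
}
```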