On the other hand, for a stream-of-blocks the communication between the producer and the consumer is modeled as a stream of array-like objects, providing several advantages over array transfer through PIPO.
The use of stream-of-blocks in your code requires the following include file:
#include "hls_streamofblocks.h"
The stream-of-blocks object template is: hls::stream_of_blocks<block_type, depth> v
Where:
- <block_type> specifies the datatype of the array or multidimensional array held by the stream-of-blocks
- <depth> is an optional argument that provides depth control, just like hls::stream or PIPOs, and specifies the total number of blocks, including the one acquired by the producer and the one acquired by the consumer at any given time. The default value is 2
- v specifies the variable name for the stream-of-blocks object
Use the following steps to access a block in a stream of blocks:
- The producer or consumer process that wants to access the stream first needs to acquire access to it, using an hls::write_lock or hls::read_lock object.
- After the producer has acquired the lock, it can start writing (or reading) the acquired block. Once the block has been fully initialized, it is released by the producer when the write_lock object goes out of scope.
  Note: The producer process with a write_lock can also read the block as long as it only reads from already written locations, because the newly acquired buffer must be assumed to contain uninitialized data. The ability to write and read the block is unique to the producer process, and is not supported for the consumer.
- Then the block is queued in the stream-of-blocks in a FIFO fashion, and when the consumer acquires a read_lock object, the block can be read by the consumer process.
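The steps above can be mimicked in plain C++ to make the ordering concrete. The following is a minimal, single-threaded sketch and not the Vitis HLS library: ToyStream, ToyWriteLock, ToyReadLock, run_demo, kN, and kDepth are all hypothetical names introduced for illustration. Where real hardware would stall until a block is available, the toy model simply asserts.

```cpp
// Toy, single-threaded model of the acquire/release protocol.
// This is NOT the Vitis HLS library: every Toy* name is hypothetical.
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

constexpr std::size_t kN = 4;      // elements per block (illustrative value)
constexpr std::size_t kDepth = 2;  // total blocks, matching the default depth

using Block = std::array<int, kN>;

struct ToyStream {
    std::array<Block, kDepth> blocks{};
    std::size_t head = 0;    // block the consumer acquires next
    std::size_t tail = 0;    // block the producer acquires next
    std::size_t filled = 0;  // written blocks not yet released by the consumer
    bool can_write() const { return filled < kDepth; }
    bool can_read() const { return filled > 0; }
};

class ToyWriteLock {
public:
    // Real hardware would stall until a block is free; the toy just asserts.
    explicit ToyWriteLock(ToyStream& s) : s_(s) { assert(s_.can_write()); }
    // Leaving scope queues the block for the consumer (FIFO order).
    ~ToyWriteLock() { s_.tail = (s_.tail + 1) % kDepth; ++s_.filled; }
    ToyWriteLock(const ToyWriteLock&) = delete;
    ToyWriteLock& operator=(const ToyWriteLock&) = delete;
    int& operator[](std::size_t i) { return s_.blocks[s_.tail][i]; }
private:
    ToyStream& s_;
};

class ToyReadLock {
public:
    explicit ToyReadLock(ToyStream& s) : s_(s) { assert(s_.can_read()); }
    // Leaving scope hands the block back to the producer for reuse.
    ~ToyReadLock() { s_.head = (s_.head + 1) % kDepth; --s_.filled; }
    ToyReadLock(const ToyReadLock&) = delete;
    ToyReadLock& operator=(const ToyReadLock&) = delete;
    const int& operator[](std::size_t i) const { return s_.blocks[s_.head][i]; }
private:
    ToyStream& s_;
};

// Produces m blocks and consumes each one right after it is released.
std::vector<int> run_demo(int m) {
    ToyStream s;
    std::vector<int> seen;
    for (int i = 0; i < m; ++i) {
        {
            ToyWriteLock b(s);  // acquire a block for writing
            for (std::size_t j = 0; j < kN; ++j) b[j] = 10 * i + static_cast<int>(j);
        }                       // write lock destroyed: block is queued
        {
            ToyReadLock b(s);   // acquire the oldest queued block
            for (std::size_t j = 0; j < kN; ++j) seen.push_back(b[j]);
        }                       // read lock destroyed: block is free again
    }
    return seen;
}
```

Calling run_demo(3) returns the values in the order they were written, one block at a time. With kDepth set to 2, a second write lock can be taken while the first block is still queued, but a third cannot until the consumer releases a block, which is the sense in which depth counts the blocks held by both sides.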
The main difference between hls::stream_of_blocks and the PIPO mechanism seen in the prior examples is that the block becomes available to the consumer as soon as the write_lock goes out of scope, rather than only at the return of the producer process. Hence the size of storage required to manage the original example (without the dataflow loop) is much less with stream-of-blocks than with just PIPOs: namely 2N instead of 2×M×N in the example.
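As a quick check of that storage comparison, the sketch below evaluates both expressions for one choice of sizes. The function names and the values M = 8 and N = 1024 are assumptions made for illustration only; the counts are in array elements, not bits or BRAMs.

```cpp
// Back-of-the-envelope storage comparison (element counts only).
// pipo_elements and sob_elements are hypothetical helper names.
#include <cassert>
#include <cstddef>

// Whole M x N array passed through a PIPO: two full copies (ping and pong).
constexpr std::size_t pipo_elements(std::size_t m, std::size_t n) {
    return 2 * m * n;
}

// Stream-of-blocks with N-element blocks: depth blocks in total (default 2).
constexpr std::size_t sob_elements(std::size_t n, std::size_t depth = 2) {
    return depth * n;
}

static_assert(pipo_elements(8, 1024) == 16384, "2 x M x N for M = 8, N = 1024");
static_assert(sob_elements(1024) == 2048, "2N for N = 1024");
```

For these sizes the stream-of-blocks version needs a factor of M (here 8) less storage, because only individual N-element blocks are buffered.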
The prior example, rewritten to use hls::stream_of_blocks, is shown below. The producer acquires the block by constructing an hls::write_lock object called b, passing it a reference to the stream-of-blocks object, called s. The write_lock object provides an overloaded array access operator, so it can be indexed like an array to access the underlying storage in random order.
The acquisition of the lock is performed by constructing the write_lock/read_lock object, and the release occurs automatically when that object is destructed as it goes out of scope. This approach uses the common Resource Acquisition Is Initialization (RAII) style of locking and unlocking.
#include "hls_streamofblocks.h"
typedef int buf[N];
void producer(hls::stream_of_blocks<buf> &s, ...) {
  for (int i = 0; i < M; i++) {
    // Constructing the hls::write_lock acquires the block for the producer
    hls::write_lock<buf> b(s);
    for (int j = 0; j < N; j++)
      b[f(j)] = ...;
    // b is destroyed here, which releases the block to the consumer
  }
}
void consumer(hls::stream_of_blocks<buf> &s, ...) {
  for (int i = 0; i < M; i++) {
    // Constructing the hls::read_lock acquires the block for the consumer
    hls::read_lock<buf> b(s);
    for (int j = 0; j < N; j++)
      ... = b[g(j)] ...;
    // b is destroyed here, which releases the block to be reused by the producer
  }
}
void top(...) {
#pragma HLS dataflow
  hls::stream_of_blocks<buf> s;
  // Both processes share the stream-of-blocks object s
  producer(s, ...);
  consumer(s, ...);
}
The key features of this approach include:
- The outer loop in the producer above is expected to achieve an overall Initiation Interval (II) of 1.
- A locked block can be used as though it were private to the producer or the consumer process until it is released.
- For the producer, the initial contents of the acquired block are undefined; for the consumer, the block contains the values written by the producer.
- The principal advantage of stream-of-blocks is to provide overlapped execution of multiple iterations of the consumer and the producer to increase throughput.
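The throughput benefit of overlapped execution can be illustrated with an idealized cycle count. This is a rough sketch under stated assumptions, not a tool-reported result: it assumes the producer and the consumer each take N cycles per block, ignores lock overhead and pipeline fill, and assumes the depth is sufficient that neither side stalls. The function names and the sizes M = 8 and N = 1024 are hypothetical.

```cpp
// Idealized latency model for one invocation; all names are hypothetical.
#include <cassert>
#include <cstddef>

// Whole-array handoff: the consumer starts only after the producer has
// written all M blocks, so the two phases run back to back.
constexpr std::size_t cycles_whole_array(std::size_t m, std::size_t n) {
    return m * n + m * n;
}

// Block-by-block handoff: the consumer starts once the first block is
// released, then both sides run in lockstep, one block apart.
constexpr std::size_t cycles_stream_of_blocks(std::size_t m, std::size_t n) {
    return n + m * n;
}

static_assert(cycles_whole_array(8, 1024) == 16384, "2 x M x N cycles");
static_assert(cycles_stream_of_blocks(8, 1024) == 9216, "(M + 1) x N cycles");
```

Under these assumptions the overlapped version approaches half the latency of the back-to-back version as M grows. Actual figures depend on the synthesized schedule and should be taken from the tool reports.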