Arrays can be partitioned into blocks or into their individual elements. In some cases, Vitis HLS automatically partitions arrays into individual elements; this behavior is controllable using the configuration settings for auto-partitioning. When an array is partitioned into multiple blocks, the single array is implemented as multiple RTL RAM blocks. When partitioned into elements, each element is implemented as a register in the RTL. In both cases, partitioning allows more elements to be accessed in parallel and can help with performance; the design trade-off is between performance and the number of RAMs or registers required to achieve it.
A common issue when pipelining functions is the following message:
INFO: [SCHED 204-61] Pipelining loop 'SUM_LOOP'.
WARNING: [SCHED 204-69] Unable to schedule 'load' operation ('mem_load_2',
bottleneck.c:62) on array 'mem' due to limited memory ports.
WARNING: [SCHED 204-69] The resource limit of core:RAM:mem:p0 is 1, current
assignments:
WARNING: [SCHED 204-69] 'load' operation ('mem_load', bottleneck.c:62) on array
'mem',
WARNING: [SCHED 204-69] The resource limit of core:RAM:mem:p1 is 1, current
assignments:
WARNING: [SCHED 204-69] 'load' operation ('mem_load_1', bottleneck.c:62) on array
'mem',
INFO: [SCHED 204-61] Pipelining result: Target II: 1, Final II: 2, Depth: 3.
In this example, Vitis HLS states that it cannot reach the specified initiation interval (II) of 1 because it cannot schedule a load (read) operation (mem_load_2) onto the memory due to limited memory ports. The message notes that the resource limit for "core:RAM:mem:p0" is 1, and that this port is already used by the operation mem_load on line 62. The second port of the block RAM, "core:RAM:mem:p1", also has a resource limit of 1 and is already used by the operation mem_load_1. Because of this memory port contention, Vitis HLS reports a final II of 2 instead of the desired 1.
This issue is typically caused by arrays. Arrays that are not interfaces to the top-level function are implemented as block RAM, which has a maximum of two data ports. This can limit the throughput of a read/write (or load/store) intensive algorithm. The bandwidth can be improved by splitting the array (a single block RAM resource) into multiple smaller arrays (multiple block RAMs), effectively increasing the number of ports.
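As an illustration of the port limit, the following is a minimal sketch of a loop that issues three loads from the same array in every iteration. It is a hypothetical example, not the bottleneck.c source behind the log above: the function name sum_triples, the size N, and the local copy into mem are assumptions made for illustration. With a single dual-port RAM, only two of the three loads can be served per cycle; a cyclic partition with a factor of 4 should place mem[i], mem[i-1], and mem[i-2] in different memories. The pragma is written in the positional form; recent releases document the type as type=cyclic, so check the syntax for your tool version.

```c
#define N 128

/* Hypothetical example: sums three of every four elements.
 * A standard C compiler ignores the HLS pragmas, so the function
 * can also be compiled and checked in plain C simulation. */
int sum_triples(const int in[N]) {
    int mem[N];  /* local array: implemented as block RAM by default */
    /* Assumed directive: split mem into 4 interleaved memories so that
     * the three loads below target different physical RAMs. */
#pragma HLS ARRAY_PARTITION variable=mem cyclic factor=4 dim=1
    int sum = 0;
    int i;

    COPY_LOOP: for (i = 0; i < N; i++) {
        mem[i] = in[i];
    }

    SUM_LOOP: for (i = 3; i < N; i += 4) {
#pragma HLS PIPELINE II=1
        /* Three loads per iteration: more than the two ports of one RAM. */
        sum += mem[i] + mem[i - 1] + mem[i - 2];
    }
    return sum;
}
```

Whether the target II of 1 is actually reached depends on the rest of the design and the tool version; the scheduling messages in the synthesis log remain the authoritative check.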
Arrays are partitioned using the ARRAY_PARTITION directive. Vitis HLS provides three types of array partitioning, as shown in the following figure. The three styles of partitioning are:
- block: The original array is split into equally sized blocks of consecutive elements of the original array.
- cyclic: The original array is split into equally sized blocks interleaving the elements of the original array.
- complete: The default operation is to split the array into its individual elements. This corresponds to resolving a memory into registers.
For block and cyclic partitioning, the factor option specifies the number of arrays that are created. In the preceding figure, a factor of 2 is used, that is, the array is divided into two smaller arrays. If the number of elements in the array is not an integer multiple of the factor, the final array has fewer elements.
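The difference between block and cyclic partitioning can be made concrete by computing where a given element of the original array ends up. The sketch below is illustrative only: the type location_t and the helpers block_location and cyclic_location are hypothetical names, and the block size is assumed to be the element count divided by the factor, rounded up, which is consistent with the final array holding fewer elements when the division is not exact.

```c
/* Hypothetical helper type: which smaller array an element lands in,
 * and its position inside that array. */
typedef struct {
    int bank;    /* index of the smaller array */
    int offset;  /* index within that smaller array */
} location_t;

/* Block partitioning: consecutive elements stay together.
 * Assumes a block size of ceil(n / factor). */
location_t block_location(int i, int n, int factor) {
    int block_size = (n + factor - 1) / factor;
    location_t loc = { i / block_size, i % block_size };
    return loc;
}

/* Cyclic partitioning: elements are interleaved across the arrays. */
location_t cyclic_location(int i, int factor) {
    location_t loc = { i % factor, i / factor };
    return loc;
}
```

For a 10-element array and a factor of 2, element 5 maps to the first position of the second array under block partitioning, and to the third position of the second array under cyclic partitioning. Neighboring elements such as 4 and 5 share an array under block partitioning but are separated under cyclic partitioning, which is why cyclic partitioning tends to suit loops that read adjacent elements in the same iteration.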
When partitioning multi-dimensional arrays, the dimension option is used to specify which dimension is partitioned. The following figure shows how the dimension option is used to partition the following example code:
void foo (...) {
int my_array[10][6][4];
...
}
The examples in the figure demonstrate how partitioning dimension 3 results in 4 separate arrays and partitioning dimension 1 results in 10 separate arrays. If zero is specified as the dimension, all dimensions are partitioned.
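As a rough sketch of how the dimension option might be applied in source code, the example below completely partitions dimension 3 of a 10x6x4 array so that the four elements along that dimension can be read in the same cycle. The function name sum_all, the fill pattern, and the loop labels are assumptions made for illustration. The pragma uses the positional form; recent releases document the type as type=complete, so check the syntax for your tool version.

```c
#define D1 10
#define D2 6
#define D3 4

/* Hypothetical example: fills a 3-D array and sums every element,
 * reading all four elements of dimension 3 in one loop iteration. */
int sum_all(void) {
    int my_array[D1][D2][D3];
    /* Assumed directive: dim=3 splits the array into 4 separate arrays,
     * one per index of the last dimension. */
#pragma HLS ARRAY_PARTITION variable=my_array complete dim=3
    int i, j, k;
    int total = 0;

    FILL_LOOP: for (i = 0; i < D1; i++) {
        for (j = 0; j < D2; j++) {
            for (k = 0; k < D3; k++) {
                my_array[i][j][k] = i + j + k;
            }
        }
    }

    SUM_LOOP: for (i = 0; i < D1; i++) {
        for (j = 0; j < D2; j++) {
#pragma HLS PIPELINE II=1
            /* Four reads per iteration, one from each partitioned array. */
            total += my_array[i][j][0] + my_array[i][j][1]
                   + my_array[i][j][2] + my_array[i][j][3];
        }
    }
    return total;
}
```

Changing dim=3 to dim=1 would instead create 10 arrays, which does not help this access pattern because the four reads in each iteration would still target the same array.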