Memory Port Contention - 2024.1 English

Vitis HLS Messaging (UG1448)

Document ID: UG1448
Release Date: 2024-05-30
Version: 2024.1 English

Description

Some variables are accessed by more instructions than their hardware implementation can sustain in a single cycle, preventing some loops from being accelerated. Partition these variables to accelerate your design.

Explanation

The parallelism in the loop is limited by the number of memory ports available. Higher performance can be reached if more memory ports are made available.

When dependencies allow it, memory accesses are tentatively scheduled in parallel (in the same clock cycle). However, for all the memory accesses to execute at the same time, the memory they access must have at least one port available for each access.

The number of ports of a memory can be indirectly increased using memory partitioning, typically using the array_partition pragma. In some cases, the bind_storage pragma can also be used to control the number of ports available on the memory that stores a particular variable.
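As a minimal sketch of how these two pragmas might be applied together, consider the function below. The names scale_buffer, buf, and LEN are hypothetical and do not come from this guide. The type=ram_t2p (true dual-port) and impl=bram values are assumptions about what the target device offers; check the pragma reference for the storage types your device supports. A standard C++ compiler ignores the HLS pragmas, so the function can also be run in C simulation.

```cpp
// Hypothetical example: scale_buffer, buf, and LEN are illustrative names only.
constexpr int LEN = 16;

void scale_buffer(const int in[LEN], int out[LEN]) {
  int buf[LEN];
  // Split buf into two cyclic banks: even indices in one bank, odd in the other.
#pragma HLS array_partition variable=buf type=cyclic factor=2 dim=1
  // Request a true dual-port block RAM for buf (assumed available on the target).
#pragma HLS bind_storage variable=buf type=ram_t2p impl=bram

  // Copy the input into the partitioned local buffer.
  for (int i = 0; i < LEN; ++i) {
    buf[i] = in[i];
  }
  // Read the buffer back and double each element.
  for (int i = 0; i < LEN; ++i) {
    out[i] = 2 * buf[i];
  }
}
```

Partitioning multiplies the number of physical memories, while the storage binding selects how many ports each of those memories has; the two choices combine to give the total number of accesses that can be served per cycle.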

In the following example, all writes to A can be performed in parallel only if at least four ports are available for this variable. The default BRAM-backed memory implementations typically have only one or two ports, which prevents all the memory accesses from executing in parallel. Partitioning the array A cyclically with a factor of 2 yields two banks; provided each bank has two ports, four ports are available in total, which resolves the contention and accelerates the loop.

for (int i = 0; i < 16; i += 4) {
  A[i] = ...
  A[i + 1] = ...
  A[i + 2] = ...
  A[i + 3] = ...
}
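The loop above can be completed into a small runnable sketch that applies the partitioning. The function name fill_array, the constant SIZE, the helper cyclic_bank, and the values written to A are placeholders chosen for illustration; the original right-hand sides are elided and are not reproduced here. The sketch assumes each bank is implemented with two ports.

```cpp
// Hypothetical completion of the loop: fill_array, SIZE, and the written
// values are placeholders, not the original elided expressions.
constexpr int SIZE = 16;

void fill_array(int result[SIZE]) {
  int A[SIZE];
  // Cyclic partitioning with factor 2: even indices go to bank 0, odd to bank 1.
#pragma HLS array_partition variable=A type=cyclic factor=2 dim=1

  for (int i = 0; i < SIZE; i += 4) {
#pragma HLS pipeline II=1
    // i is a multiple of 4, so each bank receives exactly two writes.
    A[i]     = i;      // bank 0
    A[i + 1] = i + 1;  // bank 1
    A[i + 2] = i + 2;  // bank 0
    A[i + 3] = i + 3;  // bank 1
  }

  // Copy the local array out so the result can be observed.
  for (int i = 0; i < SIZE; ++i) {
    result[i] = A[i];
  }
}

// Software model of the cyclic mapping: element index to bank index.
int cyclic_bank(int index, int factor) {
  return index % factor;
}
```

The choice of cyclic partitioning matters here. With block partitioning and the same factor, elements 0 to 7 would share one bank, so the four writes of a single iteration would still compete for the ports of that bank.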

Recommendation

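Partition the variables that cause the contention, typically with the array_partition pragma, so that every memory access scheduled in the same cycle has a port available.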