Cyclic Dependence - 2024.1 English

Vitis HLS Messaging (UG1448)


A cyclic dependence prevents further acceleration of this piece of code (a process in the graph). This generally requires some algorithmic changes to improve.


A cyclic dependence in the design limits the maximum attainable throughput.

A cyclic dependence exists when the execution of a statement depends on a previous execution of itself, possibly through a chain of data dependencies. A cyclic dependence necessarily involves a loop-carried dependence (LCD) but a loop-carried dependence does not always lead to a cyclic dependence, as illustrated in the following code snippet.

for (int i = 0; i < 10; ++i) {
  res[i] = last;     // LCD on 'last': reading the value written in the previous iteration
  last = 2.5f * i;   // no cyclic dependence: the new value of 'last' does not depend on its previous value
}

for (int i = 0; i < 10; ++i) {
  res[i] = last;                // LCD on 'last': reading the value written in the previous iteration
  last = (res[i] + 2.5f) * i;   // cyclic dependence: the new value of 'last' depends on its previous value
}

Cyclic dependencies impose additional constraints when scheduling the loop instructions: for the design to be correct, the instructions involved in a cyclic dependence can only start a new execution after the previous execution of all the instructions in the cycle has completed.

For example, in the first loop of the previous code snippet, the multiplication can be performed in parallel with the first assignment because there is no cyclic dependence. In the second loop, no overlap is possible because all of the instructions are part of a cyclic dependence. The scheduling constraint is relaxed when the LCD has a distance greater than one, such as when an element written in one iteration is read more than one iteration later. Formally, the initiation interval (II) of a loop with a cyclic dependence is at least the latency of the cycle divided by the dependence distance of the LCD.
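As an illustration of a distance greater than one, the following sketch (a hypothetical example, not taken from the snippet above) makes each iteration read the element written two iterations earlier, so the cycle need only close every second iteration:

```cpp
#include <cassert>

// Hypothetical sketch: a loop-carried dependence of distance 2.
// a[i] reads a[i-2], so consecutive iterations belong to two
// independent carry chains and can overlap; the II is bounded by
// the cycle latency divided by 2 rather than the full cycle latency.
void distance_two(float a[], int n) {
  for (int i = 2; i < n; ++i)
    a[i] = a[i - 2] + 1.0f;   // LCD distance = 2, still cyclic
}
```

Even-indexed and odd-indexed elements form separate dependence chains, which is exactly why the scheduler may start iteration i+1 before iteration i has finished.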


There are multiple ways to deal with a cyclic dependence to improve the performance of a design.

It is sometimes possible to reduce the latency of the cycle by performing shorter or fewer operations within it. For instance, in the presence of memory contention in the cycle, you can increase the array partitioning, which might reduce the duration of the cycle. As another example, you can sometimes turn floating-point operations performed in the cycle into fixed-point operations, which are generally faster.
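The partitioning approach can be sketched as follows (the function and buffer names are hypothetical; the ARRAY_PARTITION pragma is a Vitis HLS directive that is ignored by an ordinary C++ compiler, so the code remains testable in software):

```cpp
#include <cassert>

// Sketch: splitting 'buf' completely into registers removes the
// memory-port contention that would otherwise lengthen the cycle,
// potentially lowering the achievable II of the second loop.
void scaled_sum(const float in[8], float out[1]) {
  float buf[8];
#pragma HLS ARRAY_PARTITION variable=buf complete
  for (int i = 0; i < 8; ++i)
    buf[i] = in[i] * 2.0f;    // fill the partitioned buffer
  float sum = 0.0f;
  for (int i = 0; i < 8; ++i)
    sum += buf[i];            // all 8 elements readable in parallel
  out[0] = sum;
}
```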

It is also sometimes possible to eliminate some dependencies (write after write or write after read) by renaming the variables involved or expanding them with additional array dimensions. Eliminating such a dependence in a cycle can break the cycle and avoid its negative performance effects.
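A minimal sketch of the expansion idea (both functions and their names are hypothetical): a scalar temporary reused across iterations creates write-after-read and write-after-write dependences, which disappear once each iteration writes its own array element:

```cpp
#include <cassert>

// Before: 'tmp' is reused every iteration, creating WAR/WAW
// dependences on it across iterations.
void before(const int in[4], int out[4]) {
  int tmp = 0;
  for (int i = 0; i < 4; ++i) {
    tmp = in[i] + 1;        // overwrites the previous iteration's tmp
    out[i] = tmp * 2;
  }
}

// After: 'tmp' is expanded with an array dimension, so each
// iteration writes a distinct location and the false dependences
// between iterations are gone.
void after(const int in[4], int out[4]) {
  int tmp[4];
  for (int i = 0; i < 4; ++i) {
    tmp[i] = in[i] + 1;     // private copy per iteration
    out[i] = tmp[i] * 2;
  }
}
```

Both versions compute the same results; only the dependence structure differs.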

In the most extreme cases, you can rewrite the algorithm to expose more parallelism with fewer dependencies, typically resulting in fewer or shorter cycles.
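A classic instance of such a rewrite (a hypothetical sketch, not prescribed by this guide) is splitting a single accumulation, whose cycle length equals the adder latency, into several independent partial sums. Note that this reorders floating-point additions and can therefore change rounding:

```cpp
#include <cassert>

// Sketch: four independent accumulation chains replace one long
// cyclic dependence on a single accumulator; each chain closes its
// cycle only every fourth iteration, at the cost of a small final
// combine step.
float sum_partial(const float x[], int n) {
  float part[4] = {0.0f, 0.0f, 0.0f, 0.0f};
  for (int i = 0; i < n; ++i)
    part[i % 4] += x[i];    // rotate across the four accumulators
  return (part[0] + part[1]) + (part[2] + part[3]);
}
```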