xcl_array_partition - 2020.2 English

Vitis Unified Software Platform Documentation: Application Acceleration Development (UG1393)

Document ID
UG1393
Release Date
2021-03-22
Version
2020.2 English

Description

Important: Array variables only accept one attribute. While XCL_ARRAY_PARTITION does support multi-dimensional arrays, you can only reshape one dimension of the array with a single attribute.

An advantage of using the FPGA over other compute devices for OpenCL programs is the ability for the application programmer to customize the memory architecture all throughout the system and into the compute unit. By default, the Vitis compiler generates a memory architecture within the compute unit that maximizes local and private memory bandwidth based on static code analysis of the kernel code. Further optimization of these memories is possible based on attributes in the kernel source code, which can be used to specify physical layouts and implementations of local and private memories. The attribute in the Vitis compiler to control the physical layout of memories in a compute unit is array_partition.

For one-dimensional arrays, the XCL_ARRAY_PARTITION attribute implements an array declared within kernel code as multiple physical memories instead of a single physical memory. The selection of which partitioning scheme to use depends on the specific application and its performance goals. The array partitioning schemes available in the Vitis compiler are cyclic, block, and complete.

Syntax

Place the attribute with the definition of the array variable.

__attribute__((xcl_array_partition(<type>, <factor>, 
<dimension>)))

Where:

  • <type>: Specifies one of the following partition types:
    • cyclic: Cyclic partitioning is the implementation of an array as a set of smaller physical memories that can be accessed simultaneously by the logic in the compute unit. The array is partitioned cyclically by putting one element into each memory before coming back to the first memory to repeat the cycle until the array is fully partitioned.
    • block: Block partitioning is the physical implementation of an array as a set of smaller memories that can be accessed simultaneously by the logic inside the compute unit. In this case, each memory block is filled with elements from the array before moving on to the next memory.
    • complete: Complete partitioning decomposes the array into individual elements. For a one-dimensional array, this corresponds to resolving a memory into individual registers. The default <type> is complete.
  • <factor>: For cyclic type partitioning, the <factor> specifies how many physical memories to partition the original array into in the kernel code. For block type partitioning, the <factor> specifies the number of elements from the original array to store in each physical memory.
    Important: For complete type partitioning, the <factor>> is not specified.
  • <dimension>: Specifies which array dimension to partition. Specified as an integer from 1 to <N>. Vitis core development kit supports arrays of N dimensions and can partition the array on any single dimension.

Example 1

For example, consider the following array declaration.

int buffer[16];

The integer array, named buffer, stores 16 values that are 32-bits wide each. Cyclic partitioning can be applied to this array with the following declaration.

int buffer[16] __attribute__((xcl_array_partition(cyclic,4,1)));

In this example, the cyclic <partition_type> attribute tells the Vitis compiler to distribute the contents of the array among four physical memories. This attribute increases the immediate memory bandwidth for operations accessing the array buffer by a factor of four.

All arrays inside a compute unit in the context of the Vitis core development kit are capable of sustaining a maximum of two concurrent accesses. By dividing the original array in the code into four physical memories, the resulting compute unit can sustain a maximum of eight concurrent accesses to the array buffer.

Example 2

Using the same integer array as found in Example 1, block partitioning can be applied to the array with the following declaration.

int buffer[16] __attribute__((xcl_array_partition(block,4,1)));

Because the size of the block is four, the Vitis compiler will generate four physical memories, sequentially filling each memory with data from the array.

Example 3

Using the same integer array as found in Example 1, complete partitioning can be applied to the array with the following declaration.

int buffer[16] __attribute__((xcl_array_partition(complete, 1)));

In this example, the array is completely partitioned into distributed RAM, or 16 independent registers in the programmable logic of the kernel. Because complete is the default, the same effect can also be accomplished with the following declaration.

int buffer[16] __attribute__((xcl_array_partition));

While this creates an implementation with the highest possible memory bandwidth, it is not suited to all applications. The way in which data is accessed by the kernel code through either constant or data dependent indexes affects the amount of supporting logic that the Vitis compiler has to build around each register to ensure functional equivalence with the usage in the original code. As a general best practice guideline for the Vitis core development kit, the complete partitioning attribute is best suited for arrays in which at least one dimension of the array is accessed through the use of constant indexes.