During the Vitis tool flow, the kernel port to memory bank connectivity can be established using the --connectivity.sp switch, as described in Mapping Kernel Ports to Memory. The xclbin generated by v++ contains the kernel port to memory connectivity information so that XRT can allocate buffers appropriately. When a buffer is created in the host code, XRT automatically assigns the buffer to memory based on the kernel xclbin, and manages the buffers internally. If a single kernel port is connected to multiple memory banks, XRT always starts from the lowest numbered bank.

In most cases, this approach is sufficient. However, in some specific cases you might need to manually assign the buffer location (or a special property) in the host code. For this purpose, the AMD OpenCL vendor extension provides a buffer extension called CL_MEM_EXT_PTR_XILINX to explicitly manage bank assignment in the host code. The following code example shows the required header file and the code for assigning the input and output buffers to DDR bank 0 and bank 1:
#include <CL/cl_ext.h>
…
int main(int argc, char** argv)
{
…
    cl_mem_ext_ptr_t inExt, outExt;      // Declaring two extension pointers, one per buffer
    inExt.flags  = 0 | XCL_MEM_TOPOLOGY; // Specify bank 0 memory for the input buffer
    outExt.flags = 1 | XCL_MEM_TOPOLOGY; // Specify bank 1 memory for the output buffer
    inExt.obj  = 0; outExt.obj  = 0;     // Setting obj and param to zero
    inExt.param = 0; outExt.param = 0;

    int err;
    // Allocate buffer in bank 0 of global memory for the input image using the Xilinx extension
    cl_mem buffer_inImage = clCreateBuffer(world.context, CL_MEM_READ_ONLY | CL_MEM_EXT_PTR_XILINX,
                                           image_size_bytes, &inExt, &err);
    if (err != CL_SUCCESS) {
        std::cout << "Error: Failed to allocate device memory" << std::endl;
        return EXIT_FAILURE;
    }
    // Allocate buffer in bank 1 of global memory for the output image using the Xilinx extension
    cl_mem buffer_outImage = clCreateBuffer(world.context, CL_MEM_WRITE_ONLY | CL_MEM_EXT_PTR_XILINX,
                                            image_size_bytes, &outExt, &err);
    if (err != CL_SUCCESS) {
        std::cout << "Error: Failed to allocate device memory" << std::endl;
        return EXIT_FAILURE;
    }
…
}
The extension pointer cl_mem_ext_ptr_t is a struct defined as follows:
typedef struct {
    unsigned flags;
    void *obj;
    void *param;
} cl_mem_ext_ptr_t;
- Valid values for flags are:
  - XCL_MEM_DDR_BANK0
  - XCL_MEM_DDR_BANK1
  - XCL_MEM_DDR_BANK2
  - XCL_MEM_DDR_BANK3
  - <id> | XCL_MEM_TOPOLOGY
    Note: The <id> is determined by looking at the Memory Configuration section in the xxx.xclbin.info file generated next to the xxx.xclbin file. In the xxx.xclbin.info file, the global memory (DDR, HBM, PLRAM, etc.) is listed with an index representing the <id>.
- obj is the pointer to the associated host memory allocated for the CL memory buffer, but only if the CL_MEM_USE_HOST_PTR flag is passed to the clCreateBuffer API; otherwise, set it to NULL. A sketch of this combination follows this list.
- param is reserved for future use. Always assign it to 0 or NULL.
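For illustration, the following minimal sketch shows obj carrying a host pointer when the buffer is created with CL_MEM_USE_HOST_PTR. The context, host_data array, and size_bytes are hypothetical placeholders, not taken from the examples in this section:

cl_mem_ext_ptr_t ext = {0};        // zero-initialize all fields
ext.flags = 0 | XCL_MEM_TOPOLOGY;  // memory index 0, assuming it exists in the xclbin
ext.obj   = host_data;             // host pointer; valid because CL_MEM_USE_HOST_PTR is set below
ext.param = 0;                     // reserved; must be 0 or NULL

cl_int err;
cl_mem buf = clCreateBuffer(context,
                            CL_MEM_READ_ONLY | CL_MEM_USE_HOST_PTR | CL_MEM_EXT_PTR_XILINX,
                            size_bytes, &ext, &err);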
Here are some specific cases where you might want to use the extension pointer:
- P2P Buffer: for an explanation and example, refer to https://xilinx.github.io/XRT/master/html/p2p.html. A minimal flag sketch also follows this list.
- Host-Memory Buffer: for an explanation and example, refer to https://xilinx.github.io/XRT/master/html/hm.html. The sketch after this list covers this case as well.
- Allocating the host buffer to a specific bank when the kernel port is connected to multiple banks, for example DDR[0:1]: this use case is described in detail in the Using Multiple DDR Banks lab of the Vitis Optimizing Accelerated FPGA Applications: Bloom Filter Example tutorial.
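As a quick illustration of the first two cases, the flag settings below follow the examples in the XRT documentation linked above; the context, buffer_size, and err variables are placeholders:

// P2P buffer: no host backing pointer; XCL_MEM_EXT_P2P_BUFFER requests a P2P allocation
cl_mem_ext_ptr_t p2p_ext = {XCL_MEM_EXT_P2P_BUFFER, nullptr, 0};
cl_mem p2p_buf = clCreateBuffer(context, CL_MEM_READ_ONLY | CL_MEM_EXT_PTR_XILINX,
                                buffer_size, &p2p_ext, &err);

// Host-memory buffer: XCL_MEM_EXT_HOST_ONLY allocates the buffer in host memory
cl_mem_ext_ptr_t host_ext = {XCL_MEM_EXT_HOST_ONLY, nullptr, 0};
cl_mem host_buf = clCreateBuffer(context, CL_MEM_READ_WRITE | CL_MEM_EXT_PTR_XILINX,
                                 buffer_size, &host_ext, &err);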
Example of Allocating the Host Buffer to a Specific Bank
An example of the third case listed above, where you might need to use cl_mem_ext_ptr_t, is when the host and kernel both access the DDR banks simultaneously, and you would like to split the data so that the kernel and the host access the memory banks in a ping-pong fashion: while the host is writing/reading one memory bank, the kernel is reading/writing another bank, so that these host and kernel accesses do not compete and impact performance. For this scenario, you must manage the buffer allocation yourself.

In this example, the kernel port in the xclbin is connected to DDR bank 1 and bank 2, and reads data from these banks alternately. The connectivity is established during linking by the Vitis compiler using the --connectivity.sp switch:
[connectivity]
sp=runOnfpga_1.input_words:DDR[1:2]
From the host code, you can send the input_words data to DDR banks 1 and 2 alternately. Two AMD extension pointer (cl_mem_ext_ptr_t) objects are created, as shown in the example code below. The flags field of each object determines which DDR bank the corresponding buffer is assigned to for the kernel to access. The kernel argument can then be set to input_words[0] and input_words[1] for consecutive kernel enqueues.
#include <CL/cl_ext.h>
…
int main(int argc, char** argv)
{
    cl_mem_ext_ptr_t buffer_words_ext[2];
    buffer_words_ext[0].flags = 1 | XCL_MEM_TOPOLOGY; // DDR[1]
    buffer_words_ext[0].param = 0;
    buffer_words_ext[0].obj   = input_doc_words;      // host pointer backing this buffer
    buffer_words_ext[1].flags = 2 | XCL_MEM_TOPOLOGY; // DDR[2]
    buffer_words_ext[1].param = 0;
    buffer_words_ext[1].obj   = input_doc_words;      // same host data, mapped to the other bank
…
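The example code ends before the device buffers are created and the kernel is enqueued. The following is a minimal sketch of how the ping-pong might continue; the context, in-order command queue q, kernel object krnl, transfer size size_bytes, iteration count num_iterations, and the argument index 0 are all assumptions for illustration, not code from the tutorial:

cl_int err;
cl_mem input_words[2];
for (int i = 0; i < 2; i++) {
    // With CL_MEM_USE_HOST_PTR, XRT takes the host pointer from buffer_words_ext[i].obj
    input_words[i] = clCreateBuffer(context,
                                    CL_MEM_READ_ONLY | CL_MEM_USE_HOST_PTR | CL_MEM_EXT_PTR_XILINX,
                                    size_bytes, &buffer_words_ext[i], &err);
}

// Alternate the kernel argument between the two banks on consecutive enqueues
for (int iter = 0; iter < num_iterations; iter++) {
    clSetKernelArg(krnl, 0, sizeof(cl_mem), &input_words[iter % 2]);
    clEnqueueMigrateMemObjects(q, 1, &input_words[iter % 2], 0, 0, NULL, NULL);
    clEnqueueTask(q, krnl, 0, NULL, NULL);
}
clFinish(q);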