Currently, Xilinx devices on Data Center accelerator cards use stacked silicon consisting of several Super Logic Regions (SLRs) to provide device resources, including global memory. When assigning ports to global memory banks, as described in Mapping Kernel Ports to Memory, performance is best when the CU instance is placed in the same SLR as the global memory it connects to. In that case, manually assign the kernel instance, or CU, to the same SLR as that global memory.
A CU can be assigned to an SLR using the connectivity.slr option in a config file. The syntax of the connectivity.slr option is as follows:
[connectivity]
#slr=<compute_unit_name>:<slr_ID>
slr=vadd_1:SLR2
slr=vadd_2:SLR3
Where:
- <compute_unit_name> is the instance name of the CU as determined by the connectivity.nk option, described in Creating Multiple Instances of a Kernel, or is simply <kernel_name>_1 if multiple CUs are not specified.
- <slr_ID> is the SLR number to which the CU is assigned, in the form SLR0, SLR1, ...
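As an illustration of how the CU names defined by connectivity.nk carry over to connectivity.slr, here is a minimal sketch. It assumes a kernel named vadd and two hypothetical instance names, vadd_A and vadd_B. The SLR numbers are placeholders and must exist on your target platform.

```
[connectivity]
# Assumption: kernel "vadd" instantiated twice with custom CU names.
# nk=<kernel_name>:<number_of_CUs>:<cu_name>.<cu_name>...
nk=vadd:2:vadd_A.vadd_B

# The slr entries use the CU instance names, not the kernel name.
slr=vadd_A:SLR0
slr=vadd_B:SLR1
```

If the instance names were omitted (nk=vadd:2), the CUs would take the default names vadd_1 and vadd_2, and the slr lines would reference those names instead.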
Assigning a CU to an SLR is optional, but when used, the assignment must be specified separately for each CU. If an assigned CU is connected to global memory located in another SLR, the tool will automatically insert SLR crossing registers to help with timing closure. In the absence of an SLR assignment, the v++ linker is free to assign the CU to any SLR.
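To show how an SLR assignment can be paired with a memory-bank mapping so that no SLR crossing is needed, the following sketch combines connectivity.sp with connectivity.slr. The argument names (in1, in2, out) are hypothetical, and the placement of DDR[0] in SLR0 is an assumption for illustration only. The bank-to-SLR layout differs between platforms, so check the documentation for your card.

```
[connectivity]
# Assumption: on this hypothetical platform, DDR[0] is located in SLR0.
# sp=<compute_unit_name>.<argument>:<bank_name>
sp=vadd_1.in1:DDR[0]
sp=vadd_1.in2:DDR[0]
sp=vadd_1.out:DDR[0]

# Place the CU in the SLR that holds the bank it is connected to.
slr=vadd_1:SLR0
```

Keeping all ports of a CU on banks in one SLR, and the CU in that same SLR, is the arrangement the text above recommends for best performance.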
After editing the config file to include the SLR assignments, use it during the v++ linking process by specifying the config file using the --config option:
v++ -l --config config_slr.cfg ...