If a host machine has only one accelerator card installed, VSC tries to use that card and reports a fatal error if the card does not match the platform the design was compiled for. On a host machine with multiple cards installed, VSC by default picks the first card that exactly matches the platform the design was compiled for. This default can be overridden in two ways:
- Set the environment variable XILINX_SC_CARD to the desired <cardIndex>.
- From the host code, call the following API before making any other VPP_ACC calls: my_acc::add_card(<cardIndex>)
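The selection rules above can be sketched as a small standalone model. This is not VSC code and does not use the VSC runtime: the functions select_card and card_index_from_env are hypothetical names introduced only for illustration, and the sketch assumes that "exactly matches" means plain string equality on the platform name. It does not model what VSC does when an overridden index points at a card with a different platform.

```cpp
#include <cassert>
#include <climits>
#include <cstdlib>
#include <optional>
#include <string>
#include <vector>

// Hypothetical model of the card selection rule; not part of the VSC API.
// `installed` holds the platform name of each card, indexed by card index.
// Returns the chosen index, or std::nullopt where no usable card exists.
std::optional<int> select_card(const std::vector<std::string>& installed,
                               const std::string& design_platform,
                               std::optional<int> override_index = std::nullopt) {
    assert(!design_platform.empty());  // an empty platform name is a caller bug
    const int ncards = static_cast<int>(installed.size());
    // An explicit override (environment variable or add_card) replaces the default.
    if (override_index) {
        if (*override_index < 0 || *override_index >= ncards) return std::nullopt;
        return override_index;
    }
    // Default: the first card whose platform name is an exact string match.
    for (int i = 0; i < ncards; ++i) {
        if (installed[i] == design_platform) return i;
    }
    return std::nullopt;
}

// Reads XILINX_SC_CARD. Returns std::nullopt when the variable is unset,
// empty, negative, too large for int, or has trailing non-numeric characters.
std::optional<int> card_index_from_env() {
    const char* value = std::getenv("XILINX_SC_CARD");
    if (value == nullptr || *value == '\0') return std::nullopt;
    char* end = nullptr;
    const long parsed = std::strtol(value, &end, 10);
    if (*end != '\0' || parsed < 0 || parsed > INT_MAX) return std::nullopt;
    return static_cast<int>(parsed);
}
```

In this model, a host with one xilinx_u2_gen3x4_xdma_gc_base_2 card and one xilinx_u2_gen3x4_xdma_gc_2_202110_1 card would resolve a design compiled for the latter to index 1, unless an override index is supplied.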
The platform match is exact: xilinx_u2_gen3x4_xdma_gc_base_2 will not match a design compiled for xilinx_u2_gen3x4_xdma_gc_2_202110_1, even though the platforms are compatible.

As shown in the sysc_multi_card example in Supported Platforms and Startup Examples, if the host has identical accelerator cards installed, you can use multiple cards to run your VSC accelerator. This is supported in a mode where all CUs of a given card run as a separate compute cluster, as explained below.
The separate compute cluster mode is useful for improving performance in scenarios such as a U2 card with a local SmartSSD. The example code shown below creates CU clusters and assigns a card to each of them. The user code can then select data based on the index i, so that the subsequent compute() job automatically uses card-i because the selected data-i resides on the same SSD.
    // One compute cluster per card
    VPP_CC* cuCluster = new VPP_CC[ncards];
    for (int i = 0; i < ncards; ++i) {
        my_acc::add_card(cuCluster[i], i);  // assign card-i to cluster-i
    }
    // Set up the send and receive loops for each cluster
    for (int i = 0; i < ncards; ++i) {
        my_acc::send_while(
            [=]() -> bool {
                ... // data-i selection
            },
            cuCluster[i]);
        my_acc::receive_all_in_order(
            [=]() {
                ...
            },
            cuCluster[i]);
    }
    // Wait for every cluster to finish
    for (int i = 0; i < ncards; ++i) {
        my_acc::join(cuCluster[i]);
    }
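The data-i selection step is elided in the example above and depends on the application. As one possible illustration, the standalone sketch below groups input files by the card whose SSD holds them, then drives one send-style loop per card over its own group. Everything here is an assumption made for illustration: DataFile, group_by_card, run_while, and plan_jobs are hypothetical names, no VSC calls are made, and run_while only mimics a loop that invokes its body until the body returns false.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical record: a data file and the index of the card whose SSD holds it.
struct DataFile {
    std::string path;
    int card_index;
};

// Groups files by the card they reside on, so buckets[i] holds only data-i.
std::vector<std::vector<DataFile>> group_by_card(const std::vector<DataFile>& files,
                                                 int ncards) {
    std::vector<std::vector<DataFile>> buckets(ncards);
    for (const DataFile& f : files) {
        assert(f.card_index >= 0 && f.card_index < ncards);
        buckets[f.card_index].push_back(f);
    }
    return buckets;
}

// Stand-in for a send loop: calls `body` until it returns false.
void run_while(const std::function<bool()>& body) {
    while (body()) {
    }
}

// Returns, per card, the paths that the loop for card-i would submit.
std::vector<std::vector<std::string>> plan_jobs(const std::vector<DataFile>& files,
                                                int ncards) {
    const auto buckets = group_by_card(files, ncards);
    std::vector<std::vector<std::string>> submitted(ncards);
    for (int i = 0; i < ncards; ++i) {
        std::size_t next = 0;
        run_while([&, i]() -> bool {
            if (next == buckets[i].size()) return false;  // no data-i at all
            // Real host code would prepare buffers and submit a job here.
            submitted[i].push_back(buckets[i][next++].path);
            return next < buckets[i].size();  // continue while data-i remains
        });
    }
    return submitted;
}
```

Grouping up front keeps each per-card loop free of cross-card checks: the loop for index i only ever sees files that were tagged with card i.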