AI Engine to PL Interface Performance - 2024.2 English

AI Engine Kernel and Graph Programming Guide (UG1079)

Document ID
UG1079
Release Date
2024-11-28
Version
2024.2 English

Versal AI Core series devices include an AI Engine array with the following column categories.

PL column
provides PL stream access. Each column supports eight 64-bit slave channels for streaming data into the AI Engine and six 64-bit master channels for streaming data to the PL.
NoC column
provides connectivity between the AI Engine array and the NoC. These interfaces can also connect to the PL.

To instruct the AI Engine compiler to select higher frequency interfaces, use the --pl-freq=<number> option to specify the clock frequency (in MHz) for the PL kernels. The default value is one quarter of the AI Engine frequency and the maximum supported value is one half of the AI Engine frequency; both values depend on the speed grade. Following are examples:

  • Option to enable an AI Engine to PL frequency of 300 MHz for all AI Engine to PL interfaces:
    --pl-freq=300
  • To set a different frequency for a specific PLIO interface, set it in the ADF graph using the following code.
    adf::PLIO *<input> = new adf::PLIO(<logical_name>, <plio_width>, <file>, <FreqMHz>);
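
As a rough illustration of the default and maximum --pl-freq values described above, the following C++ sketch derives both from an AI Engine clock frequency. The type and function names (PlFreqLimits, pl_freq_limits, pl_freq_is_supported) are hypothetical helpers introduced here for illustration and are not part of the AI Engine tools; the actual AI Engine frequency depends on the speed grade of the device.

```cpp
#include <cassert>

// Hypothetical helper type (not part of the AI Engine tools): the PL clock
// values implied by the rules in the text for a given AI Engine clock.
struct PlFreqLimits {
    double default_mhz;  // value used when --pl-freq is not specified
    double max_mhz;      // highest supported --pl-freq value
};

// Default is one quarter and maximum is one half of the AI Engine frequency.
inline PlFreqLimits pl_freq_limits(double aie_freq_mhz) {
    assert(aie_freq_mhz > 0.0);
    return PlFreqLimits{aie_freq_mhz / 4.0, aie_freq_mhz / 2.0};
}

// True if a requested --pl-freq value is positive and does not exceed the
// supported maximum for the given AI Engine frequency.
inline bool pl_freq_is_supported(double requested_mhz, double aie_freq_mhz) {
    return requested_mhz > 0.0 &&
           requested_mhz <= pl_freq_limits(aie_freq_mhz).max_mhz;
}
```

For example, assuming a 1000 MHz AI Engine clock (an illustrative value), pl_freq_limits(1000.0) yields a default of 250 MHz and a maximum of 500 MHz, so --pl-freq=300 falls within the supported range.
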
Note: The following information applies to the AI Engine device architecture documented in Versal Adaptive SoC AI Engine Architecture Manual (AM009).

The AI Engine to PL AXI4-Stream channels use boundary logic interface (BLI) connections that include optional BLI registers, with the exception of slave channels 3 and 7, which are slower interfaces. The performance of the data transfer between the AI Engine and PL depends on whether the optional BLI registers are enabled.

For less timing-critical designs, all eight channels can be used without using the BLI registers. PL timing can still be met in this case. However, for higher frequency designs, only the six fast channels (0,1,2,4,5,6) can be used and the timing paths from the PL must be registered, using the BLI registers.
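
The trade-off between eight unregistered channels and six registered channels can be estimated with simple arithmetic. The following C++ sketch computes an idealized upper bound only: it assumes one 64-bit word is transferred on every PL clock cycle, which real designs may not sustain. The function names are hypothetical and not part of any AMD API.

```cpp
#include <cassert>

// Idealized peak throughput of one 64-bit channel in MB/s, assuming one
// 64-bit word (8 bytes) is transferred on every PL clock cycle.
inline double channel_peak_mbytes_per_s(double pl_freq_mhz) {
    assert(pl_freq_mhz > 0.0);
    return 8.0 * pl_freq_mhz;  // 8 bytes x (MHz = 1e6 cycles/s) = MB/s
}

// Idealized aggregate peak for the PL to AI Engine (slave) direction of one
// column: eight channels without BLI registers, six fast channels with them.
inline double column_slave_peak_mbytes_per_s(double pl_freq_mhz,
                                             bool bli_registers_enabled) {
    const int usable_channels = bli_registers_enabled ? 6 : 8;
    return usable_channels * channel_peak_mbytes_per_s(pl_freq_mhz);
}
```

Under these assumptions, eight unregistered channels at 125 MHz give at most 8000 MB/s per column, whereas six registered channels at 500 MHz give at most 24000 MB/s. Both frequencies are illustrative values, not recommendations.
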

To control the use of BLI registers across the AI Engine to PL channels, use the --pl-register-threshold=<number> compiler option, specified in MHz. The default value is one eighth of the AI Engine frequency, which depends on the speed grade. Following is an example:

  • --pl-register-threshold=125

    The compiler maps any PLIO interface with an AI Engine to PL frequency higher than this setting (125 MHz in this case) to high-speed channels with the BLI registers enabled. If the PLIO interface frequency is not higher than the pl-register-threshold value, any of the AI Engine to PL channels can be used.

In summary, if pl-freq is not higher than pl-register-threshold (pl-freq <= pl-register-threshold), all eight channels can be used unregistered. If pl-freq > pl-register-threshold, only the six fast channels can be used, with registering. The pl-register-threshold option therefore sets the frequency above which only the fast, registered channels are used.
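
The rule summarized above can be sketched as a small C++ function. This is a simplified model of the behavior as described in this section, not the compiler's actual placement algorithm; ChannelChoice and select_slave_channels are hypothetical names introduced for illustration.

```cpp
#include <cassert>
#include <vector>

// Hypothetical result type: which slave channels may be used and whether
// the BLI registers are enabled for them.
struct ChannelChoice {
    std::vector<int> channels;
    bool bli_registers_enabled;
};

// Simplified model of the rule in the text: above the threshold, only the
// six fast channels are used, with registering; otherwise all eight
// channels are available unregistered.
inline ChannelChoice select_slave_channels(double pl_freq_mhz,
                                           double pl_register_threshold_mhz) {
    assert(pl_freq_mhz > 0.0 && pl_register_threshold_mhz > 0.0);
    if (pl_freq_mhz > pl_register_threshold_mhz) {
        return ChannelChoice{{0, 1, 2, 4, 5, 6}, true};
    }
    return ChannelChoice{{0, 1, 2, 3, 4, 5, 6, 7}, false};
}
```

With a threshold of 125 MHz, a 300 MHz interface is limited to the six fast channels with BLI registers enabled, while an interface at 125 MHz or below may use any of the eight channels.
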