Data Bandwidth and Performance Tuning - 3.4 English

Versal Adaptive SoC CPM DMA and Bridge Mode for PCI Express Product Guide (PG347)

Document ID: PG347
Release Date: 2024-11-22
Version: 3.4 English

The CPM offers several main data interfaces. Which of them are available depends on the CPM subsystem functional mode in use. The following table shows the data interfaces that can serve as the primary data transfer interface for each functional mode.

Table 1. Available Data Interface for Each CPM Subsystem Functional Mode

Functional Mode    CPM_PCIE_NOC_0/1    NOC_CPM_PCIE_0    CPM_PL_AXI_0/1    AXI4 ST C2H/H2C
CPM4 QDMA          Yes (both)          No                N/A               Yes
CPM5 QDMA          Yes (both)          No                Yes (both)        Yes
CPM4 AXI Bridge    Yes (only one)      Yes               N/A               No
CPM5 AXI Bridge    Yes (only one)      Yes               Yes (only one)    No
CPM4 XDMA          Yes                 No                N/A               Yes

  1. CPM_PCIE_NOC_0/1: These interfaces are for AXI4-MM traffic which is mastered from within the CPM and exits to the NoC towards DDRMC/PL connections. Examples of such masters in the CPM include the CPM integrated DMA and the CPM integrated bridge master.
  2. NOC_CPM_PCIE_0: This interface is for AXI4-MM traffic which is mastered from internal PS or PL connections and exits from the NoC towards the CPM. An example of such a slave in the CPM is the CPM integrated bridge slave.
  3. CPM_PL_AXI_0/1: These interfaces are for AXI4-MM traffic which is mastered from within the CPM and exits to the PL directly. Examples of such masters in the CPM include the CPM integrated DMA and the CPM integrated bridge master. These interfaces are only available to CPM5 controller and DMA/Bridge instance 1 (not available to instance 0).
  4. AXI4 ST C2H/H2C: This interface is for inbound and outbound AXI4-ST traffic for the CPM integrated DMA.
Note: Certain data interfaces are unavailable based on the selected feature set for that particular functional mode. For more details on these restrictions, refer to the port description in the associated CPM subsystems section.
Note: Some data interfaces are shared among more than one feature set. Therefore, even though a particular mode does not use certain data interfaces, those interfaces can still be enabled and visible at the CPM boundary for other use.

The raw capacity for each AXI4 data interface is determined by multiplying the data width and the clock frequency. The net bandwidth depends on multiple factors, including but not limited to the packet overhead for given packet types. Achievable bandwidth might vary.

  • CPM_PCIE_NOC and NOC_CPM_PCIE: Fixed 128-bit wide at CPM_TOPSW_CLK frequency. The maximum frequency is dependent on the device speed grade.
  • CPM_PL_AXI: 64/128/256/512-bit data width is supported at cpm_pl_axi0_clk or cpm_pl_axi1_clk pin frequency.
  • AXI4 ST C2H/H2C: 64/128/256/512-bit data width is supported at dma_intrfc_clk pin frequency.
    Note: NoC clock frequency must be greater than the CPM_TOPSW_CLK clock frequency.
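
The raw-capacity rule above (data width multiplied by clock frequency) can be sketched in a few lines of Python. This is a minimal illustration only: the function name and the 1000 MHz clock value are assumptions chosen for the arithmetic, not device specifications. Actual clock limits depend on the device speed grade and the design configuration, and net bandwidth is lower than the raw figure.

```python
def raw_capacity_gbytes_per_s(data_width_bits: int, clock_mhz: float) -> float:
    """Raw AXI4 interface capacity in GB/s (10^9 bytes/s).

    Raw capacity = data width x clock frequency. Packet overhead and
    other factors that reduce net bandwidth are deliberately ignored.
    """
    bits_per_second = data_width_bits * clock_mhz * 1e6
    return bits_per_second / 8 / 1e9


# Placeholder clock frequency for illustration only (an assumption,
# not a value taken from the product guide or a data sheet).
EXAMPLE_CLOCK_MHZ = 1000.0

if __name__ == "__main__":
    for width in (64, 128, 256, 512):
        capacity = raw_capacity_gbytes_per_s(width, EXAMPLE_CLOCK_MHZ)
        print(f"{width}-bit at {EXAMPLE_CLOCK_MHZ:.0f} MHz: {capacity:.1f} GB/s raw")
```

Substitute the clock frequency actually configured for CPM_TOPSW_CLK, cpm_pl_axi0_clk/cpm_pl_axi1_clk, or dma_intrfc_clk to estimate the raw ceiling of the corresponding interface.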

The raw capacity for the PCIe link is determined by multiplying the number of PCIe lanes (x1/x2/x4/x8/x16) and their link speed (Gen1/Gen2/Gen3/Gen4/Gen5). The overhead of the link comes from the link layer encoding and Ordered Sets, CRC fields, packet framing, TLP headers and prefixes, and data bus alignment.
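
As a rough sketch of the link calculation, the following Python snippet multiplies the lane count by the per-lane transfer rate and applies only the line-encoding efficiency (8b/10b for Gen1/Gen2, 128b/130b for Gen3 and later). The per-lane rates are the standard PCIe values. The function and table names are illustrative, and the result is a per-direction upper bound: Ordered Sets, CRC fields, framing, TLP headers and prefixes, and bus alignment reduce it further.

```python
# Per-lane transfer rate (GT/s) and line-encoding efficiency per PCIe generation.
PCIE_GENERATIONS = {
    1: (2.5, 8 / 10),      # 8b/10b encoding
    2: (5.0, 8 / 10),      # 8b/10b encoding
    3: (8.0, 128 / 130),   # 128b/130b encoding
    4: (16.0, 128 / 130),
    5: (32.0, 128 / 130),
}
VALID_LANE_COUNTS = (1, 2, 4, 8, 16)


def pcie_link_capacity_gbytes_per_s(gen: int, lanes: int) -> float:
    """Per-direction link capacity in GB/s after line encoding only."""
    if gen not in PCIE_GENERATIONS:
        raise ValueError(f"unsupported PCIe generation: {gen}")
    if lanes not in VALID_LANE_COUNTS:
        raise ValueError(f"unsupported lane count: {lanes}")
    rate_gt_per_s, encoding_efficiency = PCIE_GENERATIONS[gen]
    return lanes * rate_gt_per_s * encoding_efficiency / 8


if __name__ == "__main__":
    for gen, lanes in ((3, 16), (4, 8), (4, 16), (5, 8)):
        capacity = pcie_link_capacity_gbytes_per_s(gen, lanes)
        print(f"Gen{gen} x{lanes}: {capacity:.2f} GB/s per direction")
```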

In the event that a particular PCIe link configuration has a higher bandwidth than the available data bus capacity of the AXI4 interface, more than one AXI4 interface must be used to sustain the maximum link throughput. This can be achieved in various ways. Here are some examples:

  • Load balance data transfer by allocating half of the enabled DMA queues or DMA channels to interface #0, and the other half to interface #1.
  • Share the available PCIe link bandwidth among different types of transfers. DMA streaming uses AXI4 ST C2H/H2C interface while DMA Memory Mapped uses CPM_PCIE_NOC or CPM_PL_AXI interfaces.
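
The first approach above can be sketched as follows. This is a planning aid under stated assumptions, not driver code: `interfaces_needed` and `split_queues` are hypothetical helper names, the bandwidth figures are example inputs, and the way queues or channels are actually steered to an interface is set through the DMA configuration described elsewhere in this guide.

```python
import math


def interfaces_needed(link_gbytes_per_s: float, interface_gbytes_per_s: float) -> int:
    """Number of AXI4 interfaces required to carry the given link bandwidth."""
    return math.ceil(link_gbytes_per_s / interface_gbytes_per_s)


def split_queues(queue_ids: list) -> dict:
    """Assign the first half of the enabled queues to interface 0
    and the remaining queues to interface 1."""
    half = len(queue_ids) // 2
    return {0: queue_ids[:half], 1: queue_ids[half:]}


if __name__ == "__main__":
    # Example inputs only (assumptions): a link of about 31.5 GB/s and an
    # AXI4 interface with 16 GB/s of raw capacity.
    print("interfaces needed:", interfaces_needed(31.5, 16.0))
    print("queue assignment:", split_queues(list(range(8))))
```

An even split assumes that the queues carry comparable traffic. If the load per queue is uneven, weight the assignment by expected throughput instead.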

AXI Bridge functional mode alone might not be able to sustain full PCIe link bandwidth in some link and device configurations because only one data interface is available per bridge instance. Therefore, the AXI Bridge functional mode is restricted to control and status accesses only; it is not intended to be used as a primary data mover. However, it can be paired with a DMA functional mode to make use of the remaining bandwidth. This functional mode has a variety of applications, including but not limited to root complex (RC) memory bridging and add-in-card peer-to-peer (P2P) operation. The achievable bandwidth in P2P use cases is complex and depends on many factors, including but not limited to the CPM DMA/Bridge bandwidth capabilities, whether the DMA or the Bridge is active (which depends on the initiator of the P2P operation), the peer capability, and the capability of any intervening switch component or root complex integrated switch.

You must also analyze the potential for head-of-line blocking and the request and response buffer size for each interface, and ensure that data transfers initiated within the system do not cause cyclic dependencies between interfaces or between different transfers. The PCIe and AXI specifications define data types, IDs, and request/response ordering requirements, and the CPM upholds those requirements. For more details on the CPM_PCIE_NOC and NOC_CPM_PCIE interfaces, refer to Versal Adaptive SoC Programmable Network on Chip and Integrated Memory Controller LogiCORE IP Product Guide (PG313). The CPM_PL_AXI_0/1 and AXI4 ST C2H/H2C interfaces are direct interfaces to the user PL region and give you the flexibility to attach your own data buffers and interconnect as required.
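
One way to reason about the cyclic dependencies mentioned above is to model the design as a directed graph in which an edge from A to B means "A cannot make progress until B does", and then search for a cycle. The sketch below is a generic depth-first search. The graph model and the node names in the example are assumptions made for illustration and do not come from a CPM tool or API.

```python
def find_cycle(depends_on: dict):
    """Return one dependency cycle as a list of nodes, or None if there is none.

    depends_on maps each node to the nodes it waits on.
    """
    WHITE, GREY, BLACK = 0, 1, 2   # unvisited, on the current path, finished
    color = {}
    path = []

    def visit(node):
        color[node] = GREY
        path.append(node)
        for nxt in depends_on.get(node, ()):
            state = color.get(nxt, WHITE)
            if state == GREY:
                # nxt is already on the current path, so the path closes a loop.
                return path[path.index(nxt):] + [nxt]
            if state == WHITE:
                found = visit(nxt)
                if found:
                    return found
        path.pop()
        color[node] = BLACK
        return None

    for node in list(depends_on):
        if color.get(node, WHITE) == WHITE:
            found = visit(node)
            if found:
                return found
    return None


if __name__ == "__main__":
    # Hypothetical example: two transfers that each wait on the other.
    example = {"transfer_a": ["transfer_b"], "transfer_b": ["transfer_a"]}
    print(find_cycle(example))
```

A reported cycle indicates a set of transfers or interfaces that can block one another. Breaking it typically means adding buffering or changing which interface carries one of the transfers.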