To maximize performance when using CPM, you must consider the following setup:
- Master AXI4 Ports
- Because there are two AXI4 MM ports, you must balance the packets accordingly and maximize bus utilization at both ports.
- When to use:
- Calculate your aggregated PCIe link throughput: This is PCIe Link Speed * PCIe Link Width.
- Calculate the throughput of one AXI4 MM port: This is 128 bits * CPMTOPSWCLK frequency (selected in the CPM GUI; speed grade dependent, consult your device/silicon data sheet).
- If the PCIe link throughput is greater than the AXI4 MM port throughput, you must use both ports. Note: If the two bandwidths are nearly equal, weigh the following considerations against the added design complexity of using both ports.
- The PCIe link has some TLP overhead, typically ~20-25%, depending on packet sizes, Max Payload Size, and Max Read Request Size settings. Unaligned address transfers and/or scattered host memory might also affect this number due to inefficient DMA transfers.
- The NoC has some overhead, typically ~6% on the write side due to metadata insertion, but is nearly optimal on the read side.
- If using DDR memory, there might be additional overhead depending on the traffic pattern and the DDR bank/column/row settings.
- How to use:
- Packets must not be split across both ports. The ports must operate as independently as possible to avoid head-of-line blocking due to AXI4 ID and PCIe tag ordering.
- QDMA
- Split your traffic based on Queue ID. Allocate some queues to the first port (AXI4-MM0) and the rest to the second port (AXI4-MM1).
- XDMA
- Traffic will be split automatically based on DMA channel ID. Even DMA channels route to AXI4-MM0 and odd DMA channels route to AXI4-MM1.
- AXI4 Bridge
- Use only one port, AXI4-MM0. Performance is therefore expected to max out at the AXI4 MM port throughput and might not reach the PCIe link throughput capability.
- Slave AXI4 MM Port
- Because there is only one AXI4 MM port, performance through this port is expected to max out at AXI4 MM port throughput only and might not be up to the PCIe link throughput capability.
- Master and Slave AXI4-ST Ports
- In this mode, CPM can only use AXI4-ST ports directly to the PL, and PL PCIe can only use AXI4-ST ports. You are therefore only required to operate your design at the same frequency and data bus width as the AXI4-ST interface from the CPM or PCIe PL IP.
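
The "When to use" comparison for the Master AXI4 ports can be sketched as a small calculation. The following is a minimal illustration only, not part of any CPM tooling. The function names are hypothetical, the link rate is taken as the raw per-lane rate in GT/s multiplied by the lane count (line encoding is ignored), and the 1000 MHz clock in the usage lines is a placeholder, not a documented CPMTOPSWCLK value. The TLP overhead is an optional input based on the ~20-25% range given above. The last helper mirrors the even/odd XDMA channel routing described above.

```python
def pcie_link_gbps(gt_per_s_per_lane: float, lanes: int) -> float:
    """Aggregated raw PCIe link throughput: link speed * link width.

    Assumption: raw per-lane rate in GT/s is treated as Gb/s (encoding ignored).
    """
    return gt_per_s_per_lane * lanes


def axi_mm_port_gbps(clock_mhz: float, width_bits: int = 128) -> float:
    """Throughput of one AXI4 MM port: data width (bits) * clock frequency."""
    return width_bits * clock_mhz / 1000.0


def needs_both_ports(link_gbps: float, port_gbps: float,
                     tlp_overhead: float = 0.0) -> bool:
    """True if the (optionally derated) link throughput exceeds one port.

    tlp_overhead is a fraction, e.g. 0.25 for 25% TLP overhead.
    """
    effective_link = link_gbps * (1.0 - tlp_overhead)
    return effective_link > port_gbps


def xdma_port_for_channel(channel_id: int) -> str:
    """Even XDMA channels route to AXI4-MM0, odd channels to AXI4-MM1."""
    return "AXI4-MM0" if channel_id % 2 == 0 else "AXI4-MM1"


# Usage with placeholder numbers (16 GT/s x16 link, hypothetical 1000 MHz clock).
link = pcie_link_gbps(16, 16)
port = axi_mm_port_gbps(1000)
print(link, port, needs_both_ports(link, port, tlp_overhead=0.25))
```

When the derated link throughput is only slightly above a single port's throughput, the other overheads listed above (NoC write-side overhead, DDR traffic pattern) may decide whether the second port is worth the added design complexity.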