Latency as a Function of NMU to DDRMC/BRAM Routing Path

Versal Adaptive SoC Programmable Network on Chip and Integrated Memory Controller 1.0 LogiCORE IP Product Guide (PG313)

Document ID: PG313
Release Date: 2023-11-01
Version: 1.0 English

When a traffic generator is assigned to an NMU site, the observed latency depends on the location of that NMU site relative to the DDRMC or block RAM. A longer route typically passes through more NoC components, which increases latency and, in turn, decreases read bandwidth. In multi-SLR (SSI technology) devices, the change in latency is more significant when the route crosses between SLRs.

The following figure illustrates this change in latency and read bandwidth. It plots read bandwidth and latency for a single traffic generator connected to an LPDDR4 interface through a DDRMC, with the traffic generator placed at various NMU sites along the leftmost VNoC (X-coordinate 0) of a multi-SLR (SSI technology) device. Read bandwidth decreases linearly as the NMU site Y-coordinate increases, that is, as the NMU moves farther from the DDRMC. Significant drops in read bandwidth occur at the SLR crossings (NMU site Y-coordinates 6 to 7, 12 to 13, and 18 to 19).

Figure 1. Plot of Read Bandwidth and Latency vs. Traffic Generator NMU Site
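
As an illustration only, the SLR crossings named above can be counted from an NMU site Y-coordinate. This is a minimal sketch, not a tool or API from this guide. The function name slr_crossings and the constant SLR_BOUNDARY_Y are hypothetical. The boundaries (6, 12, 18) are taken from the example device in the figure, and the sketch assumes the DDRMC sits at the low-Y end of the VNoC. Other devices may differ.

```python
# Hypothetical helper: count SLR crossings between the DDRMC and an NMU site.
# Assumption: the DDRMC is at the low-Y end of the VNoC, and SLR boundaries lie
# between Y=6/7, Y=12/13, and Y=18/19, as in the example device in the text.
SLR_BOUNDARY_Y = (6, 12, 18)  # last NMU Y-coordinate below each SLR boundary


def slr_crossings(nmu_y: int) -> int:
    """Return how many SLR boundaries lie between Y=0 and the given NMU site."""
    if nmu_y < 0:
        raise ValueError("NMU site Y-coordinate must be non-negative")
    return sum(1 for boundary in SLR_BOUNDARY_Y if nmu_y > boundary)


if __name__ == "__main__":
    for y in (0, 6, 7, 12, 13, 18, 19):
        print(f"NMU Y={y:2d}: {slr_crossings(y)} SLR crossing(s)")
```

Sites with the same crossing count lie in the same SLR, where the figure shows only the gradual decrease. Each increment in the count corresponds to one of the larger drops.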

This change in latency and read bandwidth is caused by traffic being routed through NPS blocks along the VNoC and HNoC, and through NIDB blocks that connect the VNoC between SLRs in multi-SLR (SSI technology) devices (see Figure 2). Differences in latency and bandwidth between VNoCs within the same SLR, with the DDRMC site held static, are consistent with routing through additional NPS blocks along the HNoC. This behavior is inherent to Versal devices and should be taken into consideration when designing to maximize NoC performance.
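
The relationship between route length, latency, and read bandwidth can be sketched with a simple model. This is a rough illustration under stated assumptions, not data from this guide. Every numeric value below (base latency, per-NPS and per-NIDB delay, outstanding reads, bytes per read) is a hypothetical placeholder, and the function names are invented for this example. It also assumes the traffic generator is limited by a fixed number of outstanding reads, so that Little's law applies (bandwidth equals data in flight divided by latency).

```python
# Toy model: latency grows with the number of NPS and NIDB blocks on the route,
# and read bandwidth falls as latency grows. All numbers are hypothetical
# placeholders chosen for illustration; they are NOT measured Versal values.

def estimated_latency_ns(nps_hops: int, nidb_crossings: int,
                         base_ns: float = 100.0,
                         nps_ns: float = 2.0,
                         nidb_ns: float = 20.0) -> float:
    """Latency = fixed base + per-NPS delay + per-NIDB (SLR crossing) delay."""
    return base_ns + nps_hops * nps_ns + nidb_crossings * nidb_ns


def read_bandwidth_gbps(latency_ns: float,
                        outstanding_reads: int = 8,
                        bytes_per_read: int = 256) -> float:
    """Little's law for a latency-bound master: bytes in flight / latency.

    Bytes per nanosecond is numerically equal to gigabytes per second.
    """
    return outstanding_reads * bytes_per_read / latency_ns


if __name__ == "__main__":
    # Compare a short route with a longer one that crosses one SLR boundary.
    for hops, crossings in ((2, 0), (8, 0), (8, 1)):
        latency = estimated_latency_ns(hops, crossings)
        bandwidth = read_bandwidth_gbps(latency)
        print(f"NPS hops={hops}, NIDB crossings={crossings}: "
              f"{latency:.1f} ns, {bandwidth:.2f} GB/s")
```

To use a model like this for a real design, replace the placeholder values with latencies measured on the target device.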