Assuming masters and slaves are capable of issuing and accepting multiple transactions, there is a relationship between throughput, latency, and outstanding transactions (OTs).
Consider the following example. Suppose you want to achieve maximum throughput in a single-clock-domain system using single-cycle requests. To do so, you must issue one request every cycle. Because response latency is non-zero, you must have multiple OTs, and the number of OTs required is a function of the slave’s latency. If the slave’s latency, measured in cycles, is less than or equal to the master’s maximum OTs, full throughput is possible. If the latency is longer, the master reaches its maximum OT count and stops issuing requests; it can then issue a new request only when a response arrives. In that case, the request ‘pipeline’ has gaps, and the achieved bandwidth falls below the maximum by an amount that grows with the slave’s latency.
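The relationship in this example can be illustrated with a small cycle-level model. The following is a minimal sketch, not a model of any specific hardware block: it assumes a fixed round-trip latency, single-cycle requests, and that an OT slot is freed in the same cycle its response arrives. The function names `simulate_throughput` and `ideal_throughput` are hypothetical. Under these assumptions, the sustained request rate is approximately min(1, maximum OTs / latency) requests per cycle.

```python
from collections import deque


def simulate_throughput(max_ot, latency, cycles=10_000):
    """Return requests issued per cycle for a master limited to `max_ot`
    outstanding transactions, talking to a slave with a fixed round-trip
    `latency` in cycles (an assumption of this sketch)."""
    in_flight = deque()  # completion cycle of each outstanding request
    issued = 0
    for now in range(cycles):
        # Retire every response that has arrived by this cycle.
        while in_flight and in_flight[0] <= now:
            in_flight.popleft()
        # Issue one single-cycle request if an OT slot is free.
        if len(in_flight) < max_ot:
            in_flight.append(now + latency)
            issued += 1
    return issued / cycles


def ideal_throughput(max_ot, latency):
    """Closed-form estimate of the same quantity under the same assumptions."""
    return min(1.0, max_ot / latency)


if __name__ == "__main__":
    for max_ot, latency in [(8, 8), (4, 8), (64, 100)]:
        sim = simulate_throughput(max_ot, latency)
        est = ideal_throughput(max_ot, latency)
        print(f"max_ot={max_ot:3d} latency={latency:3d} "
              f"simulated={sim:.2f} estimated={est:.2f}")
```

With a latency of 8 cycles, 8 OTs sustain one request per cycle, while 4 OTs sustain roughly one request every other cycle in this model.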
- NMU: 64 outstanding writes (up to 256 bytes) and up to 64 outstanding reads of 32 bytes each. The read reorder buffer (RROB) holds 64 32-byte entries. A read larger than 32 bytes consumes multiple entries.
- NSU: 32 outstanding writes and 32 outstanding reads.
- DDRMC: 32 outstanding transactions per channel, where each transaction is a read or write of burst length n, with n = 8 for DDR4 and n = 16 for LPDDR4.
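
The NMU read figures above imply that larger reads reduce the number of reads that can be outstanding at once. The following sketch works through that arithmetic. It assumes that RROB entries are allocated in whole 32-byte units and that the RROB capacity and the 64-read cap are the only limits; the helper names are hypothetical and other limits in a real system are not modeled.

```python
import math

RROB_ENTRIES = 64        # entries in the NMU read reorder buffer (from the list above)
RROB_ENTRY_BYTES = 32    # bytes per RROB entry (from the list above)
MAX_READS = 64           # maximum outstanding reads at the NMU (from the list above)


def rrob_entries_per_read(read_bytes):
    """Entries consumed by one read, assuming whole-entry allocation."""
    return math.ceil(read_bytes / RROB_ENTRY_BYTES)


def max_outstanding_reads(read_bytes):
    """Reads of `read_bytes` each that fit in the RROB at one time,
    capped at MAX_READS (assumes no other limit applies)."""
    return min(MAX_READS, RROB_ENTRIES // rrob_entries_per_read(read_bytes))


if __name__ == "__main__":
    for size in (32, 64, 128, 256):
        print(f"{size:3d}-byte reads: {rrob_entries_per_read(size)} entries each, "
              f"up to {max_outstanding_reads(size)} outstanding")
```

Under these assumptions, 64-byte reads consume two entries each, so at most 32 of them can be outstanding.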