HBM2e access latency is affected by several factors:
- Structural Latency
- This is due to the physical route from the source to the destination and the number of switches traversed along it. In routed cases, such as PTPG, multiple switches are traversed to reach the desired destination. Each switch adds latency, so limiting the number of switch hops should be considered to maximize performance.
- Queuing Latency
- This is caused by requests already sitting in queues, which must be serviced before a new request can proceed. Transactions are queued in storage elements in the NMUs, the switches, and the HBM controllers. If masters on the network issue enough outstanding transactions to fill these queues, transaction latency increases accordingly. In an oversubscribed use case, queuing latency has a much larger effect on overall latency than the structural latency of the physical route through the NoC.
- DRAM Congestion
- The latency within the DRAM controller depends on the DRAM state, such as whether the page being accessed is open or closed. Refreshes, page misses, and similar events also affect access latency.
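The three components above can be sketched as a simple additive model. This is an illustrative sketch only: the function name and all cycle counts below are hypothetical placeholders, not measured HBM2e or NoC figures.

```python
# Illustrative model of the three latency components described above.
# All cycle costs are hypothetical placeholders, not silicon numbers.

def access_latency_cycles(switch_hops, queued_requests, page_hit,
                          hop_cost=5, service_cost=4,
                          page_hit_cost=20, page_miss_cost=60):
    structural = switch_hops * hop_cost        # structural: per-switch traversal
    queuing = queued_requests * service_cost   # queuing: wait for requests ahead
    dram = page_hit_cost if page_hit else page_miss_cost  # DRAM page state
    return structural + queuing + dram

# Lightly loaded, short route, open page:
low = access_latency_cycles(switch_hops=2, queued_requests=1, page_hit=True)

# Oversubscribed: deep queues dominate even on the same physical route.
high = access_latency_cycles(switch_hops=2, queued_requests=30, page_hit=False)
```

With these placeholder costs, the oversubscribed case is dominated by the queuing term (120 of 190 cycles), mirroring the point that queuing latency outweighs structural latency under heavy load.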