- Back-pressure (refer to the Back-Pressure section in this chapter).
- Varying levels of parsing.
- If a packet has a small number of headers to be parsed it can have a lower latency through Vitis Networking P4.
- If a packet has a larger number of headers to be parsed it can have a higher latency through Vitis Networking P4.
An example of how to calculate the number of clock cycles required to flush the Vitis Networking P4 pipeline of all packets is as follows. The example assumes that m_axis_tready is held High to flush out the pipeline and that s_axis_tvalid is held Low to prevent new data from entering the pipeline. In this case, the calculated latency value can be treated as the maximum latency, the only exception being any header insert operations.
If the P4 program inserts new headers into the outgoing packets, some clock cycles need to be added to the calculated latency value. For each header inserted, the header size is divided by the packet bus width and rounded up to the nearest integer. This is Calculation A in the following calculation.
The number of packets that can be in the pipeline at any given time (based on the calculated latency, the minimum packet size, the maximum packet rate, etc.) should also be considered. This is Calculation B in the following calculation.
This number is then multiplied by the per-packet latency extension due to header inserts (Calculation A) to give the total latency extension due to header inserts. This is Calculation C in the following calculation.
The final result is achieved by adding the calculated latency to the result of Calculation C as shown in Calculation D.
For example:
- Calculated Latency from the Vitis Networking P4 GUI = 20 clock cycles
- Packet Bus Width = 64-bit
- Packet Rate = AXIS Clock Frequency
- Minimum Packet Size = 64 bytes
- P4 program can insert two new headers into the outgoing packets
- Header A = 48-bit
- Header B = 136-bit
Calculation A - Calculate the possible extensions to the latency per packet due to header inserts:
- Header A = ceil(48/64) = 1 clock cycle
- Header B = ceil(136/64) = 3 clock cycles
- Total = 3+1 = 4 clock cycles
Calculation B - Calculate number of packets (worst case) in the pipeline at a given time:
- ceil(Calculated Latency / (Minimum Packet Size in bits / Packet Bus Width)) = ceil(20 / (64*8/64)) = ceil(20/8) = 3 packets in progress
Calculation C - Total Extensions to the latency due to header inserts = 3*4 = 12 clock cycles
Calculation D - In this example, the maximum time to flush out the pipelines = 20 + 12 = 32 clock cycles
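Calculations A through D can also be expressed as a short script, which may be convenient when trying different bus widths or header sizes. The following is a minimal Python sketch and is not part of Vitis Networking P4. The function name `flush_cycles`, the `FlushEstimate` type, and the parameter names are hypothetical and chosen only for illustration. It assumes, as in the example above, that the packet rate equals the AXIS clock frequency and that m_axis_tready is held High while s_axis_tvalid is held Low.

```python
from typing import NamedTuple, Sequence


class FlushEstimate(NamedTuple):
    """Intermediate and final results, all in clock cycles except packets_in_pipeline."""
    insert_cycles_per_packet: int  # Calculation A
    packets_in_pipeline: int       # Calculation B
    total_insert_cycles: int       # Calculation C
    flush_cycles: int              # Calculation D


def ceil_div(numerator: int, denominator: int) -> int:
    """Integer division rounded up, avoiding floating-point rounding."""
    return -(-numerator // denominator)


def flush_cycles(calculated_latency: int,
                 bus_width_bits: int,
                 min_packet_bytes: int,
                 inserted_header_bits: Sequence[int]) -> FlushEstimate:
    # Calculation A: extra cycles per packet, one ceil() per inserted header.
    per_packet = sum(ceil_div(h, bus_width_bits) for h in inserted_header_bits)

    # Calculation B: worst-case packets in the pipeline, that is
    # ceil(latency / (min packet size in bits / bus width)),
    # rearranged so that only integers are divided.
    packets = ceil_div(calculated_latency * bus_width_bits, min_packet_bytes * 8)

    # Calculation C: total extension due to header inserts.
    total_insert = packets * per_packet

    # Calculation D: maximum number of cycles to flush the pipeline.
    return FlushEstimate(per_packet, packets, total_insert,
                         calculated_latency + total_insert)


if __name__ == "__main__":
    # Values from the worked example: 20-cycle latency, 64-bit bus,
    # 64-byte minimum packet, inserted headers of 48 and 136 bits.
    result = flush_cycles(20, 64, 64, [48, 136])
    print(result.flush_cycles)  # prints 32
```

With the values from the example, the sketch reproduces the intermediate results of 4 clock cycles, 3 packets, and 12 clock cycles, and the final result of 32 clock cycles.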