CTPIO Transmit Recovery - UG1586

Onload User Guide (UG1586)

Document ID: UG1586
Release Date: 2026-01-22
Revision: 1.4 English

The X4 Express datapath uses a new, performance-optimized CTPIO implementation with cut-through on PCIe, which further reduces send latency. As a consequence, if a PCIe error occurs during a CTPIO send, the host software must handle it, as detailed below.

The PCIe Gen5.0 specification states a maximum bit error rate (BER) of 1E-12. In extensive testing over multiple months on a range of OEM production servers, AMD has not seen evidence of any PCIe errors occurring with X4 adapters.

If a PCIe error occurs in the middle of a CTPIO send, the adapter sends an EF_EVENT_TYPE_TX_ERROR event to the host software.

Onload detects this situation. On detection, it increments the tx_error_events counter reported by onload_stackdump lots, and logs a TX_ERROR message such as:

oo:udpsend[65889]: netif: [0] intf 0 TX_ERROR 7 [ev:3]
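
For monitoring, it can help to pick such lines out of a log stream. The following is a minimal Python sketch, assuming the message always has the shape of the single sample above. The pattern is inferred from that one line only, and the field names (process, pid, stack, intf, code, ev) are hypothetical labels chosen for illustration; this guide does not define what each number means.

```python
import re
from typing import Optional

# Pattern inferred from the one sample line in this section.
# The group names are hypothetical labels, not documented field names.
TX_ERROR_RE = re.compile(
    r"oo:(?P<process>[^\[\s]+)\[(?P<pid>\d+)\]: "
    r"netif: \[(?P<stack>\d+)\] "
    r"intf (?P<intf>\d+) "
    r"TX_ERROR (?P<code>\d+) "
    r"\[ev:(?P<ev>\d+)\]"
)


def parse_tx_error(line: str) -> Optional[dict]:
    """Return the fields of a TX_ERROR log line, or None if the line does not match."""
    match = TX_ERROR_RE.search(line)
    if match is None:
        return None
    # Keep the process name as text and convert every numeric field to int.
    return {
        name: (value if name == "process" else int(value))
        for name, value in match.groupdict().items()
    }


if __name__ == "__main__":
    sample = "oo:udpsend[65889]: netif: [0] intf 0 TX_ERROR 7 [ev:3]"
    print(parse_tx_error(sample))
```

A log watcher could call parse_tx_error on each incoming line and raise an alert whenever the result is not None.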

By default, Onload recovers automatically by restarting the TXQ. Packets sent during the recovery period are dropped. To disable automatic recovery for a stack, set EF_ENABLE_TX_ERROR_RECOVERY=0.
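
As an illustration of applying this setting, the sketch below prepares the environment and command line for starting an application under the onload wrapper with automatic recovery disabled. It assumes the option is passed as an environment variable of the launched process, as with other EF_ options. The helper names and the ./udpsend application are hypothetical, and the sketch only builds and prints the command rather than running it.

```python
import os
import shlex
from typing import List, Mapping, Optional, Sequence


def env_without_tx_error_recovery(
    base_env: Optional[Mapping[str, str]] = None,
) -> dict:
    """Return a copy of the environment with automatic TX error recovery disabled."""
    env = dict(os.environ if base_env is None else base_env)
    # Value taken from the text above: 0 disables automatic recovery.
    env["EF_ENABLE_TX_ERROR_RECOVERY"] = "0"
    return env


def onload_command(app_argv: Sequence[str]) -> List[str]:
    """Prefix an application command line with the onload wrapper."""
    return ["onload", *app_argv]


if __name__ == "__main__":
    # "./udpsend" is a hypothetical application name used for illustration.
    argv = onload_command(["./udpsend"])
    env = env_without_tx_error_recovery()
    print("EF_ENABLE_TX_ERROR_RECOVERY=" + env["EF_ENABLE_TX_ERROR_RECOVERY"],
          shlex.join(argv))
    # To actually start the process: subprocess.run(argv, env=env, check=True)
```

Because the environment is copied rather than modified in place, other processes started from the same launcher keep the default behavior.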

TCP normally retransmits any outgoing packets that have not been acknowledged. If TCP retransmission is undesirable for your use case, set EF_TCP_RST_DELAYED_CONN=1 so that when a TCP retransmission timeout (RTO) occurs, the connection is reset instead of the packet being resent.
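
The two behaviors described above can be summarized as a small decision model. This is only an illustrative sketch of the documented outcome, not Onload code: the function and enum names are hypothetical, and treating an unset variable as the retransmit case is an assumption based on the statement that TCP normally retransmits.

```python
from enum import Enum
from typing import Mapping


class RtoAction(Enum):
    RETRANSMIT = "retransmit"  # resend the unacknowledged packet
    RESET = "reset"            # reset the TCP connection instead


def action_on_rto(env: Mapping[str, str]) -> RtoAction:
    """Model which action follows a retransmission timeout for a given environment."""
    # Assumption: unset or any value other than "1" means normal retransmission.
    if env.get("EF_TCP_RST_DELAYED_CONN", "0") == "1":
        return RtoAction.RESET
    return RtoAction.RETRANSMIT


if __name__ == "__main__":
    print(action_on_rto({}))
    print(action_on_rto({"EF_TCP_RST_DELAYED_CONN": "1"}))
```

With the reset behavior selected, the application sees a connection failure and must decide for itself whether to reconnect and resend.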