Receive Flow Steering (RFS) - UG1739

AMD Solarflare X4 Series Ethernet Adapter User Guide (UG1739)

Document ID: UG1739
Release Date: 2025-10-24
Revision: 1.0 English

RFS attempts to steer packets to the core where a receiving application is running. This reduces the need to move data between processor caches, and can significantly reduce latency and jitter. Modern NUMA systems, in particular, can benefit substantially from RFS where packets are delivered into memory local to the receiving thread.

Unlike RSS, which selects a CPU from a CPU affinity mask set by an administrator or user, RFS records the identifier of the CPU core on which the application is running whenever its process calls recvmsg() or sendmsg().

  • A hash is calculated from a packet’s addresses or ports (2-tuple or 4-tuple) and serves as the consistent hash for the flow associated with the packet.
  • Each receive queue has an associated list of CPUs to which RFS can enqueue the received packets for processing.
  • For each received packet, an index into the CPU list is computed from the flow hash modulo the size of the CPU list.
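The index calculation in the last bullet can be illustrated with plain shell arithmetic. This is only a sketch of the modulo step: the function name cpu_list_index and the sample hash value are made up for illustration, and the real computation happens inside the kernel.

```shell
# Illustrative only: map a flow hash to a position in a CPU list.
# $1 = flow hash (decimal), $2 = number of CPUs in the list.
cpu_list_index() {
    echo $(( $1 % $2 ))
}
```

For example, `cpu_list_index 2654435761 8` prints 1, so every packet carrying that hash would be placed on the second entry of an eight-entry CPU list.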

There are two types of RFS implementation:

  • Soft RFS.

    Soft RFS is a software feature supported since Linux 2.6.35 that attempts to schedule protocol processing of incoming packets on the same processor as the user thread that consumes the packets.

  • Hardware (or Accelerated) RFS.

    Accelerated RFS is supported since Linux kernel version 2.6.39.

RFS can dynamically change which CPU is assigned to a packet or packet stream, and this introduces the possibility of out-of-order packets. To prevent out-of-order data, two tables are created that hold state information used in the CPU selection:

  • Global_flow_table: Identifies the number of simultaneous flows that are managed by RFS.
  • Per_queue_table: Identifies the number of flows that can be steered to a queue. This holds state as to when a packet was last received.

The tables support the steering of incoming packets from the network adapter to a receive queue affinitized to a CPU where the application is waiting to receive them. The AMD Solarflare accelerated RFS implementation requires configuration through the two tables and the ethtool -K command.

The following sub-sections identify the RFS configuration procedures:

Kernel Configuration

Before using RFS the kernel must be compiled with the kconfig symbol CONFIG_RPS enabled. Accelerated RFS is only available if the kernel is compiled with the kconfig symbol CONFIG_RFS_ACCEL enabled.
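One way to check both symbols on a running system is to search the kernel configuration file. The sketch below assumes the distribution installs that file as /boot/config-<kernel release>, which is common but not universal (some kernels expose /proc/config.gz instead). The function name check_rfs_kconfig is hypothetical.

```shell
# Report whether the two kconfig symbols named above are built in.
# $1 = kernel config file (assumed default: /boot/config-$(uname -r)).
check_rfs_kconfig() {
    cfg=${1:-/boot/config-$(uname -r)}
    for sym in CONFIG_RPS CONFIG_RFS_ACCEL; do
        if grep -q "^${sym}=y" "$cfg"; then
            echo "$sym=y"
        else
            echo "$sym not set"
        fi
    done
}
```

Calling `check_rfs_kconfig` with no argument inspects the configuration of the running kernel. A path can be passed to inspect a different configuration file.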

Global Flow Count

Configure the number of simultaneous flows that will be managed by RFS. The suggested flow count depends on the expected number of active connections at any given time and can be less than the number of open connections. The value is rounded up to the nearest power of two.

# echo 32768 > /proc/sys/net/core/rps_sock_flow_entries
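Because the value is rounded up to a power of two, a requested count that is not already a power of two results in a larger table. The following sketch shows the rounding arithmetic only. The helper name round_up_pow2 is illustrative and is not part of the kernel or the driver.

```shell
# Print the smallest power of two that is >= $1.
round_up_pow2() {
    n=$1
    p=1
    while [ "$p" -lt "$n" ]; do
        p=$(( p * 2 ))
    done
    echo "$p"
}
```

For example, `round_up_pow2 20000` prints 32768. The value currently in effect can be read back with `cat /proc/sys/net/core/rps_sock_flow_entries`.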

Per Queue Flow Count

Each adapter interface has a ‘queues’ directory containing one ‘rx-N’ or ‘tx-N’ subdirectory for each queue associated with the interface, where N is the queue index. For RFS, only the receive queues are relevant.

# cd /sys/class/net/eth3/queues

Within each ‘rx-N’ subdirectory, the rps_flow_cnt file holds the number of entries in the per-queue flow table. If only a single queue is used, rps_flow_cnt is the same as rps_sock_flow_entries. When multiple queues are configured, the count is equal to rps_sock_flow_entries/N, where N is the number of queues.

For example, if rps_sock_flow_entries = 32768 and there are 16 queues, then you must configure rps_flow_cnt for each queue as 2048. The commands for the first two queues are shown here; repeat them for each remaining queue:

# echo 2048 > /sys/class/net/eth3/queues/rx-0/rps_flow_cnt
# echo 2048 > /sys/class/net/eth3/queues/rx-1/rps_flow_cnt
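Rather than typing one command per queue, the same assignment can be scripted. The following is a minimal sketch that counts the rx-N directories and writes rps_sock_flow_entries/N into each one. The function name set_rps_flow_cnt and its optional third argument (an alternative sysfs root, useful for dry runs) are assumptions made for this example. Writing to the real sysfs files requires root privileges.

```shell
# Usage: set_rps_flow_cnt <iface> <rps_sock_flow_entries> [sysfs_net_root]
set_rps_flow_cnt() {
    iface=$1
    entries=$2
    root=${3:-/sys/class/net}

    # Count the receive queues exposed for the interface.
    nqueues=0
    for q in "$root/$iface"/queues/rx-*; do
        [ -d "$q" ] && nqueues=$(( nqueues + 1 ))
    done
    if [ "$nqueues" -eq 0 ]; then
        echo "no rx queues found for $iface" >&2
        return 1
    fi

    # Divide the global flow count evenly across the receive queues.
    per_queue=$(( entries / nqueues ))
    for q in "$root/$iface"/queues/rx-*; do
        [ -d "$q" ] || continue
        echo "$per_queue" > "$q/rps_flow_cnt"
    done
    echo "$per_queue"
}
```

With the values from the example above, `set_rps_flow_cnt eth3 32768` on a 16-queue interface writes 2048 to every rps_flow_cnt file and prints 2048.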

Disable RFS

To turn off RFS, use the following command:

# ethtool -K <devname> ntuple off
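To confirm the resulting state, the feature list printed by `ethtool -k <devname>` (lowercase k) can be filtered for the ntuple-filters entry. The sketch below assumes the usual "name: value" layout of that output. The helper name ntuple_state is hypothetical.

```shell
# Read `ethtool -k` output on stdin and print the ntuple-filters value.
ntuple_state() {
    awk -F': ' '/^[[:space:]]*ntuple-filters:/ { print $2; exit }'
}
```

Typical use is `ethtool -k <devname> | ntuple_state`, which should print off after the command above. Running `ethtool -K <devname> ntuple on` turns the feature back on.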