Bump-in-the-Wire (BITW)

Flow Offload Reference Pipeline User Guide (UG1670)

Document ID: UG1670
Release Date: 2024-02-15
Revision: 1.1 English

Bump-in-the-wire (BITW), shown in the following figure, is one of the most common use cases for the AMD Pensando™ DPU. Packets are received from the network, the desired transformations and policies are applied, and the packets are sent back out to the network. In this case, the host (PCIe®) interface is not used. Packets received from the network enter on one of the Ethernet MAC interfaces and are placed in the packet buffer (PB) according to their Layer-2 (L2) or Layer-3 (L3) class of service (COS). The packet buffer provides pause absorption buffering and COS-aware arbitration for the P4 Ingress pipeline.

Figure 1. BITW

A summary follows. For additional details, refer to the DPU theory of operations document.

As directed by the P4 program, packet headers pass through the P4 Ingress pipeline for flow classification, firewall, tunnel endpoint processing, IPsec via the inline crypto block, and other ingress services. The packet headers then return to the PB and either leave the device through an Ethernet MAC port or enter the P4 Egress pipeline if further processing is required. Examples of further processing include replication and additional functions such as network address translation (NAT), policers, and telemetry. After all functions are applied, the packets are sent out from the PB through one of the Ethernet MAC ports.

In this scenario, the packet enters the PB on the first Ethernet port. The PB adds the intrinsic global header and sends the packet to the P4 Ingress pipeline:

[22-10-14 20:27:04] P4 :: PBC-MODEL: RECEIVED PACKET ON PORT 0
[22-10-14 20:27:04] P4 :: PBC-MODEL: SENDING PACKET ON PORT 7

The first block in the P4 Ingress pipeline is the parser. The parser extracts fields from the packet header according to the P4 parser program and writes them into the Packet Header Vector (PHV). It also adds the P4 intrinsic header to the PHV. A typical sequence of parser state transitions is shown below.

[22-10-14 20:27:04] P4 :: P4IG_PPA: STATE TRANSITIONS  
[22-10-14 20:27:04] P4 ::                 start/1 
[22-10-14 20:27:04] P4 ::                 parse_uplink/2 
[22-10-14 20:27:04] P4 ::                 parse_ingress_packet/6 
[22-10-14 20:27:04] P4 ::                 parse_ipv4_len_chk_1/20 
[22-10-14 20:27:04] P4 ::                 parse_ipv4_1/21 
[22-10-14 20:27:04] P4 ::                 parse_ipv4_base_1/24 
[22-10-14 20:27:04] P4 ::                 parse_ipv4_checksum_1/25 
[22-10-14 20:27:04] P4 ::                 parse_udp_1/26 
[22-10-14 20:27:04] P4 ::                 accept/52
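
When comparing traces across packets, it can help to reduce the state transition log to an ordered list of states. The following is a minimal sketch, not part of the ASIC model tooling: the function name `parse_state_transitions` is hypothetical, and the number after each slash is assumed to be a parser state identifier.

```python
import re

# Parser trace copied from the log above.
TRACE = """\
[22-10-14 20:27:04] P4 :: P4IG_PPA: STATE TRANSITIONS
[22-10-14 20:27:04] P4 ::                 start/1
[22-10-14 20:27:04] P4 ::                 parse_uplink/2
[22-10-14 20:27:04] P4 ::                 parse_ingress_packet/6
[22-10-14 20:27:04] P4 ::                 parse_ipv4_len_chk_1/20
[22-10-14 20:27:04] P4 ::                 parse_ipv4_1/21
[22-10-14 20:27:04] P4 ::                 parse_ipv4_base_1/24
[22-10-14 20:27:04] P4 ::                 parse_ipv4_checksum_1/25
[22-10-14 20:27:04] P4 ::                 parse_udp_1/26
[22-10-14 20:27:04] P4 ::                 accept/52
"""

# Matches "<state_name>/<number>" at the end of a log line.
# The header line has no slash, so it is skipped.
STATE_RE = re.compile(r"P4 ::\s+(\w+)/(\d+)\s*$")


def parse_state_transitions(trace):
    """Return the visited parser states as (name, number) tuples, in order."""
    states = []
    for line in trace.splitlines():
        match = STATE_RE.search(line)
        if match:
            states.append((match.group(1), int(match.group(2))))
    return states


states = parse_state_transitions(TRACE)
print(" -> ".join(name for name, _ in states))
```

For this trace, the resulting path runs from `start` through the IPv4 and UDP states to `accept`, which matches a UDP-over-IPv4 packet arriving on an uplink.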

The parser next sends the PHV to the first P4 stage of the pipeline:

[22-10-14 20:27:04] P4 :: P4IG_STG0: PHV DECODE  
[22-10-14 20:27:04] P4 ::      __capri_intrinsic.tm_iport [511:508] = 0x0 
[22-10-14 20:27:04] P4 ::      intr_global.tm_iport [511:508] = 0x0 
[22-10-14 20:27:04] P4 ::      __capri_intrinsic.tm_oport [507:504] = 0x0 
[22-10-14 20:27:04] P4 ::      intr_global.tm_oport [507:504] = 0x0 
[22-10-14 20:27:04] P4 ::      __capri_intrinsic.tm_iq [503:499] = 0x0 
[22-10-14 20:27:04] P4 ::      intr_global.tm_iq [503:499] = 0x0 
[22-10-14 20:27:04] P4 ::      __capri_intrinsic.lif [498:488] = 0x1 
[22-10-14 20:27:04] P4 ::      intr_global.lif [498:488] = 0x1 
…
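
The `[hi:lo]` ranges in the PHV decode can be read as inclusive bit positions within the PHV. The sketch below illustrates that reading under two assumptions that the log does not confirm: the PHV is modeled as a single Python integer, and bit 0 is its least significant bit. The helper name `extract_field` and the sample PHV value are hypothetical.

```python
def extract_field(phv, hi, lo):
    """Return the value of the inclusive bit range [hi:lo] of an integer PHV."""
    if hi < lo:
        raise ValueError("expected hi >= lo")
    width = hi - lo + 1
    return (phv >> lo) & ((1 << width) - 1)


# Field positions taken from the PHV DECODE log above.
FIELDS = {
    "intr_global.tm_iport": (511, 508),
    "intr_global.tm_oport": (507, 504),
    "intr_global.tm_iq": (503, 499),
    "intr_global.lif": (498, 488),
}

# Hypothetical PHV image consistent with the log: only lif is non-zero (0x1).
phv = 0x1 << 488

decoded = {name: extract_field(phv, hi, lo) for name, (hi, lo) in FIELDS.items()}
for name, value in decoded.items():
    hi, lo = FIELDS[name]
    print(f"{name} [{hi}:{lo}] = {value:#x}")
```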

At each stage, in addition to the decoded PHV fields, the ASIC model prints other information helpful for debugging, such as which tables are launched in the stage and which fields form the lookup keys. In the example below, a hash table with an overflow TCAM is launched; the key is 25 bits wide and occupies the leftmost part of the key register (big-endian convention):

[22-10-20 08:00:42] P4 :: P4IG_STG2_TE: LAUNCH TABLE ID 1 : vlan_mapping : Hash_OTcam :  K=[0:24] 
[22-10-20 08:00:42] P4 ::         FULL_KEY:0: key_maker hardware id=0 : key_maker profile=0 
[22-10-20 08:00:42] P4 ::                BYTE:  0     FIELD: metadata.classic_nic.vlan_id[4:11](K) = 0x0 
[22-10-20 08:00:42] P4 ::                BYTE:  0     FIELD: phv_byte(122) = 0x0 
[22-10-20 08:00:42] P4 ::                BYTE:  1     FIELD: metadata.classic_nic.l2seg[4:11](K) = 0x1 
[22-10-20 08:00:42] P4 ::                BYTE:  1     FIELD: phv_byte(185) = 0x1 
[22-10-20 08:00:42] P4 ::                BYTE:  2 : BIT_EXTRACTION_0 : BIT 7     FIELD: metadata.classic_nic.vlan_id[0:0](K) = 0x0 
[22-10-20 08:00:42] P4 ::                BYTE:  2 : BIT_EXTRACTION_0 : BIT 7     FIELD: phv_bit(972) = 0x0 
[22-10-20 08:00:42] P4 ::                BYTE:  2 : BIT_EXTRACTION_0 : BIT 6     FIELD: metadata.classic_nic.vlan_id[1:1](K) = 0x0 
…
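
The table launch line carries the stage, table ID, table name, table type, and key range in a fixed layout, so it can be picked apart with a regular expression. This is a sketch that assumes the layout seen in the single line above holds in general; `parse_table_launch` is a hypothetical helper, not part of the ASIC model.

```python
import re

# Table launch line copied from the log above.
LINE = ("[22-10-20 08:00:42] P4 :: P4IG_STG2_TE: LAUNCH TABLE ID 1 : "
        "vlan_mapping : Hash_OTcam :  K=[0:24] ")

LAUNCH_RE = re.compile(
    r"(?P<stage>\w+): LAUNCH TABLE ID (?P<table_id>\d+) : (?P<name>\w+) : "
    r"(?P<kind>\w+) :\s+K=\[(?P<first>\d+):(?P<last>\d+)\]"
)


def parse_table_launch(line):
    """Return the launch details as a dict, or None if the line does not match."""
    match = LAUNCH_RE.search(line)
    if match is None:
        return None
    first, last = int(match["first"]), int(match["last"])
    return {
        "stage": match["stage"],
        "table_id": int(match["table_id"]),
        "name": match["name"],
        "kind": match["kind"],
        "key_bits": last - first + 1,  # the K=[first:last] range is inclusive
    }


info = parse_table_launch(LINE)
print(info)
```

The inclusive range `K=[0:24]` gives `key_bits` of 25, which agrees with the 25-bit key described in the text.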

Once the table engine (TE) block completes the table entry read, it selects one of the available match processing units (MPUs) to execute the corresponding action and provides it with the following:

  • The data read from the table
  • The data extracted from the PHV
  • Whether the table has been hit
  • Whether there was an error reading the table
  • Other information needed to execute the action

The ASIC model includes an Instruction Set Simulator (ISS) for the MPU instruction set, which displays each executed instruction and its result:

[  1]: 10963a880: 51000180018002c0       bcf           [c2 | c3], 0x10963a940 
[  2]: 10963a888: 7d1a1e0006000020   D   seq           c3, k[271], 1 
# ALU(0x0, 0x1, 0x0, 0x0) = 0x0 
# c3 <- 0 
[  3]: 10963a890: 5200121800200080       bbne          k[268], 1, 0x10963a8c0 
# ALU(0x1, 0x1, 0x0, 0x0) = 0x0 
[  4]: 10963a898: 1900000000000000   D   nop           
[  5]: 10963a8a0: e000260040000020       phvwri        p[154:152], 0x1 
# phvwr [154:152] <- 0x1
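
Two of the traced results can be reproduced with a few lines of Python. This is only an illustration of what the trace shows, not a description of the MPU instruction set: `seq` is assumed to set its flag when the two operands are equal (consistent with `c3 <- 0` for operands 0x0 and 0x1), the PHV is modeled as an integer with bit 0 as the least significant bit, and the helper names are hypothetical.

```python
def write_field(phv, hi, lo, value):
    """Return phv with the inclusive bit range [hi:lo] replaced by value."""
    width = hi - lo + 1
    mask = ((1 << width) - 1) << lo
    return (phv & ~mask) | ((value << lo) & mask)


def seq(a, b):
    """Assumed semantics of 'seq': the flag is 1 if the operands are equal, else 0."""
    return int(a == b)


# Instruction 2 in the trace: the ALU operands were 0x0 and 0x1, and c3 was cleared.
c3 = seq(0x0, 0x1)

# Instruction 5 in the trace: phvwri p[154:152], 0x1
phv = write_field(0, 154, 152, 0x1)

print(f"c3 <- {c3}")
print(f"phv[154:152] = {(phv >> 152) & 0x7:#x}")
```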

To simplify debugging, the changes to the PHV in each stage are printed after completing execution of all actions in the stage:

[22-10-14 20:27:04] P4 :: P4IG_STG2: PHV DIFF/CHANGE DECODE  
[22-10-14 20:27:04] P4 ::      __capri_intrinsic.tm_iq [503:499] in=0x0 --> out=0x18 
[22-10-14 20:27:04] P4 ::      intr_global.tm_iq [503:499] in=0x0 --> out=0x18 
[22-10-14 20:27:04] P4 ::      __capri_p4_intrinsic.packet_len [349:336] in=0x0 --> out=0x43 
[22-10-14 20:27:04] P4 ::      intr_p4.packet_len [349:336] in=0x0 --> out=0x43 
[22-10-14 20:27:04] P4 ::      __user_metadata.control_metadata.flow_miss [335:335] in=0x0 --> out=0x1 
[22-10-14 20:27:04] P4 ::      metadata.control_metadata.flow_miss [335:335] in=0x0 --> out=0x1
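
The diff view can be thought of as comparing the same fields in the PHV before and after a stage and reporting only those that changed. The sketch below shows that idea under stated assumptions: the PHV is modeled as a Python integer with bit 0 as the least significant bit, the `before` and `after` values are constructed by hand to match the log, and `phv_diff` is a hypothetical helper rather than the ASIC model's implementation.

```python
def field(phv, hi, lo):
    """Return the inclusive bit range [hi:lo] of an integer PHV."""
    return (phv >> lo) & ((1 << (hi - lo + 1)) - 1)


def phv_diff(before, after, layout):
    """Return {name: (old, new)} for every field whose value changed."""
    changes = {}
    for name, (hi, lo) in layout.items():
        old, new = field(before, hi, lo), field(after, hi, lo)
        if old != new:
            changes[name] = (old, new)
    return changes


# Field positions taken from the logs in this section.
LAYOUT = {
    "intr_global.tm_iq": (503, 499),
    "intr_global.lif": (498, 488),  # unchanged here, so it is not reported
    "intr_p4.packet_len": (349, 336),
    "metadata.control_metadata.flow_miss": (335, 335),
}

# Hand-built PHV images that reproduce the values in the diff log.
before = 0x1 << 488
after = before | (0x18 << 499) | (0x43 << 336) | (0x1 << 335)

changes = phv_diff(before, after, LAYOUT)
for name, (old, new) in changes.items():
    hi, lo = LAYOUT[name]
    print(f"{name} [{hi}:{lo}] in={old:#x} --> out={new:#x}")
```

In this example, `packet_len` changes from 0x0 to 0x43, which is 67 in decimal.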