IPsec_GW Reference Pipeline User Guide (UG1671)

Document ID
Release Date
1.1 English

This guide presents an overview of the IPsec_GW reference pipeline (Transport and Tunnel mode). It provides details of the P4 program that you can load onto an AMD Pensando™ second generation ("Elba") data processing unit (DPU), or run within the provided x86 simulator. P4 pipeline programmability gives you flexible software-defined constructs for developing networking software quickly, loading it onto an AMD Pensando DPU, and testing it. The reference pipelines supply P4_16 source code for the P4I and P4E (protocol processing) modules, and P4_16-based libraries in binary format for P4 RxDMA and P4 TxDMA, which you can access via APIs to handle message transfer with the host and local CPU. This combination of source code and libraries facilitates the implementation of the provided reference pipeline. You can deploy the provided P4 binary alongside your own P4 code on the same DPU to integrate the functionality into your system, which streamlines implementation and helps ensure compatibility.

In cloud service provider, enterprise, and public sector environments, encryption services are commonly delivered via Edge/VPN gateway physical or virtual (VNF) appliances. However, this approach compromises performance and drives up costs if the encryption offering needs to scale the number of tunnels or if high IPsec throughput is required. Physical appliances tend to increase in cost and size as performance requirements increase for both throughput and tunnel scale. VNF or other software-only solutions typically pin a tunnel to a single CPU and can only achieve a peak throughput of 1.25-2.5 Gb/s per tunnel. Performance challenges of software-only IPsec solutions include:

High CPU utilization
Encryption and decryption of IPsec packets can be CPU-intensive, leading to high CPU utilization on the server or appliance. This can impact the performance of other applications running on the same server or appliance.
Limited throughput
CPU speed limits the throughput of an IPsec tunnel on a software-only solution. This can be a problem for networks that need to support high-bandwidth traffic.
Increased latency
Encryption and decryption of IPsec packets can add latency to network traffic. This can be a problem for applications that are sensitive to latency, such as VoIP and video conferencing.
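
To put the per-tunnel ceiling in perspective, the following back-of-the-envelope sketch estimates how many CPU-pinned tunnels a software-only gateway would need to fill a link. The 1.25 and 2.5 Gb/s figures come from the text above. The function name, the example link speeds, and the assumption that traffic spreads perfectly evenly across tunnels are illustrative only.

```python
import math


def tunnels_needed(link_gbps: float, per_tunnel_gbps: float) -> int:
    """Minimum number of single-CPU tunnels needed to fill a link.

    Assumes traffic can be spread perfectly evenly across tunnels,
    which is optimistic for real workloads.
    """
    if link_gbps <= 0 or per_tunnel_gbps <= 0:
        raise ValueError("rates must be positive")
    return math.ceil(link_gbps / per_tunnel_gbps)


for link in (10, 100):
    best = tunnels_needed(link, 2.5)    # upper end of the per-tunnel range
    worst = tunnels_needed(link, 1.25)  # lower end of the per-tunnel range
    print(f"{link} Gb/s link: {best} to {worst} tunnels (one CPU core each)")
```

Under these assumptions, a 100 Gb/s link would need between 40 and 80 tunnels, each pinned to its own CPU core, and a single flow could still never exceed the per-tunnel limit.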

When there is a need to encrypt high-speed links or provide a scalable encryption service that does not consume racks of CPUs, P4-programmable DPUs provide a more scalable and performant solution.

Customers of cloud providers often want to encrypt on-ramp circuits between colocation facilities, enterprise data centers, and their cloud resources. High-speed cloud on-ramp circuits offer links from below 1 Gb/s up to 100 Gb/s with one or more IPsec tunnels, which are challenging to encrypt using current IPsec implementations that rely on CPUs.

The AMD Pensando DPU can offload encryption services from the x86 server, significantly increasing throughput per tunnel and the number of tunnels supported, without requiring additional compute resources from a software-only solution (VNF) or relying on large appliances. Third-party vendors can use the AMD Pensando DPU to improve the IPsec throughput and tunnel scale of their appliance offerings, while also reducing their footprint.

The benefits of using an AMD Pensando DPU for encryption services include:

Scalability
DPUs and current SSDK software can scale to support up to 64,000 encrypted tunnels.
Line-rate performance
DPUs can encrypt and decrypt IPsec packets at line rate without impacting the performance of other applications.
Cost efficiency
DPUs are a more cost-effective solution, using domain-specific encryption and network services accelerators rather than dedicated crypto engines and CPUs.
Multi-service support
The AMD Pensando DPU can support multi-service offerings. With a flow-based approach, it can combine encryption with additional networking and security functions (Packet Rewrite, SDN Policy Offload, Flow Offloads, NAT, Stateful Firewall, Observability, and Massive Control and Data plane scale). Alternatively, it can provide a policy-based VPN for stateless environments.
Flexible encryption
P4 programmability enables you to select what type of traffic should be encrypted and how the traffic is mapped to security associations (SAs) and IPsec tunnels.
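
As a rough illustration of the flexible-encryption point, the following sketch models in Python how traffic selectors might map packets to SAs. It is a conceptual model only, not P4 code and not the reference pipeline's actual tables or any SSDK API. The `Selector` class, the `lookup_sa` function, the first-match rule, and the SA identifiers are all hypothetical.

```python
import ipaddress
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Selector:
    """Hypothetical traffic selector: which packets map to which SA."""
    src: str                 # source prefix in CIDR notation
    dst: str                 # destination prefix in CIDR notation
    protocol: Optional[int]  # IP protocol number; None matches any protocol
    sa_id: int               # illustrative security association identifier


def lookup_sa(selectors: Sequence[Selector], src_ip: str, dst_ip: str,
              protocol: int) -> Optional[int]:
    """Return the SA id of the first matching selector, or None if no
    selector matches (in this model the packet is then left unencrypted)."""
    src = ipaddress.ip_address(src_ip)
    dst = ipaddress.ip_address(dst_ip)
    for sel in selectors:
        if (src in ipaddress.ip_network(sel.src)
                and dst in ipaddress.ip_network(sel.dst)
                and sel.protocol in (None, protocol)):
            return sel.sa_id
    return None


selectors = [
    Selector("10.1.0.0/16", "172.16.0.0/12", 6, 100),     # TCP uses SA 100
    Selector("10.1.0.0/16", "172.16.0.0/12", None, 200),  # everything else uses SA 200
]

print(lookup_sa(selectors, "10.1.2.3", "172.16.5.9", 6))     # 100
print(lookup_sa(selectors, "10.1.2.3", "172.16.5.9", 17))    # 200
print(lookup_sa(selectors, "192.168.1.1", "172.16.5.9", 6))  # None
```

Ordering the more specific selector first lets one class of traffic (here, TCP) use its own SA while the remaining traffic between the same prefixes falls through to a shared one.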

The IPsec_GW P4 reference pipeline is a robust method to enhance security without sacrificing network performance. The AMD Pensando DPU and P4 pipeline have achieved 100 Gb/s for a single IPsec tunnel and up to 260 Gb/s of bidirectional throughput for a single DPU. The IPsec_GW reference pipeline is a bump-in-the-wire (BITW) implementation ideal for a SmartSwitch or an appliance form factor, as shown in the following figure. When deployed inside a SmartSwitch as a top-of-rack device or in an appliance, the DPU accelerates IPsec services for any traffic that enters or leaves the device. A DPU with the IPsec_GW reference pipeline typically connects to a switching ASIC as a bump-in-the-wire: it takes unencrypted packets from the wire, encrypts them at line rate, and sends them back to the switching ASIC to be forwarded to their next hop or destination. To scale out performance, multiple DPUs can be connected to an ASIC. Alternatively, the BITW approach includes an appliance mode using an x86 system with DPUs deployed in PCIe® slots.

Figure 1. BITW Implementation for SmartSwitch Appliances

The IPsec pipeline can also be enhanced to support a host-to-network deployment, as shown in the following figure. This allows offloads to be performed on a per-compute-node basis.

Figure 2. Host to Network Implementation