Introduction - 2023.1 English

Vitis Tutorials: Hardware Acceleration (XD099)

Aurora 64B/66B is a lightweight serial communications protocol for multi-gigabit links. On AMD Alveo™ cards, the Aurora IP uses GT transceivers (such as GTY) for high-speed data transfer. Each Alveo accelerator card has one or two QSFP28 ports, which connect to the GT transceivers of the FPGA. You can integrate the Aurora IP into an Alveo card design to implement full-duplex communication between cards at up to 100 Gbps data throughput on each QSFP28 port. The Aurora IP provides standard AXI Stream ports to the user application for data transfer, and with the AMD Vitis™ flow, you can easily integrate the Aurora IP into an acceleration design based on a Vitis target platform.

The following block diagram shows the Aurora 64B/66B communication channel.

Aurora Channel

For details on the Aurora 64B/66B protocol, refer to the Aurora 64B/66B Protocol Specification.
For details on the Aurora 64B/66B IP, refer to the Aurora 64B/66B IP Product Guide.

This tutorial provides an example design and step-by-step instructions for integrating the Aurora IP into Alveo accelerator cards with the Vitis flow. The example design integrates a four-lane Aurora kernel with a 10 Gbps lane rate, for a total throughput of 40 Gbps. The design steps in this tutorial cover Aurora IP generation, a reference RTL top module for the Aurora IP, example test system integration, and an example x86 host program. The following hardware block diagram shows the example design.

Block Diagram
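
As a side note on the 40 Gbps figure quoted above, the arithmetic can be sketched in a few lines of C++. This is an illustrative calculation only, assuming the 40 Gbps figure refers to the aggregate serial line rate (lanes times lane rate). The function names are hypothetical and do not come from the tutorial sources. The 64/66 factor reflects the 64B/66B block encoding, in which each 66-bit block carries 64 payload bits. Any further protocol overhead is not modeled here.

```cpp
#include <cassert>

// Aggregate serial line rate: number of lanes times the per-lane rate.
// (Hypothetical helper, not part of the tutorial sources.)
double aggregate_line_rate_gbps(int lanes, double lane_rate_gbps) {
    assert(lanes > 0 && lane_rate_gbps > 0.0);
    return lanes * lane_rate_gbps;
}

// Upper bound on the payload rate after 64B/66B encoding:
// each 66-bit block on the wire carries 64 bits of payload.
// Other protocol overhead is deliberately ignored in this sketch.
double payload_upper_bound_gbps(int lanes, double lane_rate_gbps) {
    return aggregate_line_rate_gbps(lanes, lane_rate_gbps) * 64.0 / 66.0;
}
```

For the example design, `aggregate_line_rate_gbps(4, 10.0)` gives 40 Gbps, and `payload_upper_bound_gbps(4, 10.0)` gives roughly 38.8 Gbps.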

There are three kernels in the hardware design:

  • krnl_aurora: This is an RTL kernel. krnl_aurora instantiates the Aurora core IP, an AXIS data FIFO for data transmit, an AXIS data FIFO for data receive, and an AXI control slave for Aurora IP status read back.

  • strm_issue: This is an HLS kernel that implements a simple AXI master to AXI stream bridge for data transmit. It reads data from the on-board global memory and sends it to the Aurora core.

  • strm_dump: This is an HLS kernel that implements a simple AXI stream to AXI master bridge for data receive. It receives data from the Aurora core and writes it to the on-board global memory.
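
The roles of the two bridge kernels can be illustrated with a small software model. This is a sketch in plain C++, not the HLS source in `hls/strm_issue.cpp` or `hls/strm_dump.cpp`. It assumes a `std::queue` as a stand-in for the AXI stream and a `std::vector` as a stand-in for global memory. The `Word` type, its 64-bit width, and the function names are all hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <queue>
#include <vector>

using Word = std::uint64_t;       // hypothetical word width, chosen for illustration
using Stream = std::queue<Word>;  // software stand-in for an AXI stream

// Model of the strm_issue role: read words from "global memory"
// and push them onto the transmit stream in order.
void strm_issue_model(const std::vector<Word>& mem, Stream& tx) {
    for (Word w : mem) {
        tx.push(w);
    }
}

// Model of the strm_dump role: pop up to `count` words from the receive
// stream and write them into a pre-allocated "global memory" buffer.
// Returns the number of words actually written.
std::size_t strm_dump_model(Stream& rx, std::vector<Word>& mem, std::size_t count) {
    assert(mem.size() >= count);  // destination buffer must be large enough
    std::size_t written = 0;
    while (written < count && !rx.empty()) {
        mem[written++] = rx.front();
        rx.pop();
    }
    return written;
}
```

Passing the same `Stream` object to both functions mimics the loopback path: whatever `strm_issue_model` pushes, `strm_dump_model` reads back in the same order.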

In the example design, the host transfers block data into the on-board global memory, loads the data to the Aurora core, and then stores the loopback data into the on-board global memory for an integrity check. To run the design on real hardware, you need a 40 Gbps QSFP+ (0 dB, 0 W) loopback module inserted in the QSFP port of the Alveo card. If your Alveo card has two QSFP ports, insert the module into QSFP 0. The loopback module is shown in the following photo.
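
The integrity check described above amounts to comparing the transmitted buffer with the looped-back buffer. The following is a minimal sketch of that comparison, assuming byte buffers and an incrementing test pattern. The function names and the pattern are hypothetical, and the actual logic in `host/host_krnl_aurora_test.cpp` may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Fill a buffer with a deterministic pattern.
// (Hypothetical choice: incrementing bytes that wrap at 256.)
std::vector<std::uint8_t> make_pattern(std::size_t n) {
    std::vector<std::uint8_t> buf(n);
    for (std::size_t i = 0; i < n; ++i) {
        buf[i] = static_cast<std::uint8_t>(i & 0xFF);
    }
    return buf;
}

// Compare transmitted and received buffers of equal size.
// Returns the index of the first mismatching byte, or -1 if they are identical.
long first_mismatch(const std::vector<std::uint8_t>& tx,
                    const std::vector<std::uint8_t>& rx) {
    assert(tx.size() == rx.size());
    for (std::size_t i = 0; i < tx.size(); ++i) {
        if (tx[i] != rx[i]) {
            return static_cast<long>(i);
        }
    }
    return -1;
}
```

Reporting the first mismatching index, rather than only pass or fail, makes it easier to tell a link that drops data from one that corrupts it.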

The design supports Ubuntu 18.04/20.04 and Red Hat/CentOS 7/8 systems, and is supported with the following XRT and target platform versions:

  • Alveo U200: xilinx_u200_gen3x16_xdma_2_202110_1

  • Alveo U250: xilinx_u250_gen3x16_xdma_4_1_202210_1

  • Alveo U280: xilinx_u280_gen3x16_xdma_1_202211_1

  • Alveo U50: xilinx_u50_gen3x16_xdma_5_202210_1

  • Alveo U55C: xilinx_u55c_gen3x16_xdma_3_202210_1

All flows in the example design run from the command line, using a Makefile and Tcl scripts. Some steps in this tutorial also use GUI operations for clearer explanation. The design directory contains the following files:

├── hls
│   ├── strm_dump.cpp                   # HLS C source code for strm_dump kernel
│   └── strm_issue.cpp                  # HLS C source code for strm_issue kernel
├── host
│   └── host_krnl_aurora_test.cpp       # x86 host program
├── krnl_aurora_test.cfg                # Vitis link configuration file
├── Makefile                            # Makefile for full flow
├── rtl
│   ├── krnl_aurora_control_s_axi.v     # Verilog source code for AXI control slave module
│   └── krnl_aurora.v                   # Verilog source code for top level of krnl_aurora
├── tcl
│   ├── gen_aurora_ip.tcl               # Tcl script to generate Aurora IP
│   ├── gen_fifo_ip.tcl                 # Tcl script to generate AXI stream data FIFO
│   └── pack_kernel.tcl                 # Tcl script to package the RTL kernel krnl_aurora
└── xdc
    └── aurora_64b66b_0.xdc             # additional XDC file for krnl_aurora

Note: On Red Hat/CentOS 7, the default installed GCC version is 4.x.x. Use the following commands to install and switch to GCC 7 before compiling the x86 host program.

sudo yum install centos-release-scl
sudo yum install devtoolset-7-gcc-c++
scl enable devtoolset-7 bash