Introduction - 2022.2 English

Vitis Tutorials: Hardware Acceleration (XD099)

Document ID
XD099
Release Date
2022-12-01
Version
2022.2 English

Aurora 64B/66B is a lightweight, serial communications protocol for multi-gigabit links. On Alveo card, Aurora IP utilize GT transceiver (such as GTY) to realize the high speed data transfer. Each kind of Alveo accelerator cards have one or two QSFP28 ports, which connects to the GT transceiver of the FPGA. The user can integrate Aurora IP into Alveo card to implement full-duplex communication between cards at up to 100 Gbps data throughput on each QSFP28 port. The Aurora IP provides standard AXI Stream ports to user application for data transfer, and with Vitis flow, the users can easily integrate the Aurora IP into their acceleration design based on the Vitis target platform.

Following is the block diagram of Aurora 64B/66B communication channel.

Aurora Channel

For details on Aurora 64B/66B protocol please refer to Aurora 64B/66B Protocol Specification.
For details on Aurora 64B/66B IP please refer to Aurora 64B/66B IP Product Guide.

This tutorial will provide an example design and step-by-step instruction for integrating Aurora IP into Alveo accelerator cards with Vitis flow. The example design integrates a four-lane Aurora kernel with 10 Gbps lane rate (achieve total 40 Gbps throughput). The complete design steps in this tutorial includes: Aurora IP generation, reference RTL top module for Aurora IP, example test system integration, and example x86 host program. Following is the hardware block diagram of the example design.

Block Diagram

There are three kernels in the hardware design:

  • krnl_aurora: this is a RTL kernel. krnl_aurora instantiates the Aurora core IP, an AXIS data FIFO for data transmit, an AXIS data FIFO for data receive, and an AXI control slave for Aurora IP status read back.

  • strm_issue: this is a HLS kernel, which implements a simple AXI master to AXI stream bridge for data transmit. It read the data from on-board global memory and send them to the Aurora core.

  • strm_dump: this is a HLS kernel, which implements a simple AXI stream to AXI master bridge for data receive. It receive the data from Aurora core and write them to the on-board global memory.

In the example design, host transfers block data into the on-board global memory, loads the data to Aurora core, and then stores the loopback data into on-board global memory for integrity check. To run the real hardware test of the design, you will need a 40 Gbps QSFP+ (0dB, 0W) loopback module inserted in the QSFP port of the Alveo cards. In case your Alveo card has two QSFP ports, please insert the module into QSFP 0. The loopback module looks like below photo.