Why Packet Switching?

Vitis Tutorials: AI Engine Development (XD100)

Document ID
XD100
Release Date
2026-03-27
Version
2025.2 English

You might wonder why the design uses the 1:4/4:1 packet switching scheme. It works around an AI Engine architecture limit on the number of simultaneous input and output AXI-Streams allowed per AI Engine column. The AI Engine array has 50 columns, and each column contains eight AI Engine tiles. Each column is allowed a maximum of six 32-bit AXI-Stream inputs and four 32-bit AXI-Stream outputs.

In the design, each nbody() kernel maps to one AI Engine tile. Without packet switching, each column of eight AI Engine tiles would need nine input streams and eight output streams, which exceeds both limits:

  • 8 w_input_i input streams

  • 1 w_input_j input stream

  • 8 w_output_i output streams
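
The stream counts above can be checked with a few lines of arithmetic. The following is a minimal sketch, not part of the tutorial sources. The constant and function names are hypothetical, and the limits are the ones quoted in this section.

```python
# Per-column limits quoted in this section (names are illustrative only).
MAX_INPUT_STREAMS = 6    # 32-bit AXI-Stream inputs allowed per column
MAX_OUTPUT_STREAMS = 4   # 32-bit AXI-Stream outputs allowed per column
TILES_PER_COLUMN = 8     # one nbody() kernel per tile


def streams_without_packet_switching(tiles=TILES_PER_COLUMN):
    """Return (inputs, outputs) needed by one column with dedicated streams."""
    inputs = tiles + 1   # one w_input_i per tile, plus one w_input_j
    outputs = tiles      # one w_output_i per tile
    return inputs, outputs


inputs, outputs = streams_without_packet_switching()
print(f"inputs:  {inputs} needed, {MAX_INPUT_STREAMS} allowed")
print(f"outputs: {outputs} needed, {MAX_OUTPUT_STREAMS} allowed")
```

Both counts exceed their limits, so a one-stream-per-port mapping cannot be placed in a single column.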

With the 1:4/4:1 packet switching scheme, you can combine four streams into one. Because packet switching is applied on the w_input_i ports, the number of input streams into a single AI Engine column is reduced to three:

  • 1 input_i stream that goes to tiles 0-3 in a column

  • 1 input_i stream that goes to tiles 4-7 in a column

  • 1 input_j stream that is broadcast to all the columns

On the output side, the number of output streams is reduced to two:

  • 1 output_i stream coming from tiles 0-3 in a column

  • 1 output_i stream coming from tiles 4-7 in a column
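
The reduction described above can be sketched the same way. This is an illustrative calculation only. The `ratio` parameter and the function names are assumptions introduced for the example, and the limits are the ones quoted in this section.

```python
import math

MAX_INPUT_STREAMS = 6    # per-column input limit quoted in this section
MAX_OUTPUT_STREAMS = 4   # per-column output limit quoted in this section


def streams_per_column(tiles=8, ratio=4):
    """Return (inputs, outputs) for one column when `ratio` tiles share a stream.

    ratio=1 models dedicated streams; ratio=4 models the 1:4/4:1 scheme.
    """
    groups = math.ceil(tiles / ratio)  # packet-switched streams per direction
    inputs = groups + 1                # input_i groups plus the input_j stream
    outputs = groups                   # output_i groups
    return inputs, outputs


def fits_limits(inputs, outputs):
    """Check the counts against the per-column stream limits."""
    return inputs <= MAX_INPUT_STREAMS and outputs <= MAX_OUTPUT_STREAMS


for ratio in (1, 4):
    n_in, n_out = streams_per_column(ratio=ratio)
    print(f"ratio 1:{ratio} -> {n_in} inputs, {n_out} outputs, "
          f"fits: {fits_limits(n_in, n_out)}")
```

With a ratio of 4, the column needs three input streams and two output streams, which matches the lists above and stays within both limits.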