AI Engine to PL Interface AXI Protocol - 2024.1 English

AI Engine Kernel and Graph Programming Guide (UG1079)

Document ID: UG1079
Release Date: 2024-06-05
Version: 2024.1 English

AI Engine to PL AXI4-Stream interfaces support a subset of the AXI4-Stream protocol (https://developer.arm.com/documentation/ihi0051/a/Introduction/About-the-AXI4-Stream-protocol). Internally, each AI Engine to PL AXI4-Stream interface is physically a 64-bit channel, and a 128-bit interface occupies two adjacent physical channels. This imposes additional requirements for sending data between the AI Engine and the PL in a timely manner. This section focuses on the TLAST and TKEEP requirements that these interfaces must meet to send data without stalls.

TLAST is required for a 64-bit stream between the AI Engine and the PL if single 32-bit words are sent. The AI Engine compiler automatically up-sizes 32-bit AI Engine to PL stream interfaces to 64-bit interfaces internally. When 32-bit stream data is sent in either direction between the PL and the AI Engine, a single 32-bit word without TLAST is held in the interface until a second 32-bit word arrives to complete the 64-bit up-sizing. The solution is to assert TLAST with the single 32-bit word, so that the data is pushed into the AI Engine without a stall.
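The holding behavior described above can be illustrated with a small host-side model. This is a minimal sketch under stated assumptions, not the actual hardware or compiler logic: the names `Beat64`, `Upsizer32To64`, and `final_word_stalls` are hypothetical, and the model only captures the rule that a 32-bit word waits for a partner unless TLAST accompanies it.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// One 64-bit beat as forwarded by the modelled up-sizer (hypothetical type).
struct Beat64 {
    uint32_t lo;           // first 32-bit word (lower half)
    uint32_t hi;           // second 32-bit word (upper half), 0 if absent
    unsigned valid_words;  // 2 for a full beat, 1 for a lone word sent with TLAST
    bool     tlast;
};

// Simplified behavioural model (an assumption for illustration only).
class Upsizer32To64 {
public:
    void push(uint32_t word, bool tlast) {
        if (has_pending_) {
            // A partner arrived: the pair forms a full 64-bit beat.
            out_.push_back(Beat64{pending_, word, 2u, tlast});
            has_pending_ = false;
        } else if (tlast) {
            // A lone word with TLAST is forwarded immediately.
            out_.push_back(Beat64{word, 0u, 1u, true});
        } else {
            // A lone word without TLAST is held until a partner arrives.
            pending_ = word;
            has_pending_ = true;
        }
    }
    bool holding_word() const { return has_pending_; }
    const std::vector<Beat64>& forwarded() const { return out_; }

private:
    uint32_t pending_ = 0;
    bool has_pending_ = false;
    std::vector<Beat64> out_;
};

// Feeds n words (values 0..n-1) and reports whether the final word is left held.
bool final_word_stalls(unsigned n, bool tlast_on_final) {
    assert(n > 0 && "need at least one word");
    Upsizer32To64 u;
    for (unsigned i = 0; i < n; ++i) {
        u.push(i, tlast_on_final && (i + 1 == n));
    }
    return u.holding_word();
}
```

In this model, `final_word_stalls(3, false)` reports a held word, while `final_word_stalls(3, true)` does not, which mirrors the requirement to assert TLAST on a lone 32-bit word.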

When using 64-bit and 128-bit interfaces, it is valid to send data without TLAST. In that case, an even number of 32-bit words is sent without stalls. TLAST can be used as needed. When TLAST is asserted, TKEEP can be used together with it to send an arbitrary number of 32-bit words on 64-bit and 128-bit interfaces. TKEEP must be set correctly: either -1 (all bits are 1) or a value that enables only some of the 32-bit words, for example 0x0F. If only part of the data is valid, the valid words must occupy the lower part of the data.
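Because TKEEP carries one bit per byte, the mask for a given number of valid low-order 32-bit words can be computed directly. The helper below is a minimal sketch; `tkeep_for_words` is a hypothetical name, and it assumes TKEEP is enabled in whole 32-bit-word (4-byte) units, as the 0x0F example suggests.

```cpp
#include <cassert>
#include <cstdint>

// Returns the TKEEP mask that enables the lowest `valid_words` 32-bit words
// of an interface that is `width_bits` wide (hypothetical helper).
uint32_t tkeep_for_words(unsigned width_bits, unsigned valid_words) {
    assert(width_bits == 32 || width_bits == 64 || width_bits == 128);
    const unsigned total_words = width_bits / 32;
    assert(valid_words >= 1 && valid_words <= total_words);
    const unsigned keep_bits = 4 * valid_words;  // 4 TKEEP bits per 32-bit word
    return (1u << keep_bits) - 1u;               // keep_bits <= 16, so no overflow
}
```

For example, `tkeep_for_words(64, 1)` yields 0x0F, and `tkeep_for_words(64, 2)` yields 0xFF, which is the all-ones (-1) case for a 64-bit interface.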

The following table summarizes TLAST and TKEEP usage to send an odd number of 32-bit words without stall for AI Engine to PL AXI4-Stream interfaces:
Table 1. TLAST and TKEEP Usage for Sending Odd Number of 32-bit Words to AI Engine
| Interface Bit Width | TLAST | TKEEP  | Note                                             |
|---------------------|-------|--------|--------------------------------------------------|
| 32-bit              | 1     | -1     | 32-bit word to send without stall.               |
| 64-bit              | 1     | 0x0F   | Lowest 32-bit word to send without stall.        |
| 128-bit             | 1     | 0x000F | Lowest 32-bit word to send without stall.        |
| 128-bit             | 1     | 0x00FF | Lowest two 32-bit words to send without stall.   |
| 128-bit             | 1     | 0x0FFF | Lowest three 32-bit words to send without stall. |
Note: It is not valid to send only the highest part of the data from the PL to the AI Engine. For example, setting TLAST=1 and TKEEP=0xF0 to send only the highest 32-bit word is not valid.
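The rule in the note can be expressed as a check that the enabled TKEEP bits form a contiguous run of whole 32-bit words starting at bit 0. The function below is a minimal sketch; `tkeep_is_valid` is a hypothetical name, and rejecting byte-level partial words (such as 0x07) is an assumption based on the word-granular values used in this section.

```cpp
#include <cassert>
#include <cstdint>

// Returns true if `tkeep` enables 1..N whole 32-bit words starting from the
// lowest word of a `width_bits`-wide interface (hypothetical helper).
bool tkeep_is_valid(unsigned width_bits, uint32_t tkeep) {
    assert(width_bits == 32 || width_bits == 64 || width_bits == 128);
    const unsigned total_words = width_bits / 32;
    for (unsigned n = 1; n <= total_words; ++n) {
        const uint32_t mask = (1u << (4 * n)) - 1u;  // lowest n words enabled
        if (tkeep == mask) {
            return true;
        }
    }
    return false;  // empty, upper-only, or non-contiguous masks are rejected
}
```

With this check, `tkeep_is_valid(64, 0x0F)` passes while `tkeep_is_valid(64, 0xF0)` fails, matching the invalid case called out in the note.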
When an odd number of 32-bit words is sent from the AI Engine to the PL, the last word also stalls unless TLAST is asserted, even with a 32-bit AI Engine to PL interface. To make sure the last word is pushed out to the PL, assert TLAST inside the AI Engine kernel:
writeincr(out, value, true); // "true" asserts TLAST
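A common pattern is to assert TLAST only on the final word of a block, so that an odd-length block is still flushed. The sketch below runs on a host without the AI Engine toolchain. `MockOutStream`, `mock_writeincr`, and `send_block` are hypothetical stand-ins that only record what a kernel would write; they are not part of the AI Engine API. In a real kernel, the loop body would call `writeincr(out, value, is_last)` on the kernel's output stream instead.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Host-side stand-in for an output stream (NOT the AI Engine API):
// it records each written value together with its TLAST flag.
struct MockOutStream {
    std::vector<int32_t> data;
    std::vector<bool>    tlast;
};

// Mirrors the three-argument call shape writeincr(out, value, tlast).
void mock_writeincr(MockOutStream* out, int32_t value, bool tlast = false) {
    out->data.push_back(value);
    out->tlast.push_back(tlast);
}

// Writes `count` words and asserts TLAST only on the final one.
void send_block(MockOutStream* out, const int32_t* values, unsigned count) {
    assert(count > 0);
    for (unsigned i = 0; i < count; ++i) {
        const bool is_last = (i + 1 == count);
        mock_writeincr(out, values[i], is_last);
    }
}
```

Calling `send_block` with three values records TLAST as false, false, true, so the third (odd) word carries the flag that lets it leave the interface without waiting for a partner.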