Data Communication via AI Engine Data Memory

AI Engine Kernel and Graph Programming Guide (UG1079)

Document ID: UG1079
Release Date: 2025-11-26
Version: 2025.2 English

When multiple kernels fit in a single AI Engine, consecutive kernels can communicate through a shared buffer in either of the following:

  • The AI Engine’s local data memory, or
  • One of the three neighboring memories the AI Engine can access directly.

In this case, only a single buffer is needed because the kernels execute one after another in a round-robin fashion.

When kernels are in separate but neighboring AI Engines, they can communicate through ping-pong buffers in the data memory module shared between the two neighboring AI Engine tiles. The ping and pong buffers can be placed in separate memory banks to avoid access conflicts. Synchronization is handled by locks: the locks associated with the input and output buffers ensure that each buffer is ready before the AI Engine kernel reads or writes it. This type of communication saves routing resources and eliminates data transfer latency because the DMA and AXI4-Stream interconnect are not required.