SAR BP Engine Design Proposal - 2025.1 English - XD100

Vitis Tutorials: AI Engine Development (XD100)

Document ID: XD100
Release Date: 2025-08-25
Version: 2025.1 English

Early prototyping of the various workloads identifies some common trends and clear conclusions:

  • The IFFT prototype, built at the 4K-pt size rather than the reduced-footprint 2K-pt size, achieves a throughput of 100 Msps. This comfortably exceeds the 8 Msps requirement above, which assumes a 1G BP OPs/sec throughput. The conclusion carries over safely to the 2K-pt size identified in the system modeling phase.

  • Most of the remaining compute workloads that operate pixel by pixel achieve a throughput in the mid-400 Msps range, and a few kernels achieve two to three times that. One straightforward approach is therefore to design a SAR BP engine capable of ~400 Msps throughput based on these early prototype kernel architectures. Note that the fmod_floor() prototype kernel does not meet this 400 Msps target. Its implementation must either be optimized through code refactoring and improved software pipelining or be split across multiple instances. These details are left for the design phase.

  • The memory footprint of these kernels is large across the board. Each kernel typically requires a single compute tile but consumes the memory of 2 to 4 tiles.

  • Based on this early estimate of 31 tiles and the expectation that the smaller ifft2k() kernel will free up 5 or 6 tiles, it makes sense, given the array geometry, to provision a $4\times 8$ engine configuration with 32 tiles, which provides some resource margin.
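
The arithmetic behind these conclusions can be checked with a short script. This is a minimal sketch, not part of the tutorial sources: the constants are the figures quoted in the bullets above, the function names are hypothetical, and the 31-tile estimate is assumed to be the count before the ifft2k() savings are applied. The fmod_floor() throughput is not stated here, so the 150 Msps value in the usage lines is purely illustrative.

```python
import math

# Figures quoted in the conclusions above.
IFFT_MSPS = 100.0            # prototyped 4K-pt IFFT throughput
IFFT_REQUIRED_MSPS = 8.0     # requirement, assuming 1G BP OPs/sec
TARGET_MSPS = 400.0          # proposed engine throughput target
ESTIMATED_TILES = 31         # early tile estimate (assumed pre-ifft2k savings)
IFFT2K_SAVINGS = (5, 6)      # tiles the ifft2k() kernel is expected to free
ARRAY_ROWS, ARRAY_COLS = 4, 8


def ifft_margin():
    """Ratio of prototyped IFFT throughput to the required throughput."""
    return IFFT_MSPS / IFFT_REQUIRED_MSPS


def instances_needed(kernel_msps, target_msps=TARGET_MSPS):
    """Parallel kernel instances needed to reach the target throughput."""
    if kernel_msps <= 0:
        raise ValueError("kernel throughput must be positive")
    return math.ceil(target_msps / kernel_msps)


def tile_margin():
    """Spare tiles in the 4x8 array for each expected ifft2k() saving."""
    provisioned = ARRAY_ROWS * ARRAY_COLS
    return [provisioned - (ESTIMATED_TILES - saved) for saved in IFFT2K_SAVINGS]


if __name__ == "__main__":
    print(ifft_margin())            # 12.5
    print(tile_margin())            # [6, 7]
    # 150.0 Msps is an illustrative value, not a measured fmod_floor() figure.
    print(instances_needed(150.0))  # 3
```

Under these assumptions the IFFT has a 12.5x throughput margin, and the $4\times 8$ configuration leaves 6 or 7 spare tiles once the ifft2k() savings are realized.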