AI Engine Resource Management - 2024.1 English

AI Engine System Software Driver Reference Manual (UG1642)


The AI Engine driver provides resource management. ADF or XRT uses the AI Engine driver to request and release the resources it requires. The AI Engine driver also manages resources internally for its own APIs, such as error broadcast network setup and ECC performance counter setup.

ADF or XRT does not call the AI Engine driver request and release APIs directly. Instead, it uses the FAL, and the FAL calls the AI Engine driver APIs. On Linux, the AI Engine partition kernel driver also performs resource management: because multiple processes can access the same partition, the kernel driver must track which resources are available. The following diagram shows resource management on Linux.

Figure 1. Linux Runtime Resource Management

The AI Engine compiler calls the AI Engine driver and FAL APIs to request resources during CDO generation. The resources reserved during compilation need to be passed on to the runtime. The AI Engine driver provides APIs to dump all reserved resources as raw binary data into a file specified by the AI Engine compiler. The AI Engine compiler packages this resource binary file into libadf.a so that it is included in the xclbin. At runtime, the zocl kernel driver passes the resource binary file to the AI Engine device kernel driver. In non-XRT cases, such as bare-metal, the AI Engine driver provides a separate function for passing the resource binary data to the driver. Resources allocated at compilation time are not released at runtime.

Figure 2. Resource Reservation During Compilation Time
Figure 3. Resource Reservation at Compilation Passed onto Runtime in Linux XRT Case
Figure 4. Resource Reservation at Compilation Passed onto Runtime in Bare-metal Case
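
The hand-off of compile-time reservations to the runtime can be pictured as a serialize and reload round trip. The following is a minimal sketch only. The record layout (a count header followed by column, row, and a 32-bit bitmap per tile) and the function names `dump_reserved` and `load_reserved` are assumptions made for illustration. They do not describe the actual binary format or the AI Engine driver API.

```python
import struct

# Hypothetical layout, NOT the real driver format:
#   header: little-endian uint32 record count
#   record: uint8 column, uint8 row, uint32 reservation bitmap
_HEADER = struct.Struct("<I")
_RECORD = struct.Struct("<BBI")


def dump_reserved(reserved):
    """Serialize {(col, row): bitmap} into raw bytes (compile-time side)."""
    out = bytearray(_HEADER.pack(len(reserved)))
    for (col, row), bitmap in sorted(reserved.items()):
        out += _RECORD.pack(col, row, bitmap)
    return bytes(out)


def load_reserved(blob):
    """Rebuild {(col, row): bitmap} from raw bytes (runtime side)."""
    (count,) = _HEADER.unpack_from(blob, 0)
    offset = _HEADER.size
    reserved = {}
    for _ in range(count):
        col, row, bitmap = _RECORD.unpack_from(blob, offset)
        reserved[(col, row)] = bitmap
        offset += _RECORD.size
    return reserved
```

In this sketch, the compiler side would write the bytes returned by `dump_reserved` to the resource file, and the runtime side would call `load_reserved` on the same bytes before any runtime allocation takes place.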

At runtime, if the application wants to use reserved resources, it must specify those resources to the FAL class during object construction. FAL checks with the AI Engine driver to confirm whether each resource was already reserved during compilation. To use a resource reserved during compilation, the user application must reserve that specific resource at runtime. Once a user application reserves it, no other user application can use it. When finished, the user application can either free the resource and keep it statically reserved, or release it so that it becomes available for other resource allocations.
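
The difference between freeing and releasing a compile-time reservation can be sketched with two sets: one for static (compile-time) reservations and one for resources currently in use. This is an illustrative model under stated assumptions. The class `TileResources` and its method names are hypothetical and are not part of FAL or the AI Engine driver.

```python
from typing import Iterable, Optional, Set


class TileResources:
    """Hypothetical model of one resource type on one tile."""

    def __init__(self, count: int, static_reserved: Iterable[int] = ()) -> None:
        self.count = count
        self.static: Set[int] = set(static_reserved)  # reserved at compile time
        self.in_use: Set[int] = set()                 # claimed at runtime

    def request_any(self) -> Optional[int]:
        """Allocate any resource that is neither in use nor statically reserved."""
        for index in range(self.count):
            if index not in self.static and index not in self.in_use:
                self.in_use.add(index)
                return index
        return None

    def request_specific(self, index: int) -> bool:
        """Claim one named resource; this is how a static reservation is used."""
        if not 0 <= index < self.count or index in self.in_use:
            return False
        self.in_use.add(index)
        return True

    def free(self, index: int) -> None:
        """Stop using the resource but keep any static reservation."""
        self.in_use.discard(index)

    def release(self, index: int) -> None:
        """Stop using the resource and drop its static reservation."""
        self.in_use.discard(index)
        self.static.discard(index)
```

In this model, a resource that has been freed remains invisible to `request_any` because it is still statically reserved, while a released resource returns to the general pool.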

As of 2023.2, XRT or aiecompiler interfaces with the functional abstraction layer (FAL) to request and reserve resources within the AI Engine. The FAL, in turn, communicates with the resource manager component of the user space driver to ascertain resource availability. This resource manager maintains bitmaps representing the status of all resources in the AI Engine array. On Linux-based systems, the user space driver interacts with the Linux kernel driver to access these bitmaps.
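
A bitmap-based resource manager of the kind described above can be sketched as follows, with one bit per resource and a set bit meaning the resource is taken. The class `ResourceBitmap` and its methods are hypothetical names chosen for illustration, and the sketch makes no claim about how the driver lays out or sizes its bitmaps.

```python
from typing import Optional


class ResourceBitmap:
    """Tracks `count` identical resources, one bit each (hypothetical sketch)."""

    def __init__(self, count: int) -> None:
        self.count = count
        self.bits = 0  # bit i set means resource i is reserved

    def is_reserved(self, index: int) -> bool:
        """Return True if resource `index` is currently reserved."""
        return bool((self.bits >> index) & 1)

    def request(self) -> Optional[int]:
        """Reserve the lowest free resource; return its index, or None if full."""
        for index in range(self.count):
            if not self.is_reserved(index):
                self.bits |= 1 << index
                return index
        return None

    def release(self, index: int) -> None:
        """Clear the bit for `index` so the resource can be reserved again."""
        if not 0 <= index < self.count:
            raise ValueError("resource index out of range")
        self.bits &= ~(1 << index)
```

Checking availability is a single bit test, which is why a bitmap is a compact way to represent the status of a large number of resources across an array.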

As of 2024.1, the software architecture optimizes resource management within the AI Engine driver. The previous approach relied on the user space driver to maintain the resource manager bitmaps, which posed challenges in terms of memory usage and dynamic memory allocation at runtime. To address these issues, the new implementation relocates the resource manager functionality from the user space driver to the FAL.