Interfaces and Address Mapping - AM026

Versal AI Edge Series Gen 2 and Prime Series Gen 2 Technical Reference Manual (AM026)

Document ID
AM026
Release Date
2025-12-23
Revision
1.3 English
AXI4-Lite Interfaces
Three AXI4-Lite interfaces (AXI-A, AXI-B, and AXI-C) provide access to the GPU, each with a dedicated address range for efficient routing. This separation allows independent priority handling across graphics, compute, and management functions.
  • AXI-A: Primary interface for core GPU functions.
  • AXI-B: Secondary interface for additional workload balancing.
  • AXI-C: Provides specialized access for control or secondary compute tasks.
PS high-speed connectivity
The PS high-speed connectivity peripherals, including the GPU, share a dedicated 16 MB address space, with the first 8 MB allocated to GPU operations. Within this 8 MB:
  • SLCR register space: A dedicated 64 KB region for the System Level Control Registers (SLCR), ensuring quick access to control and status registers.
  • Reserved areas: Memory regions reserved for potential future expansions or software-defined configurations, allowing flexibility in safety and performance scaling.
ACE-Lite Interface
  • Data Coherency: ACE-Lite supports cache coherency between the GPU and the processing subsystem, allowing data consistency in shared memory regions. This helps prevent redundant data transfers and ensures the GPU can access the most recent data modified by the CPU, ideal for compute-heavy applications that require synchronized data between the CPU and GPU.
  • Efficiency: Unlike the full ACE protocol, ACE-Lite has reduced complexity, supporting a subset of coherency operations. This lighter implementation lowers latency and minimizes bandwidth usage, making it suitable for embedded systems and safety-critical environments where efficiency is crucial.
  • Integration with AXI interface: Alongside the dedicated AXI-A, AXI-B, and AXI-C buses, ACE-Lite enables seamless data exchanges with the CPU without impacting the isolated nature of GPU partitions. It also optimizes memory access patterns and cache updates for data-intensive workloads, enhancing real-time processing and system responsiveness.
    Tip:

    ACE-Lite is a protocol for one-directional data coherency. When the GPU reads data written by the CPU, the GPU driver must invalidate the GPU cache before the CPU can read and write the next set of data. The GPU therefore never needs to flush its cache.
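The one-way coherency rule in the tip can be modeled as a small state machine: a CPU write makes any cached GPU copy stale, and the driver's invalidate restores coherent reads. The types and function names below are illustrative stand-ins, not a real GPU driver API.

```c
#include <stdbool.h>

/* Toy model of ACE-Lite one-way (IO) coherency. A CPU write makes
 * GPU-cached lines stale; the driver invalidates before the GPU reads.
 * Note the GPU side only ever invalidates; it never flushes (cleans). */
typedef struct {
    bool gpu_cache_valid;  /* false = GPU cache may hold stale lines */
} shared_buffer;

/* CPU produces new data: any cached GPU copy is now stale. */
static void cpu_write(shared_buffer *buf) {
    buf->gpu_cache_valid = false;
}

/* GPU driver invalidates the range so the next GPU read fetches fresh data. */
static void gpu_invalidate_range(shared_buffer *buf) {
    buf->gpu_cache_valid = true;
}

static bool gpu_read_is_coherent(const shared_buffer *buf) {
    return buf->gpu_cache_valid;
}
```

Skipping the invalidate after a CPU write would leave `gpu_read_is_coherent` false, i.e., the GPU could observe stale data.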

Use Cases in Safety-Critical Applications:

The ACE-Lite interface is particularly beneficial in applications such as automotive and industrial IoT, where the GPU needs to process large data streams (like sensor data or real-time video feeds) alongside the CPU. By ensuring efficient, coherent data sharing, ACE-Lite supports low-latency processing without compromising the GPU’s partitioning capabilities, crucial for applications where system integrity and quick data access are paramount.

The GPU is allocated an 8 MB section within the 16 MB PS high-speed connectivity address region.

Table 1. GPU IP Address Map (Relative to Assigned Base Address)

Address Range           | Name     | Interface | Size    | Description
0x00000000 – 0x002FFFFF | GPU_A    | AXI-A     | 3072 KB | Access for non-critical tasks, typically assigned to quality-managed clusters (e.g., Android VMs).
0x00300000 – 0x003FFFFF | RESERVED | -         | 1024 KB | Reserved for future use.
0x00400000 – 0x006FFFFF | GPU_B    | AXI-B     | 3072 KB | Dedicated to safety-critical workloads, ensuring isolated and secure processing.
0x00700000 – 0x0071FFFF | GPU_C    | AXI-C     | 128 KB  | Allocated to safety islands responsible for resource allocation and error handling.
0x00720000 – 0x0072FFFF | GPU_SLCR | -         | 64 KB   | SLCR register space, providing control registers for GPU configuration.
0x00730000 – 0x007FFFFF | RESERVED | -         | 832 KB  | Reserved for future expansions and updates.
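The regions in Table 1 can be captured as C constants and sanity-checked against each other: the six regions are contiguous and tile the 8 MB GPU window exactly. The macro names are illustrative; the TRM defines only the addresses, not C identifiers.

```c
#include <stdint.h>

/* Offsets from Table 1, relative to the GPU's assigned base address.
 * Names are placeholders for illustration. */
#define GPU_A_BASE     0x00000000u              /* AXI-A, 3072 KB */
#define GPU_A_SIZE     (3072u * 1024u)
#define GPU_RSVD0_BASE 0x00300000u              /* reserved, 1024 KB */
#define GPU_RSVD0_SIZE (1024u * 1024u)
#define GPU_B_BASE     0x00400000u              /* AXI-B, 3072 KB */
#define GPU_B_SIZE     (3072u * 1024u)
#define GPU_C_BASE     0x00700000u              /* AXI-C, 128 KB */
#define GPU_C_SIZE     (128u * 1024u)
#define GPU_SLCR_BASE  0x00720000u              /* SLCR, 64 KB */
#define GPU_SLCR_SIZE  (64u * 1024u)
#define GPU_RSVD1_BASE 0x00730000u              /* reserved, 832 KB */
#define GPU_RSVD1_SIZE (832u * 1024u)

/* Sum of all six regions; should equal the 8 MB GPU window. */
static uint32_t gpu_window_size(void) {
    return GPU_A_SIZE + GPU_RSVD0_SIZE + GPU_B_SIZE +
           GPU_C_SIZE + GPU_SLCR_SIZE + GPU_RSVD1_SIZE;
}
```

Checking adjacency this way (each base equals the previous base plus its size) catches transcription errors when porting the map into a header file.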

Address Map Key Details

GPU_A (AXI-A): This region (0x00000000 – 0x002FFFFF) is allocated for general-purpose tasks, often non-critical applications, such as quality-managed software and virtual machines running on Android. AXI-A allows isolation of these less critical workloads.

GPU_B (AXI-B): Dedicated to critical applications (0x00400000 – 0x006FFFFF), AXI-B ensures that safety-critical operations have a dedicated address range, essential for secure and reliable execution.

GPU_C (AXI-C): The address range (0x00700000 – 0x0071FFFF) allocated to AXI-C provides access to safety islands, which manage resource allocation and error-handling mechanisms, further enhancing system integrity and operational resilience.

GPU_SLCR: This 64 KB address space (0x00720000 – 0x0072FFFF) is the SLCR register space, containing control and configuration registers that are critical for setting up and managing GPU operation modes.
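A driver would typically reach SLCR registers through a pointer computed from the GPU base plus the window offset. The sketch below shows only the address arithmetic; the register offset and any field layout are placeholders, since this excerpt defines the 64 KB window but not individual registers.

```c
#include <stdint.h>

/* Offset of the SLCR window within the GPU's 8 MB region (Table 1). */
#define GPU_SLCR_OFFSET 0x00720000u

/* Compute a pointer to a 32-bit SLCR register. gpu_base is the GPU's
 * assigned base address; reg_off is a hypothetical register offset
 * within the 64 KB SLCR window. */
static inline volatile uint32_t *slcr_reg(uintptr_t gpu_base, uint32_t reg_off) {
    return (volatile uint32_t *)(gpu_base + GPU_SLCR_OFFSET + reg_off);
}
```

The `volatile` qualifier is the usual C idiom for memory-mapped registers: it prevents the compiler from caching or reordering the accesses.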

The organization of the GPU IP address map in this structured manner ensures that each workload—whether non-critical or safety-critical—has a dedicated and isolated address space, reducing interference and enhancing security. The reserved sections are intended to support future expansions, adding flexibility for evolving use cases in automotive, healthcare, robotics, and other demanding sectors.

The GPU can be accessed through one of the three dedicated AXI buses. The three 32-bit AXI4-Lite interfaces (AXI-A, AXI-B, and AXI-C) are routed through asynchronous bridges and the MMI-CSwitch/XMPU to the Versal AI Edge Series Gen 2 and Versal Prime Series Gen 2 processor cores. In addition, there are two ACE-Lite interfaces for IO coherency support.

The GPU incorporates a hardware-based partition manager, along with supplementary system components, to enable sophisticated resource allocation and workload isolation. This partition manager introduces the concept of access windows and hardware separation, allowing the GPU to be divided into multiple independent partitions, effectively creating multiple self-contained GPUs within a single chip.
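The access-window concept above can be illustrated with a minimal range check: a partition may only touch addresses inside its assigned window. The struct and function are assumptions for illustration; the hardware partition manager's actual programming interface is not described in this excerpt.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative access window: a [base, base + size) range assigned to
 * one GPU partition. */
typedef struct {
    uint32_t base;
    uint32_t size;
} access_window;

/* True if the access [addr, addr + len) lies entirely inside the window.
 * Written to avoid overflow in addr + len. */
static bool access_allowed(const access_window *w, uint32_t addr, uint32_t len) {
    return addr >= w->base &&
           len <= w->size &&
           addr - w->base <= w->size - len;
}
```

For example, a partition given the GPU_B window (0x00400000, 3072 KB) would pass the check for accesses inside that range and fail for any access that starts before it or spills past its end, which is the fault-containment behavior the partition manager enforces in hardware.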

Safety and Security Mechanisms

Error Correction and Recovery: ECC protects critical data paths, while built-in self-checks enable error detection and system recovery.

Partitioned Execution for Isolation: The vGPU partitions offer a unique fail-safe mechanism where a failure in one partition does not affect the performance or safety of other partitions, allowing continued operation for essential processes.

Fault Containment and Redundancy: Safety islands within the GPU architecture contain faults, providing redundancy paths that ensure fault isolation and functional integrity in critical tasks.

Performance Optimizations

DVFS (Dynamic Voltage and Frequency Scaling): Adjusts power consumption based on workload requirements, balancing performance and thermal output.
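A common DVFS policy is to scale the recent utilization into a frequency demand and pick the lowest operating point that satisfies it. The operating-point table below is invented for illustration; real frequency/voltage pairs come from the platform's power management firmware, not from this TRM excerpt.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical operating points: frequency (MHz) and supply voltage (mV). */
typedef struct { uint32_t freq_mhz; uint32_t mv; } opp;

static const opp opps[] = {
    { 300,  700 },
    { 600,  800 },
    { 900,  900 },
    { 1200, 1000 },
};

/* Pick the lowest operating point whose frequency covers the demand,
 * where demand = current frequency scaled by utilization (0-100%). */
static opp pick_opp(uint32_t cur_freq_mhz, uint32_t util_pct) {
    uint32_t demand = cur_freq_mhz * util_pct / 100u;
    for (size_t i = 0; i < sizeof opps / sizeof opps[0]; i++)
        if (opps[i].freq_mhz >= demand)
            return opps[i];
    return opps[sizeof opps / sizeof opps[0] - 1];  /* cap at max OPP */
}
```

Running at 1200 MHz with 50% utilization, for instance, yields a demand of 600 MHz, so the governor can drop to the 600 MHz / 800 mV point and save power without losing throughput.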

Workload Balancing: Flexible allocation of processing cores across graphics and compute tasks ensures efficient utilization and balanced resource management across safety-critical and non-critical workloads.

Figure 1. Flexible Partitioning

Supported APIs

  • OpenGL ES 3.2
  • Vulkan 1.3
  • Vulkan SC 1.0
  • OpenCL 3.0
  • OpenGL SC 2.0

SC stands for Safety Critical. Vulkan SC and OpenGL SC are safety-critical variants of Vulkan and OpenGL ES with enhanced functional safety features.

Both Vulkan SC and OpenGL SC target the automotive, avionics, medical, and energy markets.