Hardware Overview - 2022.1 English

Versal ACAP System Software Developers Guide (UG1304)


This section provides an overview of the Versal ACAP hardware view components.

Figure 1. Device-level Interconnect Architecture

Note: For more detailed information about the Versal ACAP hardware, refer to the Versal ACAP Technical Reference Manual (AM011).

Key Hardware Components

The following list describes the largest hardware view components:

AI Engine
The AI Engine contains a scalar unit, a vector unit, load units, and a memory interface. The scalar unit contains a 32-bit scalar RISC processor with register files for general purpose, pointer, configuration, and backup registers, and a 32x32-bit scalar multiplier. The AI Engine also supports non-linear functions including sine/cosine, square root, and inverse square root. Three address generator units (AGUs) are available: two dedicated as load units, and one dedicated as a store unit. The vector unit contains a 512-bit vector fixed-point/integer unit. Devices with AI Engines contain a single-precision floating-point vector unit. Devices with an AIE-ML contain a fixed-point vector unit also used for Bfloat16 and FP32 support. The vector units in both the AI Engine and AIE-ML support concurrent operation on multiple vector lanes.

Within each AI Engine is a dedicated, single-port, 16 KB program memory that is 128 bits wide and 1k deep. The program memory supports instruction compression and has ECC protection and reporting.
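
The stated program memory geometry can be checked with a little arithmetic. The following Python sketch is only an illustration: the constant names are made up for this example, and "1k deep" is assumed to mean 1024 entries.

```python
# Program memory geometry as stated in the text.
WIDTH_BITS = 128      # width of one program memory word
DEPTH_WORDS = 1024    # "1k deep", assumed to mean 1024 entries

total_bits = WIDTH_BITS * DEPTH_WORDS
total_bytes = total_bits // 8          # 8 bits per byte
total_kb = total_bytes // 1024         # KB taken as 1024 bytes

print(f"{total_bytes} bytes = {total_kb} KB")   # prints "16384 bytes = 16 KB"
```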

Application Processing Unit
The application processing unit (APU) consists of Cortex-A72 processor cores, L1/L2 caches, and related functionality. The Cortex-A72 cores and caches are part of the Arm MPCore IP.

Versal ACAP uses a dual-core Cortex-A72 processor system with a 1 MB L2 cache. The Cortex-A72 cores implement the Armv8 64-bit architecture. The Cortex-A72 MPCore does not have an integrated generic interrupt controller (GIC), so an external GIC IP is used. For more information, refer to "APU Processor Features" in Versal ACAP Technical Reference Manual (AM011).

AXI Interconnect
The advanced eXtensible interface (AXI) interconnect connects one or more memory-mapped AXI master devices to one or more memory-mapped peripheral devices. The AXI interfaces conform to the AMBA® AXI version 4 specifications from Arm, including the AXI4-Lite control register interface subset.
CCIX and PCIe Module
The cache coherent interconnect for accelerators (CCIX) and PCIe® module (CPM) is the primary PCIe interface for the processing system. There are two integrated blocks for PCIe in the CPM, supporting up to Gen4 x16. You can configure both integrated blocks for PCIe as endpoints. Furthermore, you can configure each integrated block as a root port that contains a direct memory access (DMA) controller. The CPM CCIX functionality allows a PL accelerator to act as a CCIX-compliant accelerator.
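
To give a rough sense of scale for "Gen4 x16", the sketch below estimates the raw link bandwidth in one direction. The 16 GT/s per-lane rate and the 128b/130b line encoding come from the PCIe 4.0 specification, not from this guide, and the function name is made up for the example. The result ignores packet and protocol overhead, so treat it as an upper bound and not as a CPM performance figure.

```python
GT_PER_S_PER_LANE = 16e9           # PCIe Gen4 signaling rate per lane
ENCODING_EFFICIENCY = 128 / 130    # 128b/130b line encoding
LANES = 16                         # x16 link

def link_bandwidth_gbytes(lanes: int = LANES) -> float:
    """Raw one-direction bandwidth in GB/s (10**9 bytes/s), before protocol overhead."""
    bits_per_s = GT_PER_S_PER_LANE * ENCODING_EFFICIENCY * lanes
    return bits_per_s / 8 / 1e9

print(f"x16: {link_bandwidth_gbytes():.1f} GB/s per direction")   # x16: 31.5 GB/s per direction
```
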
Programmable Logic
The programmable logic (PL) is a scalable structure that includes adaptable engines and intelligent engines that can be used to construct accelerators, processors, or almost any other complex functionality. It is configured using the Vivado® tools. The architect determines the components to be available in the PL design. For example, the MicroBlaze processor is an IP core, so you can optionally add MicroBlaze processors to the design. For more information on the PL, see MicroBlaze Processor Reference Guide (UG984).
Platform Management Controller
The platform management controller (PMC) handles device management control functions such as device reset sequencing, initialization, boot, configuration, security, power management, dynamic function eXchange (DFX), health monitoring, and error management. You can boot the device in either secure or non-secure mode. For more information, refer to "Platform Management Controller" in Versal ACAP Technical Reference Manual (AM011).
NoC Interconnect
The NoC is the main interconnect and contains a vertical component (VNoC) and a horizontal component (HNoC).
  • HNoC is integrated in the horizontal super row/region (HSR). The HSR includes blocks such as XPIO, hard DDR memory controller, PLL, HBM, and AI Engine.
  • VNoC integration includes the global-clk-column. In SSI technology, VNoCs are connected across super logic region (SLR) boundaries. Microbumps and buffers for this reside in the Thin-HNoC. Configuration data between SSI technology master and slaves travels over the NoC.
Real-time Processing Unit
The real-time processing unit (RPU) is a dual-core Cortex-R5F processor, based on the Armv7-R architecture with a floating-point unit, which can run as either two independent cores or in a lock-step configuration. For more information, refer to Platform Management in Versal ACAP Technical Reference Manual (AM011).
System Memory Management Unit
The system memory management unit (SMMU) supports memory virtualization for peripherals. The main functions of the SMMU include logical memory protection by performing address translation, transaction security state control, as well as blocking peripherals if configured to do so.

These functions are performed with a combination of the seven translation buffer units (TBU 0 to 6). Four of these are in the path of incoming AXI interfaces outside of the FPD to the CCI. The translation and protection tables that are cached in the TBU are updated by the SMMU translation control unit (TCU).
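
The relationship between a TBU (which caches translations) and the TCU (which supplies them) can be pictured with a toy model. The Python sketch below is a conceptual illustration only: the class names, the single-level table, and the 4 KB page size are assumptions made for the example, and it does not model stream IDs, security state, or the real SMMU register interface.

```python
class TranslationControlUnit:
    """Toy TCU: owns the full translation table (input page -> output page)."""

    def __init__(self, table):
        self.table = dict(table)
        self.walks = 0                 # number of table lookups performed

    def walk(self, page):
        self.walks += 1
        return self.table.get(page)    # None means "no mapping"


class TranslationBufferUnit:
    """Toy TBU: caches translations and asks the TCU only on a miss."""

    PAGE_SIZE = 4096                   # assumed page size for the illustration

    def __init__(self, tcu):
        self.tcu = tcu
        self.cache = {}

    def translate(self, addr):
        page, offset = divmod(addr, self.PAGE_SIZE)
        if page not in self.cache:
            out_page = self.tcu.walk(page)
            if out_page is None:
                # An unmapped transaction is blocked rather than forwarded.
                raise PermissionError(f"transaction to {addr:#x} blocked")
            self.cache[page] = out_page
        return self.cache[page] * self.PAGE_SIZE + offset


tcu = TranslationControlUnit({0x10: 0x80})
tbu = TranslationBufferUnit(tcu)
print(hex(tbu.translate(0x10004)))     # 0x80004 (miss: TCU fills the TBU cache)
print(hex(tbu.translate(0x10008)))     # 0x80008 (hit: served from the TBU cache)
print("table walks:", tcu.walks)       # table walks: 1
```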

For more information on the SMMU, see Chapter 43 in the Versal ACAP Technical Reference Manual (AM011).

Cache Coherent Interconnect
The cache coherent interconnect (CCI) is based on the Arm CCI-500 with its snoop filter (SF) table feature. It provides tight memory coherency between the APU L2 cache and a PL system cache using the ACE interface protocol to support multiple heterogeneous processing environments. It is part of the FPD interconnect.

For more information on the CCI, see Chapter 44 in the Versal ACAP Technical Reference Manual (AM011).

Additional Hardware Components

Peripheral Controllers
The input/output (I/O) peripherals are present in the low power domain (LPD) and the PMC domain (PPD). The flash memory controllers (FMC) are located in the PMC. Their I/O signals are routed to device pins via the PMC MIO multiplexer.

For more information, refer to the I/O Peripherals and FMC sections in Versal ACAP Technical Reference Manual (AM011).

Interconnects and Buses
Versal ACAP has the following additional interconnects and buses:
The NoC programming interface, a 32-bit programming interface to the NoC and several attached units.

For more information, refer to Versal ACAP Programmable Network on Chip and Integrated Memory Controller LogiCORE IP Product Guide (PG313).

The advanced peripheral bus (APB) is a 32-bit single-word read/write programming interface. This bus is used to access control registers in the functional units, i.e., subsystem units. These control registers are used to program the functional units. The APB switch is used as the interconnect switch in the following four areas:
  • PMC
  • LPD
  • FPD
  • CPM
The configuration frame interface (CFI) transports PL and integrated hardware configuration information contained in the boot image from the PMC to its destination within the Versal device. CFI provides a dedicated high-bandwidth 128-bit bus to PL for configuration and readback. For more information, refer to the Programming Interfaces chapter in Versal ACAP Technical Reference Manual (AM011).
System Watchdog Timer
The system watchdog timer (SWDT) is used to detect and recover from various malfunctions. It can be used to prevent system lockup, such as when the software becomes trapped in a deadlock. For more information, refer to "System Watchdog Timer" in Versal ACAP Technical Reference Manual (AM011).
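
The general watchdog idea (software must periodically restart a countdown, and expiry signals a lockup) can be sketched as follows. This is a generic toy model in Python, not the SWDT register interface: the class, method names, and tick-based timing are all assumptions made for illustration.

```python
class Watchdog:
    """Toy countdown watchdog; a hypothetical model, not the SWDT hardware."""

    def __init__(self, timeout_ticks):
        self.timeout_ticks = timeout_ticks
        self.remaining = timeout_ticks
        self.expired = False

    def kick(self):
        """Called by healthy software to restart the countdown."""
        if not self.expired:
            self.remaining = self.timeout_ticks

    def tick(self):
        """Advance time by one tick; flag expiry when the count reaches zero."""
        if self.expired:
            return
        self.remaining -= 1
        if self.remaining == 0:
            self.expired = True


wdt = Watchdog(timeout_ticks=3)
for step in range(10):
    if step < 4:       # software is healthy for the first four steps
        wdt.kick()
    wdt.tick()         # afterwards it "hangs" and stops kicking
print("expired:", wdt.expired)   # expired: True
```

In a real system, expiry would trigger a recovery action such as an interrupt or reset instead of setting a flag.
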
Clocks
Versal ACAP has the following clocks:
  • PMC and PS clocks
  • CPM clocks
  • NoC, AI Engine, and DDR memory controller clocks
  • PL clocks: The PL includes its own clock arrays that are programmed when blocks are instantiated. The PL also includes programmable clock modules that can be driven by clocks from input pins and other sources.

For more information, see the Versal ACAP Clocking Resources Architecture Manual (AM003) and Versal ACAP Technical Reference Manual (AM011).

Memories
The Versal device has the following memories:
DDR memory
Up to 4096 GB of RAM is supported. This DDR memory is external to the device.
On-chip memory (OCM) in the PS
This memory is 256 KB in size, and is accessible to the RPU and APU processors via the LPD OCM interconnect switch.
Accelerator RAM
The 4 MB accelerator RAM (XRAM) is available in some Versal® AI Core series devices. The XRAM is divided into four separate memory banks with four system interfaces: an AXI port from the LPD PS and three PL AXI ports.

The XRAM supports simultaneous access by each port to its associated bank. It also allows full cross-bank access from any port to any bank. For details, refer to the XRAM Memory chapter in the Versal ACAP Technical Reference Manual (AM011).
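
As a quick illustration of the bank arithmetic, the Python sketch below splits the 4 MB XRAM across its four banks. It assumes equal-size banks laid out contiguously (no interleaving), which this guide does not state, and the function name is hypothetical; consult the XRAM Memory chapter for the actual address map.

```python
XRAM_SIZE = 4 * 1024 * 1024          # 4 MB, from the text
NUM_BANKS = 4                        # from the text
BANK_SIZE = XRAM_SIZE // NUM_BANKS   # assumes equal-size banks

def bank_for_offset(offset: int) -> int:
    """Bank index for a byte offset into the XRAM, assuming contiguous banks."""
    if not 0 <= offset < XRAM_SIZE:
        raise ValueError("offset outside XRAM")
    return offset // BANK_SIZE

print(BANK_SIZE // 1024, "KB per bank")                       # 1024 KB per bank
print(bank_for_offset(0x000000), bank_for_offset(0x100000),
      bank_for_offset(0x3FFFFF))                              # 0 1 3
```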

Tightly coupled memory (TCM) in the RPU
This memory is 256 KB and is mainly used by the RPU but can be accessed by the APU.
Battery-backed RAM (BBRAM)
This memory can store the advanced encryption standard (AES) 256-bit key.
Contains user memory to store multiple keys and security configuration settings.
Resets
Versal ACAP has several layers of resets with overlapping effects. The highest-level resets are generally aligned with power domains, then power island resets, and finally the individual functional unit resets. In some cases, functional units have local resets that affect part of the block. The reset hierarchy is as follows:
  • Subsystem resets (power domains)
  • Power-island resets
  • Functional unit (block) resets
  • Partial resets of a block (some cases)
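
The layering above can be pictured as a tree in which resetting a node also resets everything beneath it. The Python sketch below is a simplified model under that assumption: the node names are hypothetical, and a strict tree does not capture every overlapping effect the real reset logic may have.

```python
from dataclasses import dataclass, field

@dataclass
class ResetNode:
    """One level in a hypothetical reset hierarchy."""
    name: str
    children: list = field(default_factory=list)
    reset_count: int = 0

    def add(self, child):
        self.children.append(child)
        return child

    def reset(self):
        """Reset this node and, recursively, everything beneath it."""
        self.reset_count += 1
        for child in self.children:
            child.reset()

# Hypothetical names: one power domain, one power island, two functional blocks.
domain = ResetNode("power_domain")
island = domain.add(ResetNode("power_island"))
block_a = island.add(ResetNode("block_a"))
block_b = island.add(ResetNode("block_b"))

block_a.reset()   # block-level reset touches only block_a
domain.reset()    # subsystem reset propagates to the island and both blocks
print(block_a.reset_count, block_b.reset_count, island.reset_count)   # 2 1 1
```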

For more information, refer to the "Resets" chapter in Versal ACAP Technical Reference Manual (AM011).

Virtualization
The Versal device includes the following hardware components for virtualization:
  • CPU virtualization
  • Memory virtualization

For more information, refer to "Memory Virtualization" in Versal ACAP Technical Reference Manual (AM011).

Security and Safety
The Versal device has the following security management and safety features:
  • Secure key storage and management
  • Tamper monitoring and response
  • User access to Xilinx hardware cryptographic accelerators
  • Xilinx memory protection unit (XMPU) and Xilinx peripheral protection unit (XPPU) for hardware-enforced isolation
  • TrustZone

For more information, refer to "Platform Management Controller" in Versal ACAP Technical Reference Manual (AM011), the Security chapter, and Versal ACAP Security Manual (UG1508). The security manual (UG1508) requires an active NDA to download from the Design Security Lounge.

For XMPU and XPPU, refer to "Memory Protection" in Versal ACAP Technical Reference Manual (AM011).