Processing System Performance - UG1192

AMD Design Conversion for Altera FPGAs and SoCs Methodology Guide (UG1192)

Document ID
UG1192
Release Date
2025-07-15
Revision
3.0.1 English

As the following table shows, the Zynq UltraScale+ MPSoC handily outperforms the Agilex 3C without needing the high-reliability real-time cores. Compared with the Agilex 5’s Cortex-A76, the Zynq UltraScale+ MPSoC has slightly lower raw performance; however, other considerations make MPSoCs a superior choice.

Neither the Cortex-A53 nor the Cortex-A76 was designed for heavy number crunching. Industrial applications use processors for control-plane actions, for which the CPUs in the MPSoC are almost always more than enough. Additionally, other resources in the Zynq UltraScale+ MPSoC make it possible to add processor accelerators in the PL, an area where the Agilex 5 has limited resources.

Similarly, the Agilex Cortex-A55 is not a real-time processor like the Cortex-R5F. While the Cortex-A55 does have a computational advantage, its availability depends on the big.LITTLE configuration. Additionally, the Cortex-R5F can run in lockstep, providing a much higher level of robustness than the Cortex-A55s.

Table 1. Zynq UltraScale+ MPSoC vs. Agilex 5 or Agilex 3C: PS/HPS Performance

| | Zynq UltraScale+ MPSoC (APU) | Zynq UltraScale+ MPSoC (RPU) | Agilex 5 | Agilex 5 | Agilex 3C |
|---|---|---|---|---|---|
| Features/Performance | Cortex-A53 | Cortex-R5F (Lockstep) | Cortex-A76 | Cortex-A55 | Cortex-A55 |
| F MAX (MHz) | 1500 | 600 | 1800 | 1500 | 1500 |
| Number of Cores | 4 | 2 | 2 | 2 | 2 |
| DMIPS/MHz/Core | 3.13 | 1.67 | 6.8 | 3.0 | 3.0 |
| Total DMIPS (K) | 18.8 | 2.0 | 24.5 | 9 | 9 |
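
The Total DMIPS (K) row in Table 1 is consistent with multiplying the three rows above it: F MAX × number of cores × DMIPS/MHz/core, divided by 1000. The guide does not state this formula, so treat the following as a minimal sketch under that assumption. The function name `total_kdmips` and the row labels are illustrative only, and the product assumes ideal linear scaling across cores.

```python
# Rows transcribed from Table 1:
# (label, F MAX in MHz, number of cores, DMIPS/MHz/core)
TABLE_1 = [
    ("Zynq UltraScale+ MPSoC APU (Cortex-A53)", 1500, 4, 3.13),
    ("Zynq UltraScale+ MPSoC RPU (Cortex-R5F)", 600, 2, 1.67),
    ("Agilex 5 (Cortex-A76)", 1800, 2, 6.8),
    ("Agilex 5 (Cortex-A55)", 1500, 2, 3.0),
    ("Agilex 3C (Cortex-A55)", 1500, 2, 3.0),
]


def total_kdmips(f_max_mhz, cores, dmips_per_mhz_per_core):
    """Aggregate Dhrystone throughput in thousands of DMIPS.

    Assumption (not stated in the guide): the total is the simple
    product of clock rate, core count, and per-core efficiency.
    """
    return f_max_mhz * cores * dmips_per_mhz_per_core / 1000.0


for label, f_max, cores, per_core in TABLE_1:
    print(f"{label:42s} {total_kdmips(f_max, cores, per_core):5.1f} K DMIPS")
```

Because the result depends equally on all three inputs, a device with a lower per-core figure can still post a competitive total through a higher core count, which is the case for the four-core Cortex-A53 APU.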

This section questions whether raw processing power is the answer to all computational ills. Many aspects of choosing a device must be considered.

  • The Cortex-A53, Cortex-A76, and Cortex-A55 were not designed for computationally rigorous tasks. The Cortex-A78AE processors in the Versal AI Edge Series Gen 2 and Versal Prime Series Gen 2 devices are intended for control-plane applications (that is, as control processors) capable of general housekeeping, managing communications, device control and monitoring, and light computation. Where processing requirements exceed what any of these processors can deliver, designers can also incorporate an AMD x86 processor with high-bandwidth PCIe connections to the Zynq UltraScale+ MPSoC APU or FPGA device.
  • Does the design running on a slower, typically lower power, processor meet the performance requirements?
  • Does part of the design require isolation for security or design robustness?
  • Are multiple cores required? Multiple cores enable you to segment your design into secure and non-secure regions and improve support for virtual machines and SMP environments.
  • Can computational heavy lifting be off-loaded to a PL accelerator? Because the Zynq UltraScale+ MPSoC has more bandwidth between the PS and PL, implementing co-processors or accelerators in the device’s programmable logic can greatly improve system performance in computationally demanding applications.
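
One rough way to reason about the accelerator question is Amdahl's law: if a fraction of the workload is moved to a PL accelerator that runs it some factor faster, the rest of the software still runs at CPU speed and bounds the overall gain. The sketch below is not from the guide. The function name `system_speedup` and the example numbers (80% offloaded, 20× accelerator) are hypothetical, and the model ignores PS-PL data-transfer overhead, which a real estimate would need to include.

```python
def system_speedup(offload_fraction, accel_speedup):
    """Estimate whole-application speedup using Amdahl's law.

    offload_fraction: share of the original run time moved to the
                      accelerator (0.0 to 1.0). Hypothetical input.
    accel_speedup:    how many times faster the accelerator runs that
                      share than the CPU did. Hypothetical input.
    Transfer overhead between PS and PL is deliberately ignored.
    """
    if not 0.0 <= offload_fraction <= 1.0:
        raise ValueError("offload_fraction must be between 0 and 1")
    if accel_speedup <= 0:
        raise ValueError("accel_speedup must be positive")
    remaining_cpu_time = 1.0 - offload_fraction
    accelerated_time = offload_fraction / accel_speedup
    return 1.0 / (remaining_cpu_time + accelerated_time)


# Hypothetical example: 80% of the work offloaded to a 20x accelerator.
print(f"{system_speedup(0.8, 20.0):.2f}x overall speedup")
```

Under these assumptions, the offloaded fraction matters more than the accelerator's peak speed: with 80% offloaded, the overall gain can never exceed 5×, however fast the accelerator is.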

These design considerations acknowledge that while raw compute power is often important, there are other situations, design considerations, and mitigation techniques that can take precedence.

Table 2. Versal Architecture Cortex-A72 Based vs. Agilex 5 or Agilex 3C: PS Performance

| | Versal Architecture with Cortex-A72 (APU) | Versal Architecture with Cortex-R5F (RPU) | Agilex 5 | Agilex 5 | Agilex 3C |
|---|---|---|---|---|---|
| Features/Performance | Cortex-A72 | Cortex-R5F (Lockstep) | Cortex-A76 | Cortex-A55 | Cortex-A55 |
| F MAX (MHz) | 1700 | 800 | 1800 | 1500 | 800 |
| Number of Cores | 2 | 2 | 2 | 2 | 2 |
| DMIPS/MHz/Core | 5.76 | 1.67 | 6.8 | 3.0 | 3.0 |
| Total DMIPS (K) | 19.5 | 2.67 | 24.5 | 9 | 4.8 |

Key advantages of the Versal architecture over Agilex 5 are as follows:

Real-time and Safety-Critical Capabilities
The Cortex-R5F in Versal devices supports lockstep mode, achieving ASIL-C, whereas the Cortex-A76/Cortex-A55 cores in Agilex are limited to ASIL-B. This makes Versal devices more suitable for automotive, aerospace, and industrial safety applications.
Deterministic Real-time Processing
The Cortex-R5F provides low-latency, real-time execution, crucial for control loops and safety mechanisms.
Heterogeneous Acceleration
Versal devices enable seamless integration of offload accelerators in programmable logic (PL) or AI Engines (AIE). This can compensate for or surpass the raw CPU performance difference.
High-Speed, Configurable NoC
The Versal device network-on-chip (NoC) provides fast, predictable data movement, optimizing performance for data-intensive applications.

For these reasons, the performance advantage of the Agilex 5’s Cortex-A76 becomes less significant when considering the Versal architecture’s accelerator capabilities, real-time processing, and advanced interconnect. Further discussion of the architectural advantages of Versal devices is in the Versal Architecture Specific Architectural Features for Migration and PS-PL Connections sections of this guide.