For a system composed of AI Engines, PL, and PS, ensure that your design meets these prerequisites for performance analysis:
- Hardware
- Use PL IP, such as the AXI Performance Monitor (APM) IP or RTL/HLS kernels, to calculate throughput and latency in the system and measure cycle-accurate performance.
- Software
- Run the Linux operating system (OS) so that you can use the profiling APIs provided by aiecompiler, which are available as part of the software platform. For more information on enabling Linux and building collaterals, and for details on the profiling APIs, see the AI Engine Tools and Flows User Guide (UG1076).
- Use a PS application running on the Arm® Cortex®-A72 to coordinate the entire system (for example, to start data traffic from PL/DDR and to start counters). The PS application must be cross-compiled with the GCC/G++ toolchain for the Cortex-A72. The PS application can also read out RTL/HLS/APM counters to measure throughput and latency.
- After the design is implemented on silicon, use the Linux OS to take advantage of profiling application programming interfaces (APIs) provided by aiecompiler.