The goal of the first stage is to determine whether the design is functionally correct and can run successfully in hardware. You can then analyze the results and, if necessary, proceed to the next stage for further debug and analysis.
The following figure shows the tasks and techniques available in this stage.
This stage helps you determine whether the design and host application run successfully in hardware. You can use APIs in your host application to profile the design as it runs on hardware and determine whether it meets its throughput, latency, and bandwidth goals. In addition, you can troubleshoot AI Engine stalls and deadlocks using reports generated while the design runs in hardware.
The sections below list the techniques available in this stage.
Error Handling and Reporting in Host Application in Hardware
On Linux, XRT provides error reporting APIs. Use these APIs in your host application to catch and report errors so that you can get to the root cause of an issue. For details on the XRT error reporting APIs, see Error Reporting Through the XRT API. To examine error messages reported by the AI Engine array, enable and examine the dmesg logs. Details on AI Engine array-specific error handling can be found in AI Engine Error Events.
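The following is a minimal sketch of querying asynchronous AI Engine errors with the XRT native C++ error reporting API from the host application. The header path, the function name, and the point at which the query is made are assumptions; adapt them to your application and XRT release.

    #include <iostream>
    #include "xrt/xrt_device.h"
    #include "experimental/xrt_error.h"   // header location can vary between XRT releases

    // Query and print the most recent asynchronous error recorded for the
    // AI Engine error class on the given device.
    void report_aie_errors(const xrt::device& device)
    {
        xrt::error err(device, XRT_ERROR_CLASS_AIE);
        if (err.get_error_code()) {
            std::cout << "AI Engine error (timestamp " << err.get_timestamp()
                      << "): " << err.to_string() << std::endl;
        }
    }

Call a function such as this after the graph or kernels have run (for example, after a graph wait() call returns or times out), and cross-check its output against the dmesg logs described above.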
Next stage: Proceed to stage 5 if you determine that the host application needs to be debugged further.
Analyzing Design Stalls in Hardware
If you encounter design stalls in hardware on Linux, track the status of the AI Engine and PL kernels in the design using the XRT xbutil utility. For more information on how to use the utility to generate a report on the current design status and to visualize the results in Vitis Analyzer, see Analyzing AI Engine Status in Hardware.
You can also manually generate the report in XSDB and visualize the results in Vitis Analyzer. For more details, see Generating AI Engine Status. Examples of deadlock visualizations in Vitis Analyzer are shown in Figure 1 and Figure 2.
Next stage: Proceed to stage 2 if you determine that the design is stalled and further details are needed to get to the root cause of the stall.
Reporting Design Throughput, Latency, and Bandwidth in Hardware
You can also determine the AI Engine graph throughput, latency, and bandwidth by profiling the graph inputs and outputs using APIs in the host application. Give careful consideration to where in the host application the profiling APIs are started and stopped, because this determines what is measured. For details on how to use the APIs in the host application, see Event Profile APIs for Graph Inputs and Outputs.
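The following is a minimal sketch of measuring output throughput with the event profile APIs. The graph header, graph class and instance names, output PLIO object, byte count, iteration count, and AI Engine clock frequency are assumptions for illustration; substitute the names and values from your design.

    #include <iostream>
    #include "graph.h"                     // ADF graph header generated for your design (assumed name)
    using namespace adf;

    extern my_graph gr;                    // graph instance (assumed name)
    extern adf::output_plio plout0;        // output PLIO of the design (assumed name)

    void profile_output_throughput(unsigned int out_bytes, int iterations)
    {
        // Count cycles from the first stream start until out_bytes have been transferred
        event::handle h = event::start_profiling(
            plout0,
            event::io_stream_start_to_bytes_transferred_cycles,
            out_bytes);

        if (h == event::invalid_handle) {
            std::cerr << "ERROR: event::start_profiling failed" << std::endl;
            return;
        }

        gr.run(iterations);
        gr.wait();

        long long cycles = event::read_profiling(h);
        event::stop_profiling(h);

        const double aie_clk_hz = 1.25e9;  // assumed AI Engine clock frequency; use your platform's value
        double throughput_MBps = (double)out_bytes * aie_clk_hz / (double)cycles / 1.0e6;
        std::cout << "Output throughput: " << throughput_MBps << " MB/s" << std::endl;
    }

Because profiling starts before gr.run() and the counter is read after gr.wait(), the measurement covers the full run; moving the start or read points changes what is measured, which is why the placement of these calls in the host application matters.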
Next stage: Proceed to stage 2 if you determine that the design has subpar throughput or latency, or otherwise does not meet its performance goals. In stage 2, you can pinpoint the kernel or I/O that might be contributing to the performance drop.