The goal of the first stage is to determine whether the design is functionally correct and can run successfully in hardware. You can also analyze the results and proceed to the next stage for further debugging and analysis.
The following figure shows the tasks and techniques available in this stage.
This stage can help you determine whether the design and host application run successfully in hardware. You can use APIs in your host application to profile the design as it runs on hardware, and determine whether the design meets its throughput, latency, and bandwidth goals. In addition, you can troubleshoot AI Engine stalls and deadlocks using reports generated when running the design in hardware.
The sections below list the techniques available in this stage.
Error Handling and Reporting in Host Application in Hardware
On Linux, XRT provides error reporting APIs. Use these APIs in your host application to catch and report errors so that you can get to the root cause of an issue. For details on the XRT error reporting APIs, see AI Engine Error Reporting. To examine error messages reported by the AI Engine array, enable and examine the dmesg logs. Details on AI Engine array-specific error handling can be found in AI Engine Error Events.
Next stage: Proceed to stage 5 if you determine that the host application needs to be debugged further.
Analyzing Design Stalls in Hardware
If you encounter design stalls in hardware on Linux, track the status of the AI Engine and PL kernels in the design using the XRT xbutil utility. For more information on how to use the utility to generate a report on the current design status and to visualize the results in the Vitis IDE, see Analyzing AI Engine Status in Hardware.
You can also manually generate the report in XSDB and visualize the results in the Vitis IDE. For more details, see Generating AI Engine Status. Some examples of deadlock visualizations in the Vitis IDE are found in xrz1645747295229.html#xrz1645747295229__fig_f23_n35_ctb and xrz1645747295229.html#xrz1645747295229__fig_h3j_hys_5sb.
Next stage: Proceed to stage 2 if you determine that the design is stalled and further details are needed to get to the root cause of the stall.
Reporting Design Throughput, Latency, Bandwidth in Hardware
You can determine the AI Engine graph throughput, latency, and bandwidth by profiling the graph inputs and outputs through APIs in the host application. Carefully consider when profiling is started and stopped in the host application, because the start and stop points determine what the measurement covers. For details on how to use the APIs in the host application, see Event Profile APIs for Graph Inputs and Outputs.
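Once the host application has read a cycle count back from profiling, it still has to be converted into throughput or latency figures. The following is a minimal sketch of that arithmetic only and does not call the profiling APIs themselves. The helper names `throughput_mbps` and `latency_ns` are hypothetical, and the AI Engine clock frequency is an assumption that the caller must supply for the target device.

```cpp
#include <cstdint>

// Hypothetical helper: convert a profiled cycle count into throughput in MB/s
// (1 MB = 1e6 bytes).
//   bytes    - amount of data transferred while profiling was active
//   cycles   - AI Engine cycle count read back by the host application
//   clock_hz - AI Engine clock frequency in Hz (assumption: supplied by caller)
double throughput_mbps(std::uint64_t bytes, std::uint64_t cycles, double clock_hz) {
    if (cycles == 0 || clock_hz <= 0.0) {
        return 0.0;  // nothing was measured, or the clock value is invalid
    }
    const double seconds = static_cast<double>(cycles) / clock_hz;
    return static_cast<double>(bytes) / seconds / 1e6;
}

// Hypothetical helper: convert a cycle difference (for example, between the
// first input sample and the first output sample) into nanoseconds.
double latency_ns(std::uint64_t cycles, double clock_hz) {
    if (clock_hz <= 0.0) {
        return 0.0;  // invalid clock value
    }
    return static_cast<double>(cycles) / clock_hz * 1e9;
}
```

For example, assuming a 1.25 GHz AI Engine clock, `throughput_mbps(4096, 1280, 1.25e9)` evaluates to approximately 4000 MB/s, and `latency_ns(100, 1.25e9)` to approximately 80 ns. The result is only meaningful if `bytes` matches the data that actually moved between the points where profiling was started and read.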
Next stage: Proceed to stage 2 if you determine that the design has sub-par throughput or latency, or the design does not meet the performance goal. In stage 2, you can pinpoint the kernel or I/O that might be contributing to the performance drop.