A system-level view of program execution can help identify problems that arise at run time, including functional correctness and performance-related issues. While system-level hardware and software emulation help during the development phase of the system, Xilinx also offers flows for functional and performance debug in hardware. The AI Engine architecture supports the generation, collection, and streaming of profile-related data, and of events as trace data, during hardware execution.