Profiling using C++ Class API - 2023.2 English

Vitis Tutorials: AI Engine (XD100)

The code that uses the C++ class API is the same on Linux systems across the supported platforms. The Timer class is defined as follows:

#include <chrono>
class Timer {
    std::chrono::high_resolution_clock::time_point mTimeStart;
public:
    Timer() { reset(); }
    // Returns the elapsed time in microseconds since construction or the last reset().
    long long stop() {
        std::chrono::high_resolution_clock::time_point timeEnd = std::chrono::high_resolution_clock::now();
        return std::chrono::duration_cast<std::chrono::microseconds>(timeEnd - mTimeStart).count();
    }
    void reset() { mTimeStart = std::chrono::high_resolution_clock::now(); }
};
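
The C++ standard does not require std::chrono::high_resolution_clock to be monotonic. On some implementations it is an alias for std::chrono::system_clock, which can jump when the system time is adjusted. The following is a minimal sketch of an alternative that keeps the same interface but uses std::chrono::steady_clock. It is not part of the tutorial sources, and the class name SteadyTimer is hypothetical.

```cpp
#include <chrono>

// Hypothetical variant of the tutorial's Timer (not in the tutorial sources).
// It uses steady_clock, which the standard guarantees to be monotonic.
class SteadyTimer {
    std::chrono::steady_clock::time_point mTimeStart;

public:
    SteadyTimer() { reset(); }

    // Elapsed time in microseconds since construction or the last reset().
    long long stop() const {
        const auto timeEnd = std::chrono::steady_clock::now();
        return std::chrono::duration_cast<std::chrono::microseconds>(timeEnd - mTimeStart).count();
    }

    // Restart the measurement from the current time.
    void reset() { mTimeStart = std::chrono::steady_clock::now(); }
};
```

Because stop() only reads the clock, it can be called more than once to take intermediate readings. Call reset() to start a new measurement.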

The code to start profiling is as follows:

Timer timer;

The code to end profiling and calculate performance is as follows:

double timer_stop=timer.stop();
double throughput=(BLOCK_SIZE_in_Bytes+BLOCK_SIZE_out_Bytes)*NUM/timer_stop;
std::cout<<"Throughput (by timer GMIO in num="<<num<<",out num="<<num<<"):\t"<<throughput<<"M Bytes/s"<<std::endl;
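
Because timer.stop() returns microseconds, dividing a byte count by it gives bytes per microsecond, which is numerically the same as 10^6 bytes per second. That is why the result is printed as M Bytes/s without further scaling. The following is a minimal sketch of the same arithmetic as a standalone function. The function name and parameter names are hypothetical and only mirror the variables in the snippet above.

```cpp
#include <cstdint>

// Hypothetical helper mirroring the formula in the tutorial snippet.
// bytes / microseconds == 10^6 bytes / second, so the result is in MB/s.
// 'count' plays the role of the multiplier NUM in the snippet.
double throughputMBytesPerSec(std::uint64_t bytesIn, std::uint64_t bytesOut,
                              std::uint64_t count, long long elapsedUs) {
    if (elapsedUs <= 0) {
        return 0.0; // guard against division by zero for very short runs
    }
    const double totalBytes =
        static_cast<double>(bytesIn + bytesOut) * static_cast<double>(count);
    return totalBytes / static_cast<double>(elapsedUs);
}
```

For example, 1024 bytes in and 1024 bytes out with a multiplier of 4, completed in 8 microseconds, gives (1024 + 1024) * 4 / 8 = 1024 MB/s.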

The profiling code is guarded by the macro __TIMER__. To use this profiling method, define __TIMER__ for the g++ cross compiler in sw/Makefile:

CXXFLAGS += -std=c++17 -D__TIMER__ ......
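
A macro guard of this kind is typically written with #ifdef, so that the timing code is compiled only when -D__TIMER__ is passed to the compiler. The following is a minimal sketch of the pattern, not the tutorial's host code. The function timeIfEnabled and its -1 return value are hypothetical.

```cpp
#include <chrono>

// Hypothetical helper: always runs work(). When __TIMER__ is defined it
// returns the elapsed time in microseconds; otherwise it returns -1.
template <typename Work>
long long timeIfEnabled(Work work) {
#ifdef __TIMER__
    const auto start = std::chrono::high_resolution_clock::now();
    work();
    const auto end = std::chrono::high_resolution_clock::now();
    return std::chrono::duration_cast<std::chrono::microseconds>(end - start).count();
#else
    work();
    return -1; // profiling code compiled out
#endif
}
```

With this structure the workload runs in both builds, and only the clock reads depend on the compiler flag.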

To run it in hardware, use the following make command to build the hardware image:

make package TARGET=hw

After packaging completes, boot Linux from the SD card, log in with petalinux/petalinux, and run the following commands at the Linux prompt:

cd /run/media/mmcblk0p1
./host.exe a.xclbin

The output in hardware is similar to the following:

Throughput (by timer GMIO in num=4,out num=4):9882.79M Bytes/s