Graph throughput can be defined as the average number of bytes produced (or
consumed) per second. The following example shows how to profile graph throughput
using the event API. In the example, gr is the application graph object, plio_out
is the PLIO object connected to the graph output port, and the graph is designed
to produce 256 int32 values in eight iterations.
gr.init();
// Count cycles from the stream start until 256*sizeof(int32) bytes have been transferred.
event::handle handle = event::start_profiling(plio_out, event::io_stream_start_to_bytes_transferred_cycles, 256*sizeof(int32));
gr.run(8);
gr.wait();
long long cycle_count = event::read_profiling(handle);
event::stop_profiling(handle);
// Bytes per second, assuming a 1 GHz AI Engine clock (one cycle = 1e-9 s).
double throughput = (double)256 * sizeof(int32) / (cycle_count * 1e-9);
In the example, after the graph is initialized, event::start_profiling is called
to configure the AI Engine to count the clock cycles from the stream start event
to the event that indicates 256 × sizeof(int32) bytes have been transferred. This
assumes that the stream stops right after the specified number of bytes is
transferred; if the stream continues beyond that point, the counter keeps running
and never stops. The first argument to event::start_profiling is plio_out, the
second is event::io_stream_start_to_bytes_transferred_cycles, and the third
specifies the number of bytes to be transferred before the counter stops. The
graph throughput is derived by dividing the total number of bytes produced in
eight iterations (256 × sizeof(int32)) by the time elapsed between the first and
the last output data (cycle_count × 1e-9 seconds, assuming the AI Engine is
running at 1 GHz).
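
The throughput calculation assumes a 1 GHz clock. As a minimal sketch, the same arithmetic can be wrapped in a small helper that takes the clock frequency as a parameter, which may be convenient when the AI Engine runs at a different frequency. The helper name throughput_bytes_per_sec and its parameters are hypothetical and are not part of the event API; the cycle count is assumed to come from event::read_profiling.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not part of the event API): converts a profiled
// cycle count into bytes per second for a given AI Engine clock frequency.
inline double throughput_bytes_per_sec(std::size_t bytes_transferred,
                                       long long cycle_count,
                                       double clock_hz) {
    // A zero or negative count means profiling produced no usable result.
    assert(cycle_count > 0 && clock_hz > 0.0);
    // Elapsed time in seconds is cycle_count / clock_hz, so
    // bytes / seconds simplifies to bytes * clock_hz / cycle_count.
    return static_cast<double>(bytes_transferred) * clock_hz /
           static_cast<double>(cycle_count);
}
```

For instance, with 256 × 4 = 1024 bytes and an illustrative reading of 2048 cycles at 1 GHz, the helper evaluates 1024 / (2048 × 1e-9), which is 5e8 bytes per second.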