The main function contains the following sections, which are marked accordingly in the source:
Environment / Usage Check
Common Parameters:
numBuffers: Not expected to be modified. This parameter is used to determine how many kernel invocations are performed.
oooQueue: This boolean value is used to declare the kind of OpenCL event queue that is generated inside the ApiHandle.
processDelay: This parameter can be used to artificially delay the computation time required by the kernel. This parameter is not used in this version of the tutorial.
bufferSize: This parameter is used to declare the number of 512-bit values to be transferred per kernel invocation.
softwarePipelineInterval: This parameter is used to determine how many operations can be pre-scheduled before synchronization occurs.
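To make the role of softwarePipelineInterval more concrete, the following is a minimal sketch of one possible reading of it: a sliding window that lets at most softwarePipelineInterval operations be scheduled ahead before the host waits for the oldest one. The function name maxInFlight and the deque-based bookkeeping are hypothetical and are not taken from the tutorial source; no OpenCL calls are made, and the "wait" is only simulated.

```cpp
#include <cstddef>
#include <deque>

// Hypothetical simulation, not the tutorial's host code.
// Schedules numBuffers operations while allowing at most
// softwarePipelineInterval of them to be outstanding at once.
// Returns the peak number of operations that were in flight.
// An interval of 0 behaves like an interval of 1 in this sketch.
std::size_t maxInFlight(unsigned int numBuffers,
                        unsigned int softwarePipelineInterval) {
    std::deque<unsigned int> inFlight;  // ids of scheduled, unfinished operations
    std::size_t peak = 0;

    for (unsigned int i = 0; i < numBuffers; ++i) {
        // Synchronization point: once the window is full, the host would
        // wait for the oldest operation before scheduling another one.
        if (!inFlight.empty() && inFlight.size() >= softwarePipelineInterval) {
            inFlight.pop_front();  // stand-in for waiting on that operation
        }

        inFlight.push_back(i);  // stand-in for enqueueing operation i
        if (inFlight.size() > peak) {
            peak = inFlight.size();
        }
    }
    return peak;
}
```

Under these assumptions, a larger interval lets the host run further ahead of the accelerator, while an interval of 1 forces a synchronization before every new operation.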
Setup: To ensure that you are aware of the status of configuration variables, this section prints out the final configuration.
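As an illustration of what printing the final configuration could look like, here is a small sketch. The HostConfig struct, the describeConfig function, the field types, and the default values are all assumptions made for this example; only the parameter names come from the list above. The mapping of oooQueue == true to an out-of-order queue is likewise assumed from the parameter name.

```cpp
#include <sstream>
#include <string>

// Hypothetical container for the common parameters.
// Types and default values are placeholders, not the tutorial's values.
struct HostConfig {
    unsigned int numBuffers = 100;               // number of kernel invocations
    bool oooQueue = true;                        // assumed: true selects an out-of-order queue
    unsigned int processDelay = 1;               // artificial kernel delay (unused in this version)
    unsigned int bufferSize = 1u << 14;          // 512-bit values per kernel invocation
    unsigned int softwarePipelineInterval = 20;  // operations pre-scheduled before synchronizing
};

// Formats the final configuration as text so it can be printed before execution.
std::string describeConfig(const HostConfig& cfg) {
    std::ostringstream out;
    out << "numBuffers: " << cfg.numBuffers << "\n"
        << "oooQueue: " << (cfg.oooQueue ? "out-of-order" : "in-order") << "\n"
        << "processDelay: " << cfg.processDelay << "\n"
        << "bufferSize: " << cfg.bufferSize << " x 512 bit\n"
        << "softwarePipelineInterval: " << cfg.softwarePipelineInterval << "\n";
    return out.str();
}
```

Returning a string rather than writing directly to the console keeps the formatting easy to check; a caller could print it with `std::cout << describeConfig(cfg);`.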
Execution: In this section, you can model several different host code performance issues. These are the lines you will focus on for this tutorial.
Testing: After execution has completed, this section performs a simple check on the output.
Performance Statistics: If the model is run on an actual accelerator card (not emulated), the host code will calculate and print the performance statistics based on system time measurements.
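The exact statistics reported by the host code are not listed here, so the following is only a sketch of how a throughput figure might be derived from system time measurements. The helper names totalBytes, throughputMBps, and timeIt are hypothetical; the only fact taken from the text above is that each transferred value is 512 bits (64 bytes) wide.

```cpp
#include <chrono>
#include <cstdint>

// Each transferred value is 512 bits wide, that is, 64 bytes.
constexpr std::uint64_t kBytesPerValue = 512 / 8;

// Total payload moved by numBuffers kernel invocations of bufferSize values each.
std::uint64_t totalBytes(unsigned int numBuffers, unsigned int bufferSize) {
    return static_cast<std::uint64_t>(numBuffers) * bufferSize * kBytesPerValue;
}

// Throughput in megabytes (10^6 bytes) per second for a measured wall-clock time.
double throughputMBps(std::uint64_t bytes, std::chrono::duration<double> elapsed) {
    return (static_cast<double>(bytes) / 1.0e6) / elapsed.count();
}

// Measures how long a callable takes using a monotonic system clock.
template <typename Work>
std::chrono::duration<double> timeIt(Work&& work) {
    const auto start = std::chrono::steady_clock::now();
    work();
    return std::chrono::steady_clock::now() - start;
}
```

In this sketch, a caller would wrap the execution section in `timeIt` and pass the result, together with `totalBytes(numBuffers, bufferSize)`, to `throughputMBps`. Such a figure is only meaningful on real hardware, which matches the restriction to non-emulated runs stated above.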
NOTE: The setup, as well as the other sections, can print additional messages recording the system status, as well as the overall PASS or FAIL of the run.