VSC provides a simple template for developing application code and managing software calls to the accelerator hardware. Regardless of the hardware composition, this template provides a unified style for creating a C++ software API.
auto arg1BP = myACC::create_bufpool( .... ); // (1)
....
myACC::send_while([= .... ]() // (2)
{
    int* arg1 = myACC::alloc_buf<int>(arg1BP, .... ); // (4)
    ....
    myACC::compute( arg1, .... ); // (5)
    ....
    return ( while_cond ); // (3)
});
myACC::receive_all_in_order([= .... ]() // (6)
{
    ....
});
myACC::join(); // (7)
The template shown in the pseudo-code above has the following parts, listed in the order of the numbered comments. Refer to VPP_ACC Class API for details of these elements.
- create_bufpool() - Creates a buffer pool for each pointer argument passed to the accelerator and specifies the kind of argument data (for example: input, output, or remote).
- send_while() - A thread that controls the overall scheduling of jobs on the accelerator, providing the data for each job through a lambda function.
- return ( while_cond ); - The body of the lambda function executes in a loop and must return a Boolean value. The return statement in the send_while body keeps the loop running as long as the returned value is true, and stops the loop and exits the send_while thread when the returned value is false. For example, the return statement could be return (++sent_value < MAX_SEND); if sent_value was set to 0 before declaring and using the lambda function.
- alloc_buf() - Allocates a memory buffer object from the buffer pool for the current loop iteration.
- compute() - The software call that schedules one job execution of the compute() function on the accelerator hardware.
- receive_all_in_order() - A thread that waits on the results from the scheduled jobs. This is another user-defined lambda function and executes in a loop as long as the send_while() thread runs.
- join() - Waits on the completion of the send and receive threads.
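The control flow described in the list above can be tried without accelerator hardware using a small host-only mock. The sketch below is an illustration under stated assumptions, not the VPP_ACC implementation: MockAcc, its member signatures, and run_demo are hypothetical names, buffer pools are omitted, and each "job" simply squares an integer on the CPU. It uses instance methods rather than the static myACC:: calls of the template to keep the mock short.

```cpp
#include <cassert>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

// MockAcc is a hypothetical, host-only stand-in. It is NOT the VPP_ACC API;
// it only mimics the send/receive/join control flow with std::thread.
class MockAcc {
public:
    // Start the send thread: call body() repeatedly until it returns false.
    void send_while(std::function<bool()> body) {
        sender_ = std::thread([this, body] {
            while (body()) {}
            {
                std::lock_guard<std::mutex> lock(mtx_);
                send_done_ = true;
            }
            cv_.notify_all();
        });
    }

    // "Schedule" one job. Assumption: the job just squares its input.
    void compute(int in) {
        {
            std::lock_guard<std::mutex> lock(mtx_);
            results_.push(in * in);
        }
        cv_.notify_all();
    }

    // Start the receive thread: call body(result) once per job, in send order.
    void receive_all_in_order(std::function<void(int)> body) {
        receiver_ = std::thread([this, body] {
            std::unique_lock<std::mutex> lock(mtx_);
            while (true) {
                cv_.wait(lock, [this] { return !results_.empty() || send_done_; });
                if (results_.empty()) break;  // sender finished, nothing pending
                int r = results_.front();
                results_.pop();
                lock.unlock();                // run the user body without the lock
                body(r);
                lock.lock();
            }
        });
    }

    // Wait for both the send and the receive thread to finish.
    void join() {
        assert(sender_.joinable() && receiver_.joinable());
        sender_.join();
        receiver_.join();
    }

private:
    std::mutex mtx_;
    std::condition_variable cv_;
    std::queue<int> results_;
    bool send_done_ = false;
    std::thread sender_, receiver_;
};

// Runs the template with max_send jobs and returns the results as received.
std::vector<int> run_demo(int max_send) {
    MockAcc acc;
    std::vector<int> received;
    int sent_value = 0;  // set to 0 before the send lambda is declared

    acc.send_while([&]() -> bool {
        acc.compute(sent_value);           // job input is the loop index
        return (++sent_value < max_send);  // false ends the send loop
    });
    acc.receive_all_in_order([&](int result) {
        received.push_back(result);
    });
    acc.join();
    return received;
}
```

In this mock, run_demo(4) schedules four jobs with inputs 0 through 3 and returns their squares in send order. Because the condition is evaluated at the end of the body, the send loop always runs at least once, and with sent_value starting at 0 the body runs exactly max_send times when max_send is 1 or greater.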
The following sections provide additional details for creating the application code.