VSC provides a simple template for developing application code and managing software calls to the accelerator hardware. Regardless of the hardware composition, this template provides a unified style for creating a C++ software API.
auto arg1BP = myACC::create_bufpool( .... );              // (1)
....
myACC::send_while([= .... ]()                             // (2)
{
    int* arg1 = myACC::alloc_buf<int>(arg1BP, .... );     // (4)
    ....
    myACC::compute( arg1, .... );                         // (5)
    ....
    return ( while_cond );                                // (3)
});
myACC::receive_all_in_order([= .... ]()                   // (6)
{
    ....
});
myACC::join();                                            // (7)
The template shown in the pseudo-code above has the following parts. Refer to VPP_ACC Class API for details of these elements.
create_bufpool()
- Creates a buffer pool for each pointer argument passed to the accelerator and provides the specification of the argument data (for example: input, output, and remote).
send_while()
- A thread to control the overall scheduling of jobs on the accelerator, providing data for each job by using a lambda function.
return ( while_cond );
- The body of the lambda function executes in a loop and must return a Boolean value. The return statement in the send_while body allows the user to continue running the loop as long as the value returned is true, and to stop the loop and exit the send_while thread when the value returned is false. A return statement could be (++sent_value < MAX_SEND) if sent_value was set to 0 before declaring and using the lambda function.
alloc_buf()
- Allocation of a memory buffer object from the buffer pool for the current loop iteration.
compute()
- The software call to schedule one job execution of the compute() function on the accelerator hardware.
receive_all_in_order()
- A thread that waits on the results from the scheduled jobs. This is another user-defined lambda function and executes in a loop as long as the send_while() thread runs.
join()
- Waits on the completion of the send and receive threads.
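The control flow described by these parts can be tried out on a host machine without any accelerator. The following is a minimal sketch under stated assumptions, not the VSC implementation: MockAcc, run_demo, and the doubling "job" are hypothetical names invented for illustration, the methods are non-static, the buffer pool is omitted, and the receive lambda is handed its result directly instead of fetching a buffer. It also captures by reference so that ++sent_value compiles, whereas the template above uses [= ....]. It only shows how a send loop that stops on a false return, a receive loop, and a final join fit together.

```cpp
#include <cassert>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

// Hypothetical host-only stand-in for an accelerator class. This is NOT the
// VSC-generated API: no hardware and no buffer pool, only the thread pattern.
class MockAcc {
public:
    // Start the send thread: call body repeatedly until it returns false.
    void send_while(std::function<bool()> body) {
        sender_ = std::thread([this, body] {
            while (body()) {
            }
            {
                std::lock_guard<std::mutex> lk(m_);
                done_sending_ = true;
            }
            cv_.notify_all();
        });
    }

    // Stand-in for compute(): "runs" one job (doubles the input) and queues
    // the result for the receive thread.
    void compute(int input) {
        {
            std::lock_guard<std::mutex> lk(m_);
            results_.push(input * 2);
        }
        cv_.notify_all();
    }

    // Start the receive thread: call body once per finished job, in the
    // order the jobs were scheduled, until sending is done and the queue
    // is empty.
    void receive_all_in_order(std::function<void(int)> body) {
        receiver_ = std::thread([this, body] {
            for (;;) {
                std::unique_lock<std::mutex> lk(m_);
                cv_.wait(lk, [this] { return !results_.empty() || done_sending_; });
                if (results_.empty()) return;  // sender finished, nothing left
                int r = results_.front();
                results_.pop();
                lk.unlock();
                body(r);
            }
        });
    }

    // Wait for both the send and the receive thread to finish.
    void join() {
        sender_.join();
        receiver_.join();
    }

private:
    std::mutex m_;
    std::condition_variable cv_;
    std::queue<int> results_;
    bool done_sending_ = false;
    std::thread sender_;
    std::thread receiver_;
};

// Schedules max_send jobs and returns the results in the order received.
std::vector<int> run_demo(int max_send) {
    assert(max_send > 0);
    MockAcc acc;
    std::vector<int> received;
    int sent_value = 0;  // set to 0 before the lambda is declared

    acc.send_while([&]() -> bool {
        acc.compute(sent_value);         // schedule one job per iteration
        return ++sent_value < max_send;  // true: loop again, false: stop
    });
    acc.receive_all_in_order([&](int result) { received.push_back(result); });
    acc.join();
    return received;
}
```

Calling run_demo(4) in this sketch runs the send body four times: the condition ++sent_value < max_send is true for the values 1, 2, and 3 and false once sent_value reaches 4, so the loop body executes exactly max_send times before the send thread exits.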
The following sections provide additional details for creating the application code.