The AOCL libraries may be used to perform lengthy computations (for example, matrix multiplications and solvers involving large matrices). Such computations can run for hours.
The AOCL Progress feature provides a mechanism for the application to check the progress of a computation. The AOCL libraries (AOCL-BLAS and AOCL-LAPACK) periodically update the application with the progress made through a callback function.
Usage
The application must define the callback function in a specific format and register it with the AOCL library.
Callback Definition
The callback function prototype must be defined as follows:
dim_t AOCL_progress(const char* const api, const dim_t lapi, const dim_t progress,
const dim_t current_thread, const dim_t total_threads)
The function name is arbitrary; you can choose any name as long as the signature matches.
The following table explains the parameters passed to the callback function:
| Parameter | Purpose |
|---|---|
| api | Name of the API currently running |
| lapi | Length of the API name string (*api) |
| progress | Linear progress made so far in the current thread |
| current_thread | Current thread ID |
| total_threads | Total number of threads used to perform the operation |
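As an illustration of how the parameters in the table fit together, the following is a minimal sketch of a callback that formats each report into a line of text. The names `format_progress` and `my_progress_cb` are hypothetical, and `dim_t` is declared locally as a 64-bit integer only so that the sketch compiles on its own; in a real build the type comes from the library header. The `lapi` argument is used as a precision so that at most that many characters of `api` are read.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Stand-in for the library's dim_t so the sketch compiles on its own
   (assumption; use the library header instead in a real build). */
typedef int64_t dim_t;

/* Hypothetical helper: format one progress report into buf.
   Returns the number of characters snprintf would have written. */
int format_progress(char *buf, size_t size,
                    const char *api, dim_t lapi, dim_t progress,
                    dim_t current_thread, dim_t total_threads)
{
    assert(buf != NULL && api != NULL);
    /* "%.*s" reads at most lapi characters of the API name. */
    return snprintf(buf, size, "%.*s: thread %lld of %lld, progress %lld",
                    (int)lapi, api,
                    (long long)current_thread, (long long)total_threads,
                    (long long)progress);
}

/* Callback with the documented prototype; the name is arbitrary. */
dim_t my_progress_cb(const char* const api, const dim_t lapi,
                     const dim_t progress, const dim_t current_thread,
                     const dim_t total_threads)
{
    char line[128];
    format_progress(line, sizeof line, api, lapi, progress,
                    current_thread, total_threads);
    puts(line);
    return 0;
}
```

Keeping the formatting in a separate helper makes it possible to exercise the logic without running a long computation.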
Callback Registration
The callback function must be registered with the library for reporting the progress. Each library has its own callback registration function. The registration can be done by calling:
AOCL_BLIS_set_progress(AOCL_progress); // for AOCL-BLAS
Example
The library only invokes the callback function at appropriate intervals; it is up to the user to consume this information appropriately. The following example shows how to print the progress to standard output:
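#include <stdio.h>
#include "blis.h"   /* AOCL-BLAS header, provides dim_t */

/* Print the progress reported by the library for each thread. */
dim_t AOCL_progress(const char* const api, const dim_t lapi,
                    const dim_t progress, const dim_t current_thread,
                    const dim_t total_threads)
{
    /* dim_t values are cast to long long to match the %lld format. */
    printf("\n%s, total thread = %lld, processed %lld element by thread %lld.",
           api, (long long)total_threads, (long long)progress,
           (long long)current_thread);
    return 0;
}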
Register the callback with:
AOCL_BLIS_set_progress(AOCL_progress); // for AOCL-BLAS
The result is displayed in the following format (output truncated):
$ BLIS_NUM_THREADS=5 ./test_gemm_blis.x
dgemm, total thread = 5, processed 11796480 element by thread 4.
dgemm, total thread = 5, processed 17694720 element by thread 0.
dgemm, total thread = 5, processed 5898240 element by thread 2.
dgemm, total thread = 5, processed 20643840 element by thread 0.
dgemm, total thread = 5, processed 14745600 element by thread 3.
dgemm, total thread = 5, processed 14745600 element by thread 4.
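Because each report covers a single thread, an application that wants one overall figure has to combine the reports itself. The sketch below keeps the latest value reported by each thread and sums them on request. It assumes that `progress` is cumulative per thread, which the sample output above suggests (thread 0 reports 17694720 and later 20643840) but which is not stated explicitly. `MAX_THREADS`, `aggregate_progress_cb`, and `total_progress` are hypothetical names, and `dim_t` is again a local stand-in for the library type. C11 atomics are used because the callback may be invoked from several threads.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Stand-in for the library's dim_t (assumption for this sketch). */
typedef int64_t dim_t;

/* Hypothetical upper bound on the number of threads tracked. */
#define MAX_THREADS 256

/* Latest progress value reported by each thread. */
static _Atomic long long latest[MAX_THREADS];

/* Callback with the documented prototype: remember the most recent
   value for the reporting thread. */
dim_t aggregate_progress_cb(const char* const api, const dim_t lapi,
                            const dim_t progress, const dim_t current_thread,
                            const dim_t total_threads)
{
    (void)api; (void)lapi; (void)total_threads;
    if (current_thread >= 0 && current_thread < MAX_THREADS)
        atomic_store(&latest[current_thread], (long long)progress);
    return 0;
}

/* Sum the latest per-thread values; callable from the application. */
long long total_progress(dim_t total_threads)
{
    assert(total_threads >= 0);
    long long sum = 0;
    for (dim_t t = 0; t < total_threads && t < MAX_THREADS; ++t)
        sum += atomic_load(&latest[t]);
    return sum;
}
```

A thread that has not reported yet contributes zero, so the sum is a lower bound on the work done so far rather than an exact count.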
Limitations
The feature only shows whether the operation is progressing; it does not provide an estimate or percentage completion status.
A separate callback must be registered for AOCL-BLAS, AOCL-LAPACK, and AOCL-ScaLAPACK.
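Since the callback only indicates that work is still being done, one way to use it is as a heartbeat: record when the last report arrived and let the application decide whether the computation appears stalled. The following is a minimal sketch under that assumption. The names `record_report`, `seems_stalled`, and `heartbeat_cb` and the timeout value are hypothetical, `dim_t` is a local stand-in for the library type, and a suitable timeout depends on how often the library reports, which this document does not specify.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <time.h>

/* Stand-in for the library's dim_t (assumption for this sketch). */
typedef int64_t dim_t;

/* Time of the most recent report, in seconds since the epoch. */
static _Atomic long long last_report;

/* Record that a report arrived at time `now`. */
void record_report(time_t now)
{
    atomic_store(&last_report, (long long)now);
}

/* Return nonzero if no report arrived within timeout_s seconds of `now`. */
int seems_stalled(time_t now, long long timeout_s)
{
    assert(timeout_s > 0);
    return (long long)now - atomic_load(&last_report) > timeout_s;
}

/* Callback with the documented prototype: every invocation counts as
   a sign of life, regardless of which thread reported. */
dim_t heartbeat_cb(const char* const api, const dim_t lapi,
                   const dim_t progress, const dim_t current_thread,
                   const dim_t total_threads)
{
    (void)api; (void)lapi; (void)progress;
    (void)current_thread; (void)total_threads;
    record_report(time(NULL));
    return 0;
}
```

The application would call `record_report(time(NULL))` once before starting the computation, then poll `seems_stalled(time(NULL), timeout)` from a monitoring thread.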