All the basic C++ data types are supported for compute() arguments: bool, char, int, short, long, float, double.
These C++ type modifiers are supported as well: signed, unsigned, short and long.
In addition, the arbitrary precision data types ap_int<N> and ap_uint<N> are supported, as described in Arbitrary Precision (AP) Data Types in the Vitis HLS tool.
ap_int and ap_uint are pre-defined data types in Vitis HLS, where N is an integer with a maximum of 1024.
This allows finer bit-width control, especially to handle data packing or to match a global memory data bus width.
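To make the data packing idea concrete, here is a minimal sketch in plain C++. It assumes a 32-bit word and uses std::uint32_t as a stand-in, because ap_uint<N> needs the Vitis HLS headers. The helper names pack4 and unpack are hypothetical and are not part of any Vitis API.

```cpp
#include <cstdint>

// Plain C++ stand-in for a wide packed word (hypothetical helpers).
// Four 8-bit values are placed side by side in one 32-bit word,
// with b0 in the least significant byte.
constexpr std::uint32_t pack4(std::uint8_t b0, std::uint8_t b1,
                              std::uint8_t b2, std::uint8_t b3) {
    return  static_cast<std::uint32_t>(b0)
         | (static_cast<std::uint32_t>(b1) << 8)
         | (static_cast<std::uint32_t>(b2) << 16)
         | (static_cast<std::uint32_t>(b3) << 24);
}

// Extract the byte at position index (0 = least significant byte).
constexpr std::uint8_t unpack(std::uint32_t word, unsigned index) {
    return static_cast<std::uint8_t>((word >> (8u * index)) & 0xFFu);
}
```

With an arbitrary precision type the same idea extends to widths that are not native C++ sizes, up to the limit of N stated above.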
The compute() interface arguments can be classified as Scalar type or Buffer type arguments, as described below.
Scalar Type
A scalar argument is a basic data type that represents a single word passed to the compute() call.
VSC infers a scalar argument when any of the basic data types described earlier is used as a pass-by-value argument of compute().
For example, size and value are scalars here:
compute(int size, float value);
A user-defined C-struct can be used as a pass-by-value scalar argument, with the following rules:
- The struct cannot have a bit-field, for example: int field_x:4;
- The struct cannot have a pointer field, for example: int* ptr_field;
- The total byte size of the struct must be a power of two. If it is not, declare a dummy field to fill in the remaining bytes (e.g. char pad[remainder]).
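The padding rule can be checked at compile time. The sketch below is only illustrative: the struct names and fields are hypothetical, and the sizes in the comments assume a typical ABI with 4-byte int and float and a 2-byte short.

```cpp
#include <cstddef>

// Hypothetical payload: 4 + 4 + 2 = 10 bytes of fields. Typical ABIs
// round this up to 12 bytes, which is not a power of two.
struct sample_raw {
    int   id;
    float gain;
    short mode;
};

// Same fields plus an explicit dummy field that brings the size to 16.
struct sample_padded {
    int   id;
    float gain;
    short mode;
    char  pad[6];  // filler only, carries no data
};

// True when n is a nonzero power of two.
constexpr bool is_power_of_two(std::size_t n) {
    return n != 0 && (n & (n - 1)) == 0;
}

// Fails the build if a later edit changes the struct size.
static_assert(is_power_of_two(sizeof(sample_padded)),
              "pad the struct so its size is a power of two");
```

A static_assert next to the struct definition means a field added later cannot silently break the size rule.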
compute(int size, float& result_value) // reference argument is not allowed
However, a C++ pointer type can be used in place of a reference when a PE function argument is still a reference. For example:
my_acc::compute(int inp, data_t *out_p) { // reference not allowed
my_PE(inp, *out_p); // deref the ptr for my_PE out_ref
my_PE_two(inp ... ); // passing inp to multiple PEs is allowed
}
my_acc::my_PE(int in, data_t &out_ref) { // reference is allowed
...
// application code
my_acc::send_while ... { ...
out_p = my_acc::alloc_buf(bp, 1); // required
my_acc::compute(inp, out_p);
VSC will infer a scalar for the output argument out_p when all these conditions apply:
- It must point to only one element. Important: The application code must allocate memory of size 1 using alloc_buf() as described in VPP_ACC Class API.
- The kernel code writes exactly one value to it, for example: *out_p = result;
- No VSC guidance macros are specified on this argument.
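The pointer-in-place-of-reference pattern can be tried on the host without any VSC classes. The sketch below is a mock under stated assumptions: free functions stand in for the accelerator methods, data_t is assumed to be int, and the doubling inside my_PE is a placeholder computation. In real VSC code the one-element buffer would come from alloc_buf().

```cpp
// Host-only mock: plain functions stand in for the accelerator methods.
using data_t = int;  // hypothetical element type

// PE function: a reference argument is allowed here.
void my_PE(int in, data_t& out_ref) {
    out_ref = in * 2;  // placeholder computation, writes exactly one value
}

// compute() stand-in: takes a pointer because a reference is not allowed.
void compute(int inp, data_t* out_p) {
    my_PE(inp, *out_p);  // dereference the pointer for the PE's reference
}
```

Calling compute(21, &result) on a local data_t result leaves 42 in it, which mirrors the single write that lets VSC treat the output as a scalar.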
A scalar argument of compute() can be passed to multiple PEs. In hardware, a single AXI4-Lite port will drive the individual scalar registers of every PE function that takes this scalar argument.
Buffer Type
A buffer is a pointer or array argument to the compute() function, which can hold one or more elements.
The contents of the buffer are transferred to or from the device based on the corresponding platform interface (SYS_PORT connection).
The base (element) type of the pointer or array argument can be any of the basic data types described in the earlier section.
In addition, a user-defined C-struct data type can also be used as a base element type.
The application code must allocate memory for a buffer using alloc_buf, described in VPP_ACC Class API.
When using a C-struct as the base type for a pointer or an array, the following rules apply:
- The struct cannot have a bit-field, for example: int field_x:4;
- The struct cannot have a pointer field, for example: int* ptr_field;
The keywords const, register, and volatile are not allowed at the compute() interface.
The following example shows coding styles, both allowed and not allowed, for buffer type arguments:
// accelerator interface
my_acc::compute(int A, int *B, int C[10]) {
...
// application code
int S, *BB, *CC;
my_acc::send_while ... { ...
BB = my_acc::alloc_buf( ... ); // required
CC = my_acc::alloc_buf( ... ); // required
compute(S, BB, CC);
In the example above, the compute() arguments B and C are treated the same way in the application code, so memory must be allocated for both of these buffer arguments.
A buffer argument of compute() can only be passed to a single PE function within it.
In the example, A is a scalar input argument. The caller only passes an integer argument to the compute() call, and the value is transferred to the device using AXI4-Lite.
B and C, by contrast, are buffer arguments. They can be input, output, bi-directional, or remote. The caller has to allocate memory for each buffer using alloc_buf().