Description
Vitis HLS implements the operations in the code using specific implementations. The BIND_OP pragma specifies that for a specific variable, an operation (mul, add, div) should be mapped to a specific device resource for implementation (impl) in the RTL. If the BIND_OP pragma is not specified, Vitis HLS automatically determines the resources to use for operations.
For example, to indicate that a specific multiplier operation (mul) is implemented in the device fabric rather than a DSP, you can use the BIND_OP pragma.
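The following is a minimal sketch of that case, not a definitive implementation. The function name mul_in_fabric and the variable prod are hypothetical and chosen only for illustration. A standard C compiler ignores the HLS pragma, so the function can still be checked in software.

```c
/* Hypothetical helper: the name mul_in_fabric is an assumption for illustration. */
int mul_in_fabric(int a, int b) {
    int prod;
    /* Request a fabric (LUT-based) multiplier instead of a DSP block
       for the multiplication whose result is assigned to prod.
       No latency is given, so the tool is left to choose it. */
    #pragma HLS BIND_OP variable=prod op=mul impl=fabric
    prod = a * b;
    return prod;
}
```

Replacing impl=fabric with impl=dsp would request a DSP-based multiplier instead; the C-level result is the same either way.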
You can also specify the latency of the operation using the latency option.
To use the latency option, the operation must have an available multi-stage implementation. The HLS tool provides a multi-stage implementation for all basic arithmetic operations (add, subtract, multiply, and divide), and all floating-point operations.
Syntax
Place the pragma in the C source within the body of the function where the variable is defined.
#pragma HLS bind_op variable=<variable> op=<type>\
impl=<value> latency=<int>
Where:
- variable=<variable>: Defines the variable to assign the BIND_OP pragma to. The variable in this case is one that is assigned the result of the operation that is the target of this pragma.
- op=<type>: Defines the operation to bind to a specific implementation resource. Supported functional operations include: mul, add, and sub.
- impl=<value>: Defines the implementation to use for the specified operation.
- latency=<int>: Defines the default latency for the implementation of the operation. The valid latency varies according to the specified op and impl. The default is -1, which lets Vitis HLS choose the latency.
Operation | Implementation | Min Latency | Max Latency |
---|---|---|---|
add | fabric | 0 | 4 |
add | dsp | 0 | 4 |
mul | fabric | 0 | 4 |
mul | dsp | 0 | 4 |
sub | fabric | 0 | 4 |
sub | dsp | 0 | 0 |
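
As a sketch of how two of the combinations in the table above might be used together, the following function binds a multiplication to a DSP and an addition to fabric logic, each with a latency inside the listed 0 to 4 range. The function name mac and the variables prod and sum are hypothetical, and the chosen latencies are illustrative rather than recommended values.

```c
/* Hypothetical multiply-accumulate helper; names are assumptions for illustration. */
int mac(int a, int b, int acc) {
    int prod, sum;
    /* Multiply in a DSP with a latency of 2 (table range for mul/dsp: 0 to 4). */
    #pragma HLS BIND_OP variable=prod op=mul impl=dsp latency=2
    /* Add in fabric logic with a latency of 1 (table range for add/fabric: 0 to 4). */
    #pragma HLS BIND_OP variable=sum op=add impl=fabric latency=1
    prod = a * b;
    sum = prod + acc;
    return sum;
}
```

Each pragma names the variable that receives the result of the operation it targets, so prod and sum need separate pragmas.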
Comparison operators, such as dcmp, are implemented in LUTs and cannot be implemented outside of the fabric, or mapped to DSPs, and so are not configurable with the config_op or bind_op commands.
Operation | Implementation | Min Latency | Max Latency |
---|---|---|---|
fadd | fabric | 0 | 13 |
fadd | fulldsp | 0 | 12 |
fadd | primitivedsp | 0 | 3 |
fsub | fabric | 0 | 13 |
fsub | fulldsp | 0 | 12 |
fsub | primitivedsp | 0 | 3 |
fdiv | fabric | 0 | 29 |
fexp | fabric | 0 | 24 |
fexp | meddsp | 0 | 21 |
fexp | fulldsp | 0 | 30 |
flog | fabric | 0 | 24 |
flog | meddsp | 0 | 23 |
flog | fulldsp | 0 | 29 |
fmul | fabric | 0 | 9 |
fmul | meddsp | 0 | 9 |
fmul | fulldsp | 0 | 9 |
fmul | maxdsp | 0 | 7 |
fmul | primitivedsp | 0 | 4 |
fsqrt | fabric | 0 | 29 |
frsqrt | fabric | 0 | 38 |
frsqrt | fulldsp | 0 | 33 |
frecip | fabric | 0 | 37 |
frecip | fulldsp | 0 | 30 |
dadd | fabric | 0 | 13 |
dadd | fulldsp | 0 | 15 |
dsub | fabric | 0 | 13 |
dsub | fulldsp | 0 | 15 |
ddiv | fabric | 0 | 58 |
dexp | fabric | 0 | 40 |
dexp | meddsp | 0 | 45 |
dexp | fulldsp | 0 | 57 |
dlog | fabric | 0 | 38 |
dlog | meddsp | 0 | 49 |
dlog | fulldsp | 0 | 65 |
dmul | fabric | 0 | 10 |
dmul | meddsp | 0 | 13 |
dmul | fulldsp | 0 | 13 |
dmul | maxdsp | 0 | 14 |
dsqrt | fabric | 0 | 58 |
drsqrt | fulldsp | 0 | 111 |
drecip | fulldsp | 0 | 36 |
hadd | fabric | 0 | 9 |
hadd | meddsp | 0 | 12 |
hadd | fulldsp | 0 | 12 |
hsub | fabric | 0 | 9 |
hsub | meddsp | 0 | 12 |
hsub | fulldsp | 0 | 12 |
hdiv | fabric | 0 | 16 |
hmul | fabric | 0 | 7 |
hmul | fulldsp | 0 | 7 |
hmul | maxdsp | 0 | 9 |
hsqrt | fabric | 0 | 16 |
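
The floating-point combinations in the table above can be requested in the same way. The sketch below assumes that fmul and fadd refer to single-precision multiply and add; the function name scale_and_offset, its variables, and the chosen latencies are hypothetical and only need to fall within the listed ranges.

```c
/* Hypothetical helper; names and latency values are assumptions for illustration. */
float scale_and_offset(float x, float gain, float offset) {
    float scaled, result;
    /* Single-precision multiply using the maxdsp implementation
       (table range for fmul/maxdsp: 0 to 7). */
    #pragma HLS BIND_OP variable=scaled op=fmul impl=maxdsp latency=4
    /* Single-precision add using the fulldsp implementation
       (table range for fadd/fulldsp: 0 to 12). */
    #pragma HLS BIND_OP variable=result op=fadd impl=fulldsp latency=6
    scaled = x * gain;
    result = scaled + offset;
    return result;
}
```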
Example
In the following example, a two-stage pipelined multiplier using fabric logic is specified to implement the multiplication for variable c of the function foo.
int foo (int a, int b) {
    int c, d;
    #pragma HLS BIND_OP variable=c op=mul impl=fabric latency=2
    c = a*b;
    d = a*c;
    return d;
}
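No BIND_OP pragma is specified for d, so Vitis HLS automatically determines the resources to use for the multiplication assigned to d.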