config_op - 2022.2 English

Vitis High-Level Synthesis User Guide (UG1399)

Document ID

UG1399

Release Date

2022-12-07

Version

2022.2 English

Description

Sets the default options for micro-architecture binding of an operator (add, mul, sub, and so on) to an FPGA implementation resource, and specifies its latency.

Binding is the process in which operators (such as addition, multiplication, and shift) are mapped to specific RTL implementations. For example, a multiplication (mul) operation can be implemented as a combinational or a pipelined RTL multiplier.

This command can be used multiple times to configure the default binding of different operation types to different implementation resources, or to specify the default latency for an operation type. The default configuration defined by config_op can be overridden by specifying the BIND_OP pragma or directive for a specific design element.
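As a minimal sketch of the paragraph above, the following Tcl fragment sets defaults for several operation types and then overrides one of them for a single variable. It assumes an open Vitis HLS solution. The function name func and the variable result are hypothetical placeholders, and the set_directive_bind_op argument order shown here should be checked against that command's own reference page.

```tcl
# Solution-wide defaults: one config_op call per operation type.
config_op mul -impl dsp
config_op add -impl fabric -latency 2

# Override the mul default for one design element.
# "func" and "result" are hypothetical names used only for illustration.
set_directive_bind_op -op mul -impl fabric "func" result
```

Operations that have no override keep the defaults set by config_op.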

Syntax

config_op [OPTIONS] <op>

<op>

Specifies the type of operation to configure. Supported values include:

mul: integer multiplication operation
add: integer add operation
sub: integer subtraction operation
fadd: single precision floating-point add operation
fsub: single precision floating-point subtraction operation
fdiv: single precision floating-point divide operation
fexp: single precision floating-point exponential operation
flog: single precision floating-point logarithmic operation
fmul: single precision floating-point multiplication operation
frsqrt: single precision floating-point reciprocal square root operation
frecip: single precision floating-point reciprocal operation
fsqrt: single precision floating-point square root operation
dadd: double precision floating-point add operation
dsub: double precision floating-point subtraction operation
ddiv: double precision floating-point divide operation
dexp: double precision floating-point exponential operation
dlog: double precision floating-point logarithmic operation
dmul: double precision floating-point multiplication operation
drsqrt: double precision floating-point reciprocal square root operation
drecip: double precision floating-point reciprocal operation
dsqrt: double precision floating-point square root operation
hadd: half precision floating-point add operation
hsub: half precision floating-point subtraction operation
hdiv: half precision floating-point divide operation
hmul: half precision floating-point multiplication operation
hsqrt: half precision floating-point square root operation
facc: single precision floating-point accumulate operation
fmacc: single precision floating-point multiply-accumulate operation
fmadd: single precision floating-point multiply-add operation

Tip: Comparison operators, such as dcmp, are implemented in LUTs and cannot be implemented outside of the fabric, or mapped to DSPs, and so are not configurable with the config_op or bind_op commands.

Options

-impl [all | dsp | fabric | meddsp | fulldsp | maxdsp | primitivedsp | auto | none]

Defines the default implementation style for the specified operation. The default is to let the tool choose which implementation to use. The selections include:

all: All implementations. This is the default setting.
dsp: Use DSP resources
fabric: Use non-DSP resources
meddsp: Floating-point IP with medium usage of DSP resources
fulldsp: Floating-point IP with full usage of DSP resources
maxdsp: Floating-point IP with maximum usage of DSP resources
primitivedsp: Floating-point IP with primitive usage of DSP resources
auto: enable inference of combined facc | fmacc | fmadd operators
none: disable inference of combined facc | fmacc | fmadd operators
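The implementation styles listed above might be applied as in the following sketch. The pairings of operation and implementation are illustrative assumptions. Not every style is available for every operation or target device, so confirm each combination against the supported-implementation tables for your release.

```tcl
# Integer multiply: keep it out of the DSP blocks.
config_op mul -impl fabric

# Single-precision multiply: floating-point IP with maximum DSP usage
# (assumed to be supported for fmul on the target device).
config_op fmul -impl maxdsp

# Disable inference of the combined multiply-accumulate operator.
config_op fmacc -impl none
```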

-latency <value>

Defines the default latency for the binding of the operation type to the implementation resource. The valid value range varies for each implementation (-impl) of the operation. The default is -1, which applies the standard latency for the implementation resource.

Tip: The latency can be specified for a specific operation without specifying the implementation detail. This leaves Vitis HLS to choose the implementation while managing the latency.
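A minimal sketch of the tip above: only the latency is given, so the tool remains free to pick the implementation. The value 3 is an illustrative assumption and must fall within the valid range of whichever implementation the tool selects.

```tcl
# Request a default latency of 3 cycles for integer multiplications,
# without constraining the implementation resource.
config_op mul -latency 3
```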

-precision [low | high | standard]

Applies to facc, fmacc, and fmadd operators. Specifies the precision for the given operator.

low: Use a low precision (60 bit and 100 bit integer) accumulation implementation when available. This option is only available on certain non-Versal devices, and may cause RTL/Co-Sim mismatches due to insufficient precision with respect to C++ simulation. However, it can always be pipelined with an II=1 without source code changes, though it uses approximately 3X the resources of standard precision floating point accumulation.
high: Use high precision (one extra bit) fused multiply-add implementation when available. This option is useful for high-precision applications and is very efficient on Versal devices, although it may cause RTL/Co-Sim mismatches due to the extra precision with respect to C++ simulation. It uses more resources than standard precision floating point accumulation.
standard: Standard precision floating-point accumulation and multiply-add is suitable for most uses of floating point, and is the default setting. It always uses a true floating-point accumulator that can be pipelined with II=1 on Versal devices, and with an II that is typically between 3 and 5 (depending on clock frequency and target device) on non-Versal devices.
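The precision settings above could be selected as in the following sketch, which assumes a Versal target for the high-precision case. As noted in the option descriptions, non-standard precision may cause RTL/Co-Sim mismatches with respect to C++ simulation.

```tcl
# Assumed Versal target: infer fused multiply-add with one extra bit of precision.
config_op fmadd -impl auto -precision high

# Keep the default, true floating-point accumulator for multiply-accumulate.
config_op fmacc -impl auto -precision standard
```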

Example 1

The following example binds the addition operation to the fabric, with the specified latency:

config_op add -impl fabric -latency 2

Example 2

The following example enables the floating-point accumulator with low precision to achieve II=1 on a non-Versal device:

config_op facc -impl auto -precision low