config_op - 2021.2 English

Vitis High-Level Synthesis User Guide (UG1399)

Document ID: UG1399
Release Date: 2021-12-15
Version: 2021.2 English

Description

Sets the default options for micro-architecture binding of an operator (add, mul, sub, and so on) to an FPGA implementation resource, and specifies its latency.

Binding is the process in which operators (such as addition, multiplication, and shift) are mapped to specific RTL implementations. For example, a mul operation can be implemented as either a combinational or a pipelined RTL multiplier.

This command can be used multiple times to configure the default binding of different operation types to different implementation resources, or to specify the default latency for an operation. The default configuration defined by config_op can be overridden for a specific design element by specifying the BIND_OP pragma or directive.
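As a minimal sketch of the paragraph above, the following Tcl fragment sets defaults for several operation types and then overrides one of them for a single variable. The function name foo and the variable name c are hypothetical placeholders, and the latency values are illustrative only; the valid range depends on the operation and implementation.

```tcl
# Defaults: each config_op call configures one operation type.
config_op mul -impl dsp -latency 3
config_op add -impl fabric

# Override for one design element (hypothetical function "foo", variable c):
# this multiplication uses fabric logic with a latency of 2 instead of the default.
set_directive_bind_op -op mul -impl fabric -latency 2 "foo" c
```

Operations not covered by a BIND_OP pragma or directive continue to use the defaults set by config_op.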

Syntax

config_op [OPTIONS] <op>
<op>
Specifies the type of operation to configure. Supported values include:
  • mul: integer multiplication operation

  • add: integer add operation

  • sub: integer subtraction operation

  • fadd: single precision floating-point add operation

  • fsub: single precision floating-point subtraction operation

  • fdiv: single precision floating-point divide operation

  • fexp: single precision floating-point exponential operation

  • flog: single precision floating-point logarithmic operation

  • fmul: single precision floating-point multiplication operation

  • frsqrt: single precision floating-point reciprocal square root operation

  • frecip: single precision floating-point reciprocal operation

  • fsqrt: single precision floating-point square root operation

  • dadd: double precision floating-point add operation

  • dsub: double precision floating-point subtraction operation

  • ddiv: double precision floating-point divide operation

  • dexp: double precision floating-point exponential operation

  • dlog: double precision floating-point logarithmic operation

  • dmul: double precision floating-point multiplication operation

  • drsqrt: double precision floating-point reciprocal square root operation

  • drecip: double precision floating-point reciprocal operation

  • dsqrt: double precision floating-point square root operation

  • hadd: half precision floating-point add operation

  • hsub: half precision floating-point subtraction operation

  • hdiv: half precision floating-point divide operation

  • hmul: half precision floating-point multiplication operation

  • hsqrt: half precision floating-point square root operation

  • facc: single precision floating-point accumulate operation

  • fmacc: single precision floating-point multiply-accumulate operation

  • fmadd: single precision floating-point multiply-add operation

Options

-impl [dsp | fabric | meddsp | fulldsp | maxdsp | primitivedsp | auto | none | all]
Defines the default implementation style for the specified operation. The default is to let the tool choose which implementation to use. The selections include:
  • all: All implementations. This is the default setting.

  • dsp: Use DSP resources

  • fabric: Use non-DSP resources

  • meddsp: Floating Point IP Medium Usage of DSP resources

  • fulldsp: Floating Point IP Full Usage of DSP resources

  • maxdsp: Floating Point IP Max Usage of DSP resources

  • primitivedsp: Floating Point IP Primitive Usage of DSP resources

  • auto: enable inference of combined facc | fmacc | fmadd operators

  • none: disable inference of combined facc | fmacc | fmadd operators
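As an illustrative sketch, the implementation styles above might be applied as follows. Which styles are valid depends on the operation and the target device, so the pairings shown here are assumptions to verify against the tool, not a definitive list.

```tcl
# Integer multiplication: use DSP resources by default.
config_op mul -impl dsp
# Single-precision multiplication: Floating Point IP with maximum DSP usage.
config_op fmul -impl maxdsp
# Disable inference of the combined multiply-accumulate operator.
config_op fmacc -impl none
```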

-latency <value>
Defines the default latency for the binding of the operation type to the implementation resource. The valid range of values varies for each implementation (-impl) of the operation. The default is -1, which applies the standard latency for the implementation resource.
Tip: The latency can be specified for a specific operation without specifying the implementation detail. This leaves Vitis HLS to choose the implementation while managing the latency.
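A short sketch of the tip above: only the latency is given, so the choice of implementation is left to the tool. The value 3 is an illustrative assumption, not a recommended setting.

```tcl
# Set a default latency for integer multiplication without specifying -impl;
# Vitis HLS chooses the implementation resource.
config_op mul -latency 3
```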
-precision [low | high | standard]
Applies to facc, fmacc, and fmadd operators. Specifies the precision for the given operator.
  • low: Use a low-precision (60-bit and 100-bit integer) accumulation implementation when available. This option is only available on certain non-Versal devices, and may cause RTL/Co-Sim mismatches due to insufficient precision with respect to C++ simulation. It can always be pipelined with II=1 without source code changes, but it uses approximately 3X the resources of standard-precision floating-point accumulation.
  • high: Use a high-precision (one extra bit) fused multiply-add implementation when available. This option is useful for high-precision applications and is very efficient on Versal devices, although it may cause RTL/Co-Sim mismatches due to the extra precision with respect to C++ simulation. It uses more resources than standard-precision floating-point accumulation.
  • standard: Standard-precision floating-point accumulation and multiply-add is suitable for most uses of floating-point, and is the default setting. It always uses a true floating-point accumulator that can be pipelined with II=1 on Versal devices, and with an II that is typically between 3 and 5 (depending on clock frequency and target device) on non-Versal devices.
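The precision settings above could be selected as in the following sketch. It assumes that -impl auto is used to enable inference of the combined operators, and the availability of each precision mode depends on the target device.

```tcl
# Fused multiply-add with one extra bit of precision, when available.
config_op fmadd -impl auto -precision high
# Accumulator with the default, standard-precision behavior stated explicitly.
config_op facc -impl auto -precision standard
```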

Example 1

The following example binds the addition operation to the fabric, with the specified latency:

config_op add -impl fabric -latency 2

Example 2

The following example enables the low-precision floating-point accumulator to achieve II=1 on a non-Versal device:

config_op facc -impl auto -precision low