New Feature (How to Use It) - 2024.2 English - UG1399

Vitis High-Level Synthesis User Guide (UG1399)

Document ID: UG1399
Release Date: 2024-11-13
Version: 2024.2 English

Intrinsic functions give access to DSP opcodes that implement a variety of operators, register configurations, and cascading connections. Opcodes that use cascades or accumulators are implemented as classes rather than functions. Both forms are discussed below, together with lists of the supported opcodes and their corresponding functions or classes.

The template argument defines which registers are used inside the DSP.

These functions can be called in user code, with appropriate datatype conversions to and from user code variables, following the C++ and ap_int conversion rules.

For example, consider a DSP that:

  • Computes the add-mul-add function (P = (A + D) * B + C).
  • Has a latency of 2, due to the use of the A1 and P registers (specified by using REG_A1 | REG_P as the template argument). A1 is one of the two registers on the A input of the DSP.
  • Uses the DSP58 bitwidths (specified by using namespace hls::dsp58;).

This DSP can be instantiated as:

#include "hls_dsp_builtins.h"
using namespace hls::dsp58;
...
long test(long d, long a, long b, long c)
{
    return add_mul_add<REG_A1 | REG_P>(d, a, b, c);
}
Note: An attempt to use the design above on a pre-Versal platform results in an error, because those platforms do not support a 58-bit DSP.
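As a cross-check for a C simulation testbench, the arithmetic of this example can be modeled in plain C++ without the Vitis headers. The sketch below is not the intrinsic itself: wrap_signed and add_mul_add_model are hypothetical helper names. It assumes that each operand wraps (two's complement truncation) to the DSP58 widths of 27, 24, 58, and 27 bits, and that the pre-adder result is kept at full precision before the final wrap to 58 bits.

```cpp
#include <cstdint>

// Hypothetical helper: wrap `value` to a signed two's-complement integer of
// `bits` width (1..64). Assumption: conversion wraps and does not saturate.
inline int64_t wrap_signed(int64_t value, int bits) {
    const uint64_t mask = (bits >= 64) ? ~uint64_t{0} : ((uint64_t{1} << bits) - 1);
    const uint64_t sign = uint64_t{1} << (bits - 1);
    const uint64_t low = static_cast<uint64_t>(value) & mask;
    // XOR-then-subtract sign-extends the low `bits` bits to 64 bits.
    return static_cast<int64_t>((low ^ sign) - sign);
}

// Hypothetical software model of P = (A + D) * B + C with DSP58 widths.
// Register flags affect latency only, so they do not appear in this model.
inline int64_t add_mul_add_model(int64_t d, int64_t a, int64_t b, int64_t c) {
    const int64_t aw = wrap_signed(a, 27);  // A_t width on DSP58
    const int64_t bw = wrap_signed(b, 24);  // B_t width on DSP58
    const int64_t cw = wrap_signed(c, 58);  // C_t width on DSP58
    const int64_t dw = wrap_signed(d, 27);  // D_t width on DSP58
    // Assumption: the pre-adder sum is not truncated before the multiply.
    return wrap_signed((aw + dw) * bw + cw, 58);
}
```

A testbench could compare the return value of test(d, a, b, c) against add_mul_add_model(d, a, b, c) for operands inside the DSP58 ranges. For out-of-range operands the two may disagree if the assumptions above do not hold.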

List of Supported Intrinsics:

Intrinsic         Function               Prototype
mul_add           P = A * B + C          template<int64_t flags> C_t mul_add(A_t a, B_t b, C_t c);
mul_sub           P = A * B - C          template<int64_t flags> C_t mul_sub(A_t a, B_t b, C_t c);
mul_rev_sub       P = C - A * B          template<int64_t flags> C_t mul_rev_sub(A_t a, B_t b, C_t c);
add_mul_add       P = (A + D) * B + C    template<int64_t flags> C_t add_mul_add(D_t d, A_t a, B_t b, C_t c);
add_mul_sub       P = (A + D) * B - C    template<int64_t flags> C_t add_mul_sub(D_t d, A_t a, B_t b, C_t c);
add_mul_rev_sub   P = C - (A + D) * B    template<int64_t flags> C_t add_mul_rev_sub(D_t d, A_t a, B_t b, C_t c);
sub_mul_add       P = (A - D) * B + C    template<int64_t flags> C_t sub_mul_add(D_t d, A_t a, B_t b, C_t c);
sub_mul_sub       P = (A - D) * B - C    template<int64_t flags> C_t sub_mul_sub(D_t d, A_t a, B_t b, C_t c);
sub_mul_rev_sub   P = C - (A - D) * B    template<int64_t flags> C_t sub_mul_rev_sub(D_t d, A_t a, B_t b, C_t c);
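The nine intrinsics differ only in the pre-adder operation (none, A + D, or A - D) and in how the product is combined with C. The following sketch captures that structure as a plain C++ reference model. The names Pre, Post, and dsp_model are hypothetical and are not part of hls_dsp_builtins.h, and the model assumes that all operands and intermediate results stay within the DSP bitwidths, so no wrapping is modeled.

```cpp
#include <cstdint>

// Hypothetical selectors for the two configurable stages.
enum class Pre  { None, Add, Sub };    // A, A + D, A - D
enum class Post { Add, Sub, RevSub };  // M + C, M - C, C - M

// Hypothetical reference model: P as a function of the selected operations.
// Assumption: operands are small enough that int64_t arithmetic cannot overflow.
inline int64_t dsp_model(Pre pre, Post post,
                         int64_t d, int64_t a, int64_t b, int64_t c) {
    int64_t ad = a;                      // mul_* variants ignore D
    if (pre == Pre::Add) ad = a + d;     // add_mul_* variants
    if (pre == Pre::Sub) ad = a - d;     // sub_mul_* variants
    const int64_t m = ad * b;            // multiplier output
    switch (post) {
        case Post::Add:    return m + c; // *_add
        case Post::Sub:    return m - c; // *_sub
        case Post::RevSub: return c - m; // *_rev_sub
    }
    return 0;  // not reached for valid selector values
}
```

For instance, dsp_model(Pre::Sub, Post::RevSub, d, a, b, c) mirrors the sub_mul_rev_sub row, and dsp_model(Pre::None, Post::Add, 0, a, b, c) mirrors mul_add.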

The following register flags can be passed as the template argument of each intrinsic to enable the corresponding register in the RTL model:

  • REG_A1, REG_A2 on input A
  • REG_B1, REG_B2 on input B
  • REG_C on input C
  • REG_D on input D
  • REG_AD at the output of the pre-adder
  • REG_M at the output of the multiplier
  • REG_P at the output of the post-adder (also used as the accumulator)
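Register flags are combined with bitwise OR and passed as a single integer template argument, as in REG_A1 | REG_P. The sketch below illustrates that pattern in standalone C++. The bit values are hypothetical placeholders chosen for illustration; the real values are defined in hls_dsp_builtins.h and may differ. The namespace dsp_sketch and the helpers has_reg and uses_output_register are likewise assumptions.

```cpp
#include <cstdint>

namespace dsp_sketch {

// Hypothetical bit assignments, one bit per register flag.
constexpr int64_t REG_A1 = int64_t{1} << 0;
constexpr int64_t REG_A2 = int64_t{1} << 1;
constexpr int64_t REG_B1 = int64_t{1} << 2;
constexpr int64_t REG_B2 = int64_t{1} << 3;
constexpr int64_t REG_C  = int64_t{1} << 4;
constexpr int64_t REG_D  = int64_t{1} << 5;
constexpr int64_t REG_AD = int64_t{1} << 6;
constexpr int64_t REG_M  = int64_t{1} << 7;
constexpr int64_t REG_P  = int64_t{1} << 8;

// True when `reg` is enabled in the combined `flags` value.
constexpr bool has_reg(int64_t flags, int64_t reg) {
    return (flags & reg) != 0;
}

// Flags as a template argument, mirroring template<int64_t flags> above.
template <int64_t flags>
constexpr bool uses_output_register() {
    return has_reg(flags, REG_P);
}

}  // namespace dsp_sketch
```

Because the flags are a compile-time constant, a check such as uses_output_register<REG_A1 | REG_P>() can be evaluated in a static_assert. Note that the number of enabled flags is not in general the latency, since registers on parallel inputs share a pipeline stage.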

The datatypes used for the arguments of the functions above differ depending on the platform. The typedefs defined in the header file for each platform are listed below for easy reference.

Datatype   DSP48E1      DSP48E2      DSP58
A_t        ap_int<25>   ap_int<27>   ap_int<27>
B_t        ap_int<18>   ap_int<18>   ap_int<24>
C_t        ap_int<48>   ap_int<48>   ap_int<58>
D_t        ap_int<25>   ap_int<27>   ap_int<27>
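Because the argument widths change with the platform, a value that fits on one DSP may be truncated on another. The sketch below records the widths from the table in a small struct and checks whether a value is representable as a signed integer of a given width. DspWidths, the kDsp* constants, and the helper functions are hypothetical names for illustration and are not part of the Vitis headers.

```cpp
#include <cstdint>

// Bit widths of the A, B, C, and D arguments for one DSP type.
struct DspWidths { int a, b, c, d; };

// Values copied from the datatype table above.
constexpr DspWidths kDsp48e1 = {25, 18, 48, 25};
constexpr DspWidths kDsp48e2 = {27, 18, 48, 27};
constexpr DspWidths kDsp58   = {27, 24, 58, 27};

// Largest and smallest values of a signed integer with `bits` bits (2..63).
constexpr int64_t max_signed(int bits) { return (int64_t{1} << (bits - 1)) - 1; }
constexpr int64_t min_signed(int bits) { return -(int64_t{1} << (bits - 1)); }

// True when `value` can be held in a signed `bits`-bit integer without truncation.
constexpr bool fits(int64_t value, int bits) {
    return value >= min_signed(bits) && value <= max_signed(bits);
}
```

For example, fits(value, kDsp48e1.a) can guard an A operand before it is passed to an intrinsic on a DSP48E1 target, where A_t is 25 bits wide.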