Quantization Modes - 2024.1 English

Vitis High-Level Synthesis User Guide (UG1399)

Document ID: UG1399
Release Date: 2024-07-03
Version: 2024.1 English

Description                     Mode
Rounding to plus infinity       AP_RND
Rounding to zero                AP_RND_ZERO
Rounding to minus infinity      AP_RND_MIN_INF
Rounding to infinity            AP_RND_INF
Convergent rounding             AP_RND_CONV
Truncation                      AP_TRN
Truncation to zero              AP_TRN_ZERO

AP_RND

  • Round the value to the nearest representable value for the specific ap_[u]fixed type.
  • When the value lies exactly halfway between two representable values, round towards plus infinity.
    ap_fixed<3, 2, AP_RND, AP_SAT> pos = 1.25;  // Yields: 1.5
    ap_fixed<3, 2, AP_RND, AP_SAT> neg = -1.25; // Yields: -1.0
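
One common way to get this behavior in integer arithmetic is to add half of the new least significant bit and then delete the redundant bits. The following is a minimal sketch in standard C++, not the ap_fixed implementation. The helper names floor_div_pow2 and rnd_plus_inf are hypothetical, raw is assumed to be the value scaled to an integer, and overflow handling (AP_SAT and so on) is not modeled.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: floor(n / 2^k), written without relying on how
// the compiler right-shifts negative values.
std::int64_t floor_div_pow2(std::int64_t n, int k) {
    const std::int64_t d = std::int64_t{1} << k;
    std::int64_t q = n / d;            // C++ '/' truncates towards zero
    if (n % d != 0 && n < 0) --q;      // step down for inexact negative results
    return q;
}

// Hypothetical model of AP_RND: drop k low bits from the scaled integer
// `raw`, adding half of the new LSB first so that ties move towards
// plus infinity.
std::int64_t rnd_plus_inf(std::int64_t raw, int k) {
    assert(k >= 1 && k < 62);
    const std::int64_t half = std::int64_t{1} << (k - 1);
    return floor_div_pow2(raw + half, k);
}
```

With two fractional bits, 1.25 is raw 5 and -1.25 is raw -5. Dropping one bit gives rnd_plus_inf(5, 1) == 3 (1.5) and rnd_plus_inf(-5, 1) == -2 (-1.0), which agrees with the examples above.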

AP_RND_ZERO

  • Round the value to the nearest representable value.
  • When the value lies exactly halfway between two representable values, round towards zero.
    • For positive values, delete the redundant bits.
    • For negative values, add the least significant bits to get the nearest representable value.
    ap_fixed<3, 2, AP_RND_ZERO, AP_SAT> pos = 1.25;  // Yields: 1.0
    ap_fixed<3, 2, AP_RND_ZERO, AP_SAT> neg = -1.25; // Yields: -1.0
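
The behavior can be approximated with a small floating-point model that works on the magnitude and restores the sign afterwards. This is only a sketch under stated assumptions: rnd_zero is a hypothetical name, frac_bits is the number of fractional bits in the target type, and saturation or wrap-around is not modeled.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical behavioral model of AP_RND_ZERO for a target format with
// `frac_bits` fractional bits.
double rnd_zero(double x, int frac_bits) {
    assert(frac_bits >= 0 && frac_bits < 52);
    const double scale = std::ldexp(1.0, frac_bits);  // 2^frac_bits
    const double mag = std::fabs(x) * scale;          // magnitude in LSB units
    // ceil(mag - 0.5) rounds to the nearest integer and sends exact ties
    // downwards in magnitude, that is, towards zero once the sign is restored.
    const double q = std::ceil(mag - 0.5);
    return std::copysign(q, x) / scale;
}
```

For a type with one fractional bit, rnd_zero(1.25, 1) returns 1.0 and rnd_zero(-1.25, 1) returns -1.0, while a non-tie such as rnd_zero(1.375, 1) still rounds to the nearest value, 1.5.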

AP_RND_MIN_INF

  • Round the value to the nearest representable value.
  • When the value lies exactly halfway between two representable values, round towards minus infinity.
    • For positive values, delete the redundant bits.
    • For negative values, add the least significant bits.
    ap_fixed<3, 2, AP_RND_MIN_INF, AP_SAT> pos = 1.25;  // Yields: 1.0
    ap_fixed<3, 2, AP_RND_MIN_INF, AP_SAT> neg = -1.25; // Yields: -1.5
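
As a rough behavioral model, round-to-nearest with ties going towards minus infinity can be written as ceil(v - 0.5) on the scaled value. The function name rnd_min_inf and the frac_bits parameter are hypothetical, and the sketch ignores overflow handling.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical behavioral model of AP_RND_MIN_INF for a target format
// with `frac_bits` fractional bits.
double rnd_min_inf(double x, int frac_bits) {
    assert(frac_bits >= 0 && frac_bits < 52);
    const double scale = std::ldexp(1.0, frac_bits);  // 2^frac_bits
    // ceil(v - 0.5) picks the nearest integer; an exact tie resolves to
    // the lower neighbor for both positive and negative inputs.
    return std::ceil(x * scale - 0.5) / scale;
}
```

With one fractional bit, rnd_min_inf(1.25, 1) returns 1.0 and rnd_min_inf(-1.25, 1) returns -1.5.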

AP_RND_INF

  • Round the value to the nearest representable value.
  • When the value lies exactly halfway between two representable values, round away from zero. The direction depends on the sign of the value and on the least significant bit.
    • For positive values, if the least significant bit is set, round towards plus infinity. Otherwise, round towards minus infinity.
    • For negative values, if the least significant bit is set, round towards minus infinity. Otherwise, round towards plus infinity.
    ap_fixed<3, 2, AP_RND_INF, AP_SAT> pos = 1.25;  // Yields: 1.5
    ap_fixed<3, 2, AP_RND_INF, AP_SAT> neg = -1.25; // Yields: -1.5
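
The examples above show ties moving away from zero, which is also what std::round does in standard C++. The sketch below uses that as a behavioral stand-in; rnd_inf and frac_bits are hypothetical names, and this is not how ap_fixed is implemented.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical behavioral model of AP_RND_INF for a target format with
// `frac_bits` fractional bits. Overflow handling is not modeled.
double rnd_inf(double x, int frac_bits) {
    assert(frac_bits >= 0 && frac_bits < 52);
    const double scale = std::ldexp(1.0, frac_bits);  // 2^frac_bits
    // std::round rounds halfway cases away from zero.
    return std::round(x * scale) / scale;
}
```

With one fractional bit, rnd_inf(1.25, 1) returns 1.5 and rnd_inf(-1.25, 1) returns -1.5.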

AP_RND_CONV

  • Round to the nearest representable value with "ties" rounding to even, that is, the least significant bit (after rounding) is forced to zero.
  • A "tie" is the midpoint of two representable values and occurs when the bit following the least significant bit (after rounding) is 1 and all the bits below it are zero.
    // For the following examples, bit3 of the 8-bit value becomes the
    // LSB of the final 5-bit value (after rounding).
    // Notes: 
    //   * bit7 of the 8-bit value is the MSB (sign bit)
    //   * the 3 LSBs of the 8-bit value (bit2, bit1, bit0) are treated as
    //     guard, round and sticky bits.
    //   * See http://pages.cs.wisc.edu/~david/courses/cs552/S12/handouts/guardbits.pdf
    
    ap_fixed<8,3> p1 = 1.59375; // p1 = 001.10011
    ap_fixed<5,3,AP_RND_CONV> rconv1 = p1; // rconv1 = 1.5 (001.10)
    
    ap_fixed<8,3> p2 = 1.625; // p2 = 001.10100 => tie with bit3 (LSB-to-be) = 0
    ap_fixed<5,3,AP_RND_CONV> rconv2 = p2; // rconv2 = 1.5 (001.10) => lsb is already zero, just truncate
    
    ap_fixed<8,3> p3 = 1.375; // p3 = 001.01100 => tie with bit3 (LSB-to-be) = 1
    ap_fixed<5,3,AP_RND_CONV> rconv3 = p3; // rconv3 = 1.5 (001.10) => lsb is made zero by rounding up
    
    ap_fixed<8,3> p4 = 1.65625; // p4 = 001.10101
    ap_fixed<5,3,AP_RND_CONV> rconv4 = p4; // rconv4 = 1.75 (001.11) => round up
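
The tie rule can be checked on the raw bits. The sketch below is an illustrative model in standard C++, not the ap_fixed implementation: rnd_conv is a hypothetical name, raw is assumed to be the 8-bit value read as a signed integer (the value times 32), and k is the number of low bits being dropped.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of convergent rounding: drop the k low bits of the
// scaled integer `raw`, rounding to nearest with ties going to the even
// result.
std::int64_t rnd_conv(std::int64_t raw, int k) {
    assert(k >= 1 && k < 62);
    const std::int64_t d = std::int64_t{1} << k;
    const std::int64_t half = d / 2;
    // Floor quotient q and non-negative remainder r. The remainder holds
    // the dropped bits (guard, round and sticky in the example above).
    std::int64_t q = raw / d;
    std::int64_t r = raw % d;
    if (r < 0) { r += d; --q; }
    // Above the midpoint: round up. Exactly at the midpoint: round up
    // only when q is odd, so the resulting LSB is zero.
    if (r > half || (r == half && q % 2 != 0)) ++q;
    return q;
}
```

For the four examples above the raw inputs are 51, 52, 44 and 53, and rnd_conv(raw, 3) returns 6, 6, 6 and 7. Divided by 4 (two fractional bits remain), these are 1.5, 1.5, 1.5 and 1.75.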
    

AP_TRN

  • Delete the redundant bits. This always rounds the value towards minus infinity.
    ap_fixed<3, 2, AP_TRN, AP_SAT> pos = 1.25;  // Yields: 1.0
    ap_fixed<3, 2, AP_TRN, AP_SAT> neg = -1.25; // Yields: -1.5

AP_TRN_ZERO

Round the value towards zero:

  • For positive values, the rounding is the same as mode AP_TRN.
  • For negative values, round towards zero instead of towards minus infinity.
    ap_fixed<3, 2, AP_TRN_ZERO, AP_SAT> pos = 1.25;  // Yields: 1.0
    ap_fixed<3, 2, AP_TRN_ZERO, AP_SAT> neg = -1.25; // Yields: -1.0
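
The difference between AP_TRN and AP_TRN_ZERO matches the difference between floor division and the truncating integer division built into C++. The following sketch models both on a scaled integer; trn and trn_zero are hypothetical names, raw is assumed to be the value scaled to an integer, and overflow handling is not modeled.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of AP_TRN: deleting the k low bits of a
// two's-complement value equals floor(raw / 2^k), which rounds towards
// minus infinity.
std::int64_t trn(std::int64_t raw, int k) {
    assert(k >= 0 && k < 62);
    const std::int64_t d = std::int64_t{1} << k;
    std::int64_t q = raw / d;              // C++ '/' truncates towards zero
    if (raw % d != 0 && raw < 0) --q;      // move inexact negative results down
    return q;
}

// Hypothetical model of AP_TRN_ZERO: C++ integer division already
// truncates towards zero for both signs.
std::int64_t trn_zero(std::int64_t raw, int k) {
    assert(k >= 0 && k < 62);
    return raw / (std::int64_t{1} << k);
}
```

With two fractional bits, -1.25 is raw -5. Dropping one bit gives trn(-5, 1) == -3 (-1.5) and trn_zero(-5, 1) == -2 (-1.0). For the positive input 5, both functions return 2 (1.0).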