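For truncation, the probability density function (PDF) of the noise is uniform over one quantization step. With Δ denoting the weight of the least significant retained bit (a symbol assumed here) and e the error introduced, the PDF is:

$$p(e) = \begin{cases} \dfrac{1}{\Delta}, & -\Delta < e \le 0 \\ 0, & \text{otherwise} \end{cases}$$

Therefore, the mean and the variance of the error introduced are:

$$m_e = -\frac{\Delta}{2}, \qquad \sigma_e^2 = \frac{\Delta^2}{12}$$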
Implementing truncation has no cost in hardware; the fractional bits are simply trimmed.
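The truncation statistics can be checked numerically. The following is a minimal sketch, assuming the discarded fraction is uniformly distributed; the function name `truncation_error_stats` and the 8-bit fraction width are illustrative choices, not part of the core.

```python
from fractions import Fraction

def truncation_error_stats(frac_bits: int):
    """Mean and variance of the truncation error, in units of one output LSB.

    Every fractional bit pattern is enumerated exactly once, which models a
    uniformly distributed fraction.
    """
    scale = 1 << frac_bits
    # Truncation discards the fraction f/scale, so the error is -f/scale.
    errors = [Fraction(-f, scale) for f in range(scale)]
    mean = sum(errors) / scale
    variance = sum((e - mean) ** 2 for e in errors) / scale
    return mean, variance

mean, variance = truncation_error_stats(8)
print(float(mean), float(variance))
```

As the number of discarded bits grows, the mean approaches minus one half of an output LSB and the variance approaches one twelfth of an output LSB squared.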
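For rounding, the PDF of the noise is uniform and centered on zero. With Δ denoting the weight of the least significant retained bit (a symbol assumed here) and e the error introduced, the PDF is:

$$p(e) = \begin{cases} \dfrac{1}{\Delta}, & -\dfrac{\Delta}{2} \le e \le \dfrac{\Delta}{2} \\ 0, & \text{otherwise} \end{cases}$$

The mean and the variance of the error introduced are:

$$m_e = 0, \qquad \sigma_e^2 = \frac{\Delta^2}{12}$$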
Therefore, the ideal rounder introduces no DC bias to the signal flow. If the full product word (for example, a_r·b_r - a_i·b_i) is represented with BP bits, and the actual result of the core (for example, p_r) is represented with BR bits, then bits BP-1 ... BP-BR are the integer part, and bits BP-BR-1 ... 0 are the fractional part of the result.
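The bit partition described above can be sketched in a few lines. This is only an illustration: the function name `split_product` and the 12-bit and 8-bit widths are hypothetical, and Python integers stand in for the two's complement product word.

```python
def split_product(p: int, bp: int, br: int):
    """Split a BP-bit two's complement product into its integer and fractional fields."""
    frac_bits = bp - br
    integer_part = p >> frac_bits            # bits BP-1 ... BP-BR (arithmetic shift keeps the sign)
    fraction = p & ((1 << frac_bits) - 1)    # bits BP-BR-1 ... 0
    return integer_part, fraction

# Hypothetical widths: a 12-bit product reduced to an 8-bit result.
print(split_product(0b0000_0101_1000, bp=12, br=8))
print(split_product(-24, bp=12, br=8))
```

The fraction field is always non-negative, so a negative product such as -1.5 splits into an integer part of -2 and a fraction of 0.5.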
To implement the rounding function shown in Figure 2, 0.5 (represented in BP.BP-BR format) has to be added to the full product word, and then the lower BP-BR bits have to be truncated. However, if the fractional part is exactly 0.5, this method always rounds up, which introduces a positive bias to the computation. Also, if the rounding constant is -1 (see the preceding figure), 0.5 would always be rounded down, introducing a negative bias.
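The add-then-truncate method can be sketched as follows. The helper name `round_half_up` and the 4-bit fraction width are illustrative assumptions; the point is only that every exact 0.5 case moves towards positive infinity.

```python
def round_half_up(p: int, frac_bits: int) -> int:
    """Add 0.5 of an output LSB to p, then drop the lower frac_bits bits."""
    half = 1 << (frac_bits - 1)
    return (p + half) >> frac_bits

FRAC_BITS = 4  # hypothetical value of BP - BR
for value in (1.5, 2.5, -1.5, -2.5):
    p = int(value * (1 << FRAC_BITS))   # fixed-point representation of value
    print(value, "->", round_half_up(p, FRAC_BITS))
```

Both positive and negative ties are rounded up (1.5 becomes 2, -1.5 becomes -1), which is the source of the positive bias.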
If 0.5 is rounded using a static rule, the resulting quantization always introduces bias. To avoid bias, rounding must be randomized. Therefore, the core adds a rounding constant, and an extra 1 should be added with probability ½, thus dithering the exact rounding threshold. Round carry sources that are widely used as control signals are listed in the following table.
| 0.5 Rounding Rule | Round Carry Source |
|---|---|
| Round towards 0 | -MSB(P) |
| Round towards +/- infinity | MSB(P) |
| Round towards nearest even | LSB(P) |
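The effect of dithering the rounding threshold can be verified exhaustively. The sketch below assumes the rounding constant is one LSB below 0.5, so that the round carry decides how an exact 0.5 is resolved; that constant, the function names, and the 4-bit fraction width are assumptions made for illustration.

```python
from fractions import Fraction

def round_with_carry(p: int, frac_bits: int, carry: int) -> int:
    """Add the rounding constant plus a round carry bit, then truncate."""
    constant = (1 << (frac_bits - 1)) - 1   # assumed: 0.5 minus one LSB
    return (p + constant + carry) >> frac_bits

def mean_error(frac_bits: int, carries) -> Fraction:
    """Exact mean rounding error, in output LSBs, over all inputs in [-4, 4)."""
    scale = 1 << frac_bits
    total, count = Fraction(0), 0
    for p in range(-4 * scale, 4 * scale):
        for carry in carries:               # each listed carry value is equally likely
            total += round_with_carry(p, frac_bits, carry) - Fraction(p, scale)
            count += 1
    return total / count

print(mean_error(4, (0,)), mean_error(4, (1,)), mean_error(4, (0, 1)))
```

With the carry held at a single value the mean error is non-zero; when 0 and 1 occur equally often and independently of the data, the two biases cancel.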
Rounding of the results is not trivial when multiple, cascaded DSP Slices are involved in the process, such as in the evaluation of the preceding two equations. The sign of the output (MSBo) cannot be predicted from the operands before the actual multiplications and additions take place, and deriving it outside the DSP Slices would incur additional latency or resources. Therefore, an external signal should be used to feed the round carry input, through the ROUND_CY (Bit 0 of s_axis_ctrl_tdata) pin.
A good candidate for a source is a clock-dividing flip-flop, or any random signal with a 50% duty cycle that is not correlated with the fractional part of the results. For predictable behavior (as needed for bit-true modeling), the ROUND_CY signal might need to be connected to a clock-independent source in your design, such as the LSB of one of the complex multiplier inputs.
Nevertheless, even when a static rule is used (such as tying ROUND_CY = 0), bias and quantization error are reduced compared to using truncation.
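The behavior of different ROUND_CY sources can be compared with a small simulation. This is a rough behavioral sketch, not a model of the core: the rounding constant (0.5 minus one LSB), the 4-bit fraction, the uniformly random products, and all function names are assumptions.

```python
import random

def round_with_carry(p: int, frac_bits: int, carry: int) -> int:
    constant = (1 << (frac_bits - 1)) - 1   # assumed: 0.5 minus one LSB
    return (p + constant + carry) >> frac_bits

def simulate(carry_source, n: int = 100_000, frac_bits: int = 4, seed: int = 0) -> float:
    """Average rounding error, in output LSBs, over n random products."""
    rng = random.Random(seed)
    scale = 1 << frac_bits
    total = 0.0
    for i in range(n):
        p = rng.randrange(-1 << 15, 1 << 15)          # random full-precision product
        q = round_with_carry(p, frac_bits, carry_source(i))
        total += q - p / scale
    return total / n

def toggle(i: int) -> int:      # clock-dividing flip-flop: 0, 1, 0, 1, ...
    return i & 1

def tied_low(i: int) -> int:    # static rule: ROUND_CY = 0
    return 0

def tied_high(i: int) -> int:   # static rule: ROUND_CY = 1
    return 1

for source in (toggle, tied_low, tied_high):
    print(source.__name__, simulate(source))
```

Under these assumptions the toggling source averages out close to zero, while either static setting leaves a residual bias of about 1/32 of an output LSB. That residual is still far smaller than the roughly half-LSB bias of plain truncation.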
In many cases, in DSP Slice implementations, the addition of the rounding constant is 'free', because the C port and carry-in input can be used. In devices without DSP Slices, the addition of rounding typically requires an extra slice-based adder and an additional cycle of latency.