Many Vitis Model Composer HDL blocks across various libraries support the floating-point data type.
Model Composer uses the Floating-Point Operator v7.1 IP core to implement operations such as addition/subtraction, multiplication, comparison, and data type conversion.
Floating-point data type support complies with the IEEE-754 Standard for Floating-Point Arithmetic. Model Composer supports single, double, and custom precision floating-point data types for design input, data type display, and data rate and type propagation across supported HDL blocks.
IEEE-754 Standard for Floating-Point Data Type
One Sign bit (S), X exponent bits, and Y fraction bits represent floating-point data, as shown in the following figure. The Sign bit is always the most-significant bit (MSB).
According to the IEEE-754 standard, a floating-point value is represented and stored in the normalized form. In the normalized form, the exponent value E is a biased/normalized value. The normalized exponent, E, equals the sum of the actual exponent value and the exponent bias. In the normalized form, Y-1 bits store the fraction value. The F0 fraction bit is always hidden and its value is assumed to be 1.
S represents the sign of the number. If S is 0, the value is a positive floating-point number; otherwise it is negative. The X bits that follow store the normalized exponent value E. The last Y-1 bits store the fraction/mantissa value in the normalized form.
For a given exponent width, the exponent bias is calculated as follows:
Exponent_bias = 2^(X-1) - 1
where X is the exponent bit width.
According to the IEEE standard, a single precision floating-point value is represented using 32 bits. The normalized exponent and fraction/mantissa are allocated 8 and 24 bits, respectively, where the fraction width includes the hidden bit F0. The exponent bias for single precision is 127. Similarly, a double precision floating-point value is represented using a total of 64 bits, where the exponent bit width is 11 and the fraction bit width is 53. The exponent bias value for double precision is 1023.
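The bias values quoted above (127 and 1023) are consistent with the usual IEEE-754 relation between exponent width and bias, 2^(X-1) - 1. The following is a minimal sketch of that calculation; the function name `exponent_bias` is a hypothetical helper for illustration and is not part of Model Composer.

```python
def exponent_bias(exponent_width: int) -> int:
    """Return the exponent bias, 2^(X-1) - 1, for an exponent width of X bits."""
    if exponent_width < 1:
        raise ValueError("exponent width must be a positive number of bits")
    return 2 ** (exponent_width - 1) - 1


print(exponent_bias(8))   # single precision: 127
print(exponent_bias(11))  # double precision: 1023
```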
The normalized floating-point number in the equation form is represented as follows:
Value = (-1)^S × F0.F1F2...F(Y-1) × 2^(E - Exponent_bias)
The actual value of the exponent is E_actual = E - Exponent_bias. Taking 1 as the value of the hidden bit F0 and using E_actual, a floating-point number can be calculated as follows:
Value = (-1)^S × 1.F1F2...F(Y-1) × 2^(E_actual)
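As an illustration of the calculation described above, the sketch below decodes a normalized bit pattern from its sign, biased exponent, and stored fraction bits. It assumes the layout described in this section (sign in the MSB, then X exponent bits, then Y-1 stored fraction bits) and deliberately ignores special encodings such as zero, denormalized numbers, infinities, and NaNs. The function name `decode_normalized` is hypothetical; Python's `struct` module is used only to cross-check the single precision case.

```python
import struct


def decode_normalized(bits: int, exponent_width: int, fraction_width: int) -> float:
    """Decode a normalized floating-point bit pattern into a Python float.

    fraction_width counts the hidden bit F0, so only fraction_width - 1
    fraction bits are actually stored in the pattern.
    """
    stored_fraction_bits = fraction_width - 1
    bias = 2 ** (exponent_width - 1) - 1

    # Split the pattern into its three fields, LSB first.
    fraction = bits & ((1 << stored_fraction_bits) - 1)
    biased_exponent = (bits >> stored_fraction_bits) & ((1 << exponent_width) - 1)
    sign = (bits >> (stored_fraction_bits + exponent_width)) & 1

    e_actual = biased_exponent - bias
    # The hidden bit F0 contributes the leading 1 of the significand.
    significand = 1 + fraction / (1 << stored_fraction_bits)
    return (-1) ** sign * significand * 2.0 ** e_actual


# Cross-check against Python's own single precision encoding of -6.5.
pattern = struct.unpack(">I", struct.pack(">f", -6.5))[0]
print(hex(pattern))                       # 0xc0d00000
print(decode_normalized(pattern, 8, 24))  # -6.5
```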
Floating-Point Data Representation in Model Composer
The HDL Gateway In block supports Boolean, Fixed-point, and Floating-point data types as shown in the following figure. You can select either a Single, Double, or Custom precision type after specifying the floating-point data type.
For example, if an Exponent width of 9 and a Fraction width of 31 are specified, the floating-point data value is stored in a total of 40 bits, where the MSB represents the sign, the following 9 bits store the biased exponent value, and the 30 LSBs store the fractional value.
In compliance with the IEEE-754 standard, if you select Single precision, the total bit width is assumed to be 32: 8 bits for the exponent and 24 bits for the fraction. Similarly, when you select Double precision, the total bit width is assumed to be 64: 11 bits for the exponent and 53 bits for the fraction. When you select Custom precision, the Exponent width and Fraction width fields activate and you can specify values for these fields (8 and 24 are the default values). The total bit width for Custom precision data is the sum of the number of exponent bits and the number of fraction bits. As with the Single precision and Double precision data types, the fraction bit width for the Custom precision data type must include the hidden bit F0.
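The bit accounting above can be checked with a short sketch. It assumes only what this section states: the total width is the exponent width plus the fraction width, and because the fraction width includes the hidden bit F0, one bit of that total is used for the sign and only fraction width minus 1 bits are stored. The function name `field_layout` and the dictionary keys are hypothetical and chosen for illustration.

```python
def field_layout(exponent_width: int, fraction_width: int) -> dict:
    """Return the storage layout for a given exponent and fraction width.

    fraction_width includes the hidden bit F0, which is not stored; the bit
    it would occupy is effectively taken by the sign bit.
    """
    return {
        "total_bits": exponent_width + fraction_width,
        "sign_bits": 1,
        "exponent_bits": exponent_width,
        "stored_fraction_bits": fraction_width - 1,
    }


for name, widths in [("Single", (8, 24)), ("Double", (11, 53)), ("Custom", (9, 31))]:
    print(name, field_layout(*widths))
```

For the Custom example with widths 9 and 31, this gives 40 total bits split into 1 sign bit, 9 exponent bits, and 30 stored fraction bits.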
Displaying the Data Type on Output Signals
As the following figure shows, after successful rate and type propagation, the floating-point data type displays on the output of each HDL block. To display the signal data type as shown in the following figure, select .
A floating-point data type uses the format: XFloat_<exponent_bit_width>_<fraction_bit_width>.
Single and Double precision data types use the strings "XFloat_8_24" and "XFloat_11_53", respectively.
If you specify an exponent bit width of 9 and a fraction bit width of 31 for a Custom precision data type, it displays as "XFloat_9_31". A total of 40 bits are used to store the floating-point data value. Because floating-point data is stored in a normalized form, the fractional value is stored in 30 bits.
In Model Composer, the fixed-point data type uses the format XFix_<total_data_width>_<binary_point_width>. For example, a fixed-point data type with a data width of 40 and a binary point width of 31 is displayed as XFix_40_31.
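The naming formats described above are simple enough to reproduce in a few lines, which can be handy when post-processing signal type labels in scripts. The sketch below is illustrative only: the helper names `xfloat_name`, `xfix_name`, and `parse_xfloat` are hypothetical and are not Model Composer functions.

```python
import re


def xfloat_name(exponent_width: int, fraction_width: int) -> str:
    """Format a floating-point type string: XFloat_<exponent>_<fraction>."""
    return f"XFloat_{exponent_width}_{fraction_width}"


def xfix_name(total_width: int, binary_point: int) -> str:
    """Format a fixed-point type string: XFix_<total width>_<binary point>."""
    return f"XFix_{total_width}_{binary_point}"


def parse_xfloat(name: str) -> tuple:
    """Return (exponent_width, fraction_width) from an XFloat type string."""
    match = re.fullmatch(r"XFloat_(\d+)_(\d+)", name)
    if match is None:
        raise ValueError(f"not an XFloat type string: {name!r}")
    return int(match.group(1)), int(match.group(2))


print(xfloat_name(8, 24))           # XFloat_8_24
print(xfloat_name(11, 53))          # XFloat_11_53
print(xfix_name(40, 31))            # XFix_40_31
print(parse_xfloat("XFloat_9_31"))  # (9, 31)
```

Note that XFloat_9_31 and XFix_40_31 both occupy 40 bits, but the two numbers in each string mean different things: exponent and fraction widths for the floating-point type, total width and binary point position for the fixed-point type.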
Vitis Model Composer uses the exponent bit width and the fraction bit width to configure and generate an instance of the Floating-Point Operator core.
Rate and Type Propagation
During data rate and type propagation across a Model Composer HDL block that supports floating-point data, the following design rules are verified. The tool issues an appropriate error if one of the following violations is detected:
- If a signal carrying floating-point data is connected to the port of an HDL block that doesn't support the floating-point data type.
- If the data inputs (both A and B data inputs, where applicable) and the data output of an HDL block are not of the same floating-point data type. The DRC checks are made between the two inputs of a block as well as between an input and an output of the block.
  If a Custom precision floating-point data type is specified, the exponent bit width and the fraction bit width of the two ports are compared to determine whether they are of the same data type.
  Note: The Convert and Relational blocks are excluded from this check. The Convert block supports Float-to-float data type conversion between two different floating-point data types. The Relational block output is always the Boolean data type because it gives a true or false result for a comparison operation.
- If the data inputs are of the fixed-point data type and the data output is expected to be floating-point, or vice versa.
  Note: The Convert and Relational blocks are excluded from this check. The Convert block supports Fixed-to-float as well as Float-to-fixed data type conversion. The Relational block output is always the Boolean data type because it gives a true or false result for a comparison operation.
- If Custom precision is selected for the Output Type of blocks that support the floating-point data type. For example, for blocks such as AddSub, Mult, CMult, and MUX, only Full output precision is supported if the data inputs are of the floating-point data type.
- If the Carry In port or Carry Out port is used for the AddSub block when the operation on a floating-point data type is specified.
- If the Floating-Point Operator IP core gives an error for DRC rules defined for the IP.
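To make the type-matching rule in the list above concrete, the following sketch models only that one rule: all data inputs and the data output must share the same exponent and fraction widths, except on the Convert and Relational blocks. It is not how Model Composer implements its checks; the `XFloat` tuple, the `EXEMPT_BLOCKS` set, and the `check_float_types` function are hypothetical names introduced for illustration.

```python
from typing import List, NamedTuple


class XFloat(NamedTuple):
    """Hypothetical stand-in for a floating-point port type."""
    exponent_width: int
    fraction_width: int


# Blocks the text above names as excluded from the matching check.
EXEMPT_BLOCKS = {"Convert", "Relational"}


def check_float_types(block: str, inputs: List[XFloat], output: XFloat) -> List[str]:
    """Return a list of mismatch messages; an empty list means the rule passes."""
    if block in EXEMPT_BLOCKS:
        return []
    errors = []
    reference = inputs[0]
    # Compare every other input, then the output, against the first input.
    for index, port_type in enumerate(inputs[1:], start=1):
        if port_type != reference:
            errors.append(f"{block}: input {index} is {port_type}, expected {reference}")
    if output != reference:
        errors.append(f"{block}: output is {output}, expected {reference}")
    return errors


single = XFloat(8, 24)
custom = XFloat(9, 31)
print(check_float_types("AddSub", [single, single], single))   # []
print(len(check_float_types("AddSub", [single, custom], single)))  # 1
print(check_float_types("Convert", [single], custom))          # []
```

Comparing the two widths as a tuple mirrors the statement that Custom precision types are considered the same only when both the exponent bit width and the fraction bit width match.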