Using the New Data Format - 3.5 English

Vitis AI User Guide (UG1414)

Document ID: UG1414
Release Date: 2023-09-28
Version: 3.5 English

vai_q_pytorch introduces a new data format called block floating point (BFP). In BFP, all numbers within a block share a single exponent, set by the largest exponent in the block; smaller numbers have their mantissas shifted right so they can be expressed against that shared exponent, at the cost of some low-order precision.
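The effect of a shared block exponent can be illustrated with a small pure-Python sketch. The helper below is hypothetical and for illustration only; it is not part of the vai_q_pytorch API:

```python
import math

def bfp_quantize_block(values, mantissa_bits=8):
    # Hypothetical helper simulating BFP rounding; NOT a vai_q_pytorch API.
    nonzero = [v for v in values if v != 0.0]
    if not nonzero:
        return list(values)  # an all-zero block is unchanged
    # Shared exponent: taken from the largest exponent in the block.
    shared_exp = max(math.frexp(v)[1] for v in nonzero)
    # One quantization step for the whole block.
    step = 2.0 ** (shared_exp - mantissa_bits)
    # Each value becomes an integer mantissa times the shared step;
    # small values effectively have their mantissas shifted right,
    # losing low-order bits.
    return [round(v / step) * step for v in values]

block = [1.0, 0.5, 2.0 ** -10]
print(bfp_quantize_block(block, mantissa_bits=8))  # → [1.0, 0.5, 0.0]
```

Note how the smallest value, 2**-10, is flushed to zero: with an 8-bit mantissa against the block's shared exponent, its shifted mantissa has no bits left.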

Although you can use vai_q_pytorch to assess the quantization results, there is currently no option to deploy the quantized model to hardware.

Note: BFP is a new data format not fully supported in the current version of the Vitis AI toolchain.

Usage

BFP offers various configurations, including bit width, block size, and more. There are two out-of-the-box configuration types ("mx6" and "mx9") that you can use directly without having to set up their configuration items. To quantize the model, follow these steps:

  • Preparing the float model and inputs:
    model = build_your_model()
    batch_size = 32
    inputs = torch.randn([batch_size, 3, 224, 224], dtype=torch.float32)
  • Quantizing the float model:
    from pytorch_nndct import bfp
    quantized_model = bfp.quantize_model(model, inputs, dtype='mx6')
  • Validating the quantized model: Pass the quantized model to the validation function to evaluate quantization results:
    validate(quantized_model, data_loader)
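The `validate` function in the last step is user supplied. A minimal sketch of such an evaluation loop (top-1 accuracy) is shown below; the batch format and all names here are assumptions, not part of vai_q_pytorch:

```python
def validate(model, data_loader):
    # Hypothetical evaluation loop: computes top-1 accuracy.
    # Assumes data_loader yields (inputs, labels) batches and the model
    # returns one row of class scores per input.
    correct = total = 0
    for inputs, labels in data_loader:
        outputs = model(inputs)  # forward pass through the quantized model
        # Index of the highest score in each output row (argmax).
        preds = [max(range(len(row)), key=row.__getitem__) for row in outputs]
        correct += sum(int(p == y) for p, y in zip(preds, labels))
        total += len(labels)
    return correct / total
```

Comparing this accuracy between the float model and the quantized model indicates how much precision the BFP conversion costs.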

BFP APIs

bfp.quantize_model(model, inputs, dtype='mx6', config_file=None)
model
Float module to be quantized.
inputs
The input tensor should have the same shape as the actual input of the floating-point module to be quantized, but the values can be random.
dtype
Pre-configured BFP configuration. Available values are mx6 and mx9.
config_file
Path to a configuration file. This feature is under development; use one of the pre-defined dtype values instead.