vitis_quantize.VitisQuantizer.quantize_model

Vitis AI User Guide (UG1414)

Document ID

UG1414

Release Date

2022-01-20

Version

2.0 English

This function performs post-training quantization (PTQ) of the float model, including model optimization, weight quantization, and activation calibration.


vitis_quantize.VitisQuantizer.quantize_model(
    calib_dataset=None,
    calib_batch_size=None,
    calib_steps=None,
    verbose=0,
    fold_conv_bn=True,
    fold_bn=True,
    replace_sigmoid=True,
    replace_relu6=True,
    include_cle=True,
    cle_steps=10,
    forced_cle=False,
    include_fast_ft=False,
    fast_ft_epochs=10,
    add_shape_info=False)
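
A minimal usage sketch follows. The import path and the VitisQuantizer(float_model) constructor are not shown in this section and are assumed here from the usual Vitis AI TensorFlow 2 flow. The helper name quantize_float_model and its quantizer_cls parameter are hypothetical, introduced only so the call can be exercised without Vitis AI installed.

```python
def quantize_float_model(float_model, calib_dataset, quantizer_cls=None, **options):
    """Run post-training quantization on a float Keras model (sketch).

    quantizer_cls is a hypothetical hook for substituting a stand-in class;
    leave it as None to use the real VitisQuantizer.
    """
    if quantizer_cls is None:
        # Assumed import path; requires the Vitis AI TensorFlow 2 quantizer.
        from tensorflow_model_optimization.quantization.keras import vitis_quantize
        quantizer_cls = vitis_quantize.VitisQuantizer

    quantizer = quantizer_cls(float_model)
    # Extra keyword options (for example calib_steps=100) pass through unchanged.
    return quantizer.quantize_model(calib_dataset=calib_dataset, **options)
```

With Vitis AI installed, a call such as quantize_float_model(float_model, calib_dataset, calib_steps=100) would return the quantized model, leaving every other argument at its default.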

Arguments

calib_dataset: A tf.data.Dataset, keras.utils.Sequence, or numpy.array object, the representative dataset for calibration. You can use all or part of eval_dataset, train_dataset, or another dataset as calib_dataset.
calib_steps: An int object, the total number of steps for calibration. Ignored with the default value of None. If "calib_dataset" is a tf.data dataset, generator, or keras.utils.Sequence instance and "calib_steps" is None, calibration runs until the dataset is exhausted. This argument is not supported with array inputs.
calib_batch_size: An int object, the number of samples per batch for calibration. If "calib_dataset" is a dataset, generator, or keras.utils.Sequence instance, the batch size is controlled by the dataset itself. If "calib_dataset" is a numpy.array object, the default batch size is 32.
fold_conv_bn: A bool object, whether to fold the batch norm layers into the preceding Conv2D/DepthwiseConv2D/TransposeConv2D/Dense layers.
fold_bn: A bool object, whether to convert standalone batch norm layers into DepthwiseConv2D layers.
replace_sigmoid: A bool object, whether to replace the Activation(activation='sigmoid') layers with hard sigmoid layers and quantize them. If not, the sigmoid layers are left unquantized and are scheduled on the CPU.
replace_relu6: A bool object, whether to replace the ReLU6 layers with ReLU layers.
include_cle: A bool object, whether to do Cross-Layer Equalization before quantization.
cle_steps: An int object, the number of iteration steps for Cross-Layer Equalization.
forced_cle: A bool object, whether to do forced Cross-Layer Equalization for ReLU6 layers.
include_fast_ft: A bool object, whether to do fast fine-tuning. Fast fine-tuning adjusts the weights layer by layer using the calibration dataset and may give better accuracy for some models. It is disabled by default because it takes longer than normal PTQ, although it is still much shorter than QAT because calib_dataset is much smaller than the training dataset. Turn it on if the quantized model has accuracy issues.
fast_ft_epochs: An int object, the number of epochs of fast fine-tuning for each layer.
add_shape_info: A bool object, whether to add shape inference information for custom layers. Must be set to True for models with custom layers.
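
The include_fast_ft guidance above suggests a two-stage workflow: run plain PTQ first, and enable fast fine-tuning only if accuracy falls short. The sketch below illustrates that idea under stated assumptions. The names quantize_with_fast_ft_fallback, make_quantizer, evaluate, and min_accuracy are hypothetical and not part of the Vitis AI API; only the quantize_model keyword arguments come from the list above.

```python
def quantize_with_fast_ft_fallback(make_quantizer, calib_dataset, evaluate,
                                   min_accuracy, fast_ft_epochs=10):
    """Try plain PTQ first; rerun with fast fine-tuning if accuracy is too low.

    make_quantizer: hypothetical zero-argument callable returning a fresh
        quantizer object that exposes quantize_model(...).
    evaluate: hypothetical callable mapping a quantized model to an accuracy.
    Returns (quantized_model, accuracy, used_fast_ft).
    """
    # Fast path: default PTQ, with include_fast_ft left at its default (False).
    model = make_quantizer().quantize_model(calib_dataset=calib_dataset)
    accuracy = evaluate(model)
    if accuracy >= min_accuracy:
        return model, accuracy, False

    # Slower path: layer-by-layer weight adjustment on the calibration data.
    model = make_quantizer().quantize_model(
        calib_dataset=calib_dataset,
        include_fast_ft=True,
        fast_ft_epochs=fast_ft_epochs,
    )
    return model, evaluate(model), True
```

A fresh quantizer is created for each attempt so that the second run starts from the original float model rather than from any state left by the first. Whether this is strictly required is not stated in this section; it is a conservative choice.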