The following code shows how to perform post-training quantization and export
the quantized model to ONNX with the vai_q_tensorflow2 API:
import tensorflow as tf
from tensorflow_model_optimization.quantization.keras import vitis_quantize

model = tf.keras.models.load_model('float_model.h5')
quantizer = vitis_quantize.VitisQuantizer(model)
quantized_model = quantizer.quantize_model(calib_dataset=calib_dataset,
                                           output_format='onnx',
                                           onnx_opset_version=11,
                                           output_dir='./quantize_results')
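The calib_dataset passed to quantize_model is not defined in the snippet above; it is a small set of representative, unlabeled inputs used to calibrate activation ranges. A minimal sketch of preparing one, assuming a hypothetical image-model input shape of (224, 224, 3):

```python
import numpy as np

# Hypothetical input shape; substitute your model's real input shape.
# A few hundred representative samples (no labels) is typically enough.
x_train = np.random.rand(1000, 224, 224, 3).astype(np.float32)
calib_dataset = x_train[:100]
```

A slice of the training set works well because calibration only needs inputs that cover the typical range of activations, not the full dataset.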
- output_format
- String. Indicates the format in which to save the quantized model. Options are:
  - '' to skip saving
  - 'h5' to save a .h5 file
  - 'tf' to save a saved_model file
  - 'onnx' to save an ONNX file
  The default value is ''.
- onnx_opset_version
- Int. The ONNX opset version. It takes effect only when output_format is 'onnx'. The default value is 11.
- output_dir
- String. Indicates the directory in which to save the quantized model. The default value is './quantize_results'.
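For intuition about what post-training quantization does to the weights and activations, the core operation is mapping float tensors to low-precision integers with a scale derived from the calibration data. A toy sketch of symmetric int8 quantization with max-abs calibration (illustrative only, not the vai_q_tensorflow2 internals):

```python
import numpy as np

def quantize_int8(x, scale):
    """Round float values to the nearest int8 step, clipping to [-128, 127]."""
    return np.clip(np.round(x / scale), -128, 127).astype(np.int8)

def dequantize(q, scale):
    """Map int8 values back to floats for comparison with the originals."""
    return q.astype(np.float32) * scale

# Derive the scale from calibration data (max-abs calibration).
calib = np.array([-1.5, 0.2, 0.9, 1.2], dtype=np.float32)
scale = np.abs(calib).max() / 127.0

q = quantize_int8(calib, scale)
recon = dequantize(q, scale)
```

With max-abs calibration the round-trip error is bounded by half a quantization step, which is why a representative calib_dataset matters: an unrepresentative range inflates the scale and the error.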