The following code shows how to perform post-training quantization and export the quantized model to ONNX with the vai_q_tensorflow2 API.
import tensorflow as tf
from tensorflow_model_optimization.quantization.keras import vitis_quantize

model = tf.keras.models.load_model('float_model.h5')
quantizer = vitis_quantize.VitisQuantizer(model)
quantized_model = quantizer.quantize_model(calib_dataset=calib_dataset,
                                           output_format='onnx',
                                           onnx_opset_version=11,
                                           output_dir='./quantize_results',
                                           **kwargs)
- output_format
- A string object that indicates the format in which to save the quantized model. Options are: '' to skip saving, 'h5' to save an .h5 file, 'tf' to save a SavedModel, and 'onnx' to save an .onnx file. Defaults to ''.
- onnx_opset_version
- An int object, the ONNX opset version. Takes effect only when output_format is 'onnx'. Defaults to 11.
- output_dir
- A string object that indicates the directory in which to save the quantized model. Defaults to './quantize_results'.
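To make the output_format options above concrete, here is a minimal sketch of the mapping between each option and the kind of artifact the quantizer writes. The helper function and the specific file names are hypothetical, for illustration only; they are not part of the vai_q_tensorflow2 API, and the actual file names produced by the quantizer may differ.

```python
import os

# Hypothetical helper, NOT part of the vai_q_tensorflow2 API: it mirrors
# the output_format options documented above. The file names below are
# assumed for illustration; the quantizer's real output names may differ.
def expected_artifact(output_format, output_dir='./quantize_results'):
    """Return the path the quantizer would write for a given output_format."""
    artifacts = {
        '': None,                                                  # skip saving
        'h5': os.path.join(output_dir, 'quantized_model.h5'),      # Keras .h5 file
        'tf': output_dir,                                          # SavedModel directory
        'onnx': os.path.join(output_dir, 'quantized_model.onnx'),  # ONNX file
    }
    if output_format not in artifacts:
        raise ValueError(f"unsupported output_format: {output_format!r}")
    return artifacts[output_format]
```

Passing an empty string skips saving entirely, which is useful when you only want the in-memory quantized_model for immediate evaluation.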