The following code shows how to perform post-training quantization with the vai_q_tensorflow2 API. You can find a full example here.
import tensorflow as tf
from tensorflow_model_optimization.quantization.keras import vitis_quantize

# calib_dataset is a representative calibration dataset prepared beforehand.
float_model = tf.keras.models.load_model('float_model.h5')
quantizer = vitis_quantize.VitisQuantizer(float_model)
quantized_model = quantizer.quantize_model(calib_dataset=calib_dataset, calib_steps=100, calib_batch_size=10)
- calib_dataset: A representative dataset used for calibration. You can use the full eval_dataset or train_dataset, part of them, or other datasets (a short sketch of preparing one follows this list).
- calib_steps: The total number of steps for calibration. It has a default value of None. If "calib_dataset" is a tf.data dataset, generator, or keras.utils.Sequence instance and calib_steps is None, calibration runs until the dataset is exhausted. This argument is not supported with array inputs.
- calib_batch_size: The number of samples per batch for calibration. If "calib_dataset" is a dataset, generator, or keras.utils.Sequence instance, the batch size is controlled by the dataset itself. If "calib_dataset" is a numpy.array object, the default batch size is 32.
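As a minimal, hypothetical sketch of how a calibration dataset can be prepared and passed to quantize_model, the snippet below uses random placeholder images; in practice you would substitute 100 to 1000 preprocessed samples from your eval_dataset or train_dataset with the model's input shape. The input shape, sample count, and file names here are illustrative assumptions, not part of the API.

import numpy as np
import tensorflow as tf
from tensorflow_model_optimization.quantization.keras import vitis_quantize

# Hypothetical placeholder calibration data; replace with real, preprocessed
# samples whose shape matches the model input.
calib_images = np.random.rand(1000, 224, 224, 3).astype(np.float32)

float_model = tf.keras.models.load_model('float_model.h5')
quantizer = vitis_quantize.VitisQuantizer(float_model)

# Option 1: pass a numpy array. calib_steps is not supported with array
# inputs; calib_batch_size sets the batch size (default 32 when omitted).
quantized_model = quantizer.quantize_model(calib_dataset=calib_images, calib_batch_size=10)

# Option 2 (alternative): pass a tf.data dataset. The batch size then comes
# from the dataset itself, and calib_steps limits how many batches are used.
# calib_ds = tf.data.Dataset.from_tensor_slices(calib_images).batch(10)
# quantized_model = quantizer.quantize_model(calib_dataset=calib_ds, calib_steps=100)

# The quantized model is a Keras model, so it can typically be saved with the
# standard Keras API for later evaluation or compilation.
quantized_model.save('quantized_model.h5')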