Technically, quantize finetuning is similar to float model finetuning. The difference is that quantize finetuning uses the vai_q_tensorflow2 API to rewrite the float graph into a quantized model before training starts. The typical workflow is as follows:
- Preparation.
Before finetuning, please prepare the following files:
Table 1. Input Files for vai_q_tensorflow2 Quantize Finetuning

| No. | Name | Description |
| --- | --- | --- |
| 1 | Float model file | Floating-point model files to start from. Can be omitted if training from scratch. |
| 2 | Dataset | The training dataset with labels. |
| 3 | Training Scripts | The Python scripts to run float training/finetuning of the model. |

- (Optional) Evaluate the Float Model
It is suggested to evaluate the float model before quantize finetuning to check the correctness of the scripts and dataset. The accuracy and loss values of the float checkpoint can also serve as a baseline for the quantize finetuning.
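A minimal sketch of this baseline evaluation is shown below. The toy model and random data here are illustrative stand-ins for the real float model file and labeled dataset from Table 1; in practice, load your checkpoint with `tf.keras.models.load_model` and use your real eval dataset.

```python
import numpy as np
import tensorflow as tf

# Hypothetical stand-in for the real float model (normally loaded from
# 'float_model.h5'); a tiny classifier is used so the sketch is runnable.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(10, activation="softmax", input_shape=(8,)),
])
model.compile(
    optimizer="rmsprop",
    loss=tf.keras.losses.SparseCategoricalCrossentropy(),
    metrics=[tf.keras.metrics.SparseCategoricalAccuracy()],
)

# Hypothetical stand-in for the labeled eval dataset from Table 1.
x = np.random.rand(32, 8).astype("float32")
y = np.random.randint(0, 10, size=(32,))
eval_dataset = tf.data.Dataset.from_tensor_slices((x, y)).batch(16)

# Record the float baseline loss/accuracy to compare against the
# quantized model later.
baseline = model.evaluate(eval_dataset, return_dict=True)
print(baseline)
```

The returned dictionary (loss plus each compiled metric) is the baseline to compare against after quantize finetuning.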
- Modify the Training Scripts
Use the vai_q_tensorflow2 API, `VitisQuantizer.get_qat_model`, to do the quantization. The following is an example:

```python
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.optimizers import RMSprop
from tensorflow_model_optimization.quantization.keras import vitis_quantize

model = tf.keras.models.load_model('float_model.h5')

# Call the vai_q_tensorflow2 API to create the quantize training model
quantizer = vitis_quantize.VitisQuantizer(model)
model = quantizer.get_qat_model()

# Compile the model
model.compile(
    optimizer=RMSprop(learning_rate=lr_schedule),
    loss=tf.keras.losses.SparseCategoricalCrossentropy(),
    metrics=[keras.metrics.SparseTopKCategoricalAccuracy()])

# Start the training/finetuning
model.fit(train_dataset)
```
- Save the Model
Call `model.save()` to save the trained model, or use callbacks in `model.fit()` to save the model periodically. For example:

```python
# Save the model manually
model.save('trained_model.h5')

# Save the model periodically during fit using callbacks
model.fit(
    train_dataset,
    callbacks=[
        keras.callbacks.ModelCheckpoint(
            filepath='./quantize_train/',
            save_best_only=True,
            monitor="sparse_categorical_accuracy",
            verbose=1,
        )])
```
- (Optional) Evaluate the Quantized Model
Call `model.evaluate()` on the eval_dataset to evaluate the quantized model, just like evaluating the float model.

Note: Quantize finetuning works like float finetuning, so experience with float model training and finetuning is very helpful, for example in choosing hyper-parameters such as the optimizer type and learning rate.
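As one hyper-parameter example, the `lr_schedule` referenced in the `compile()` call above could be a standard Keras decay schedule. The initial rate and decay settings below are illustrative assumptions, not recommended values for any particular model:

```python
import tensorflow as tf

# Sketch of a decaying learning-rate schedule like the `lr_schedule`
# used when compiling the QAT model; all numbers are placeholders.
lr_schedule = tf.keras.optimizers.schedules.ExponentialDecay(
    initial_learning_rate=1e-3,
    decay_steps=1000,
    decay_rate=0.9,
)
optimizer = tf.keras.optimizers.RMSprop(learning_rate=lr_schedule)

# The schedule starts at the initial rate and decays smoothly;
# after 1000 steps the rate has been multiplied by 0.9 once.
print(float(lr_schedule(0)))
print(float(lr_schedule(1000)))
```

For quantize finetuning, a smaller initial learning rate than the original float training is often a reasonable starting point, since the model starts from an already-trained checkpoint.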