vai_q_tensorflow2 Quantize Finetuning - 1.3 English

Vitis AI User Guide (UG1414)

Document ID: UG1414
Release Date: 2021-02-03
Version: 1.3 English

Generally, there is a small accuracy loss after quantization, but for some networks, such as MobileNets, the accuracy loss can be large. In this situation, quantize finetuning can be used to further improve the accuracy of the quantized model.

Technically, quantize finetuning is similar to float model finetuning. The difference is that quantize finetuning uses the vai_q_tensorflow2 APIs to rewrite the float graph into a quantized model before training starts. The typical workflow is as follows:

  1. Preparation.

    Before finetuning, please prepare the following files:

    Table 1. Input Files for vai_q_tensorflow2 Quantize Finetuning
    No. | Name             | Description
    1   | Float model file | Floating-point model file to start from. Can be omitted if training from scratch.
    2   | Dataset          | The training dataset with labels.
    3   | Training scripts | The Python scripts that run float training/finetuning of the model.
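
    The dataset in the table is an ordinary labeled training set. As a minimal sketch (the shapes, class count, and random data here are purely illustrative), such a dataset could be built with the tf.data API:

    ```python
    import numpy as np
    import tensorflow as tf

    # Hypothetical example: 100 RGB images (32x32) with integer class labels.
    images = np.random.rand(100, 32, 32, 3).astype("float32")
    labels = np.random.randint(0, 10, size=(100,))

    # Build a labeled, shuffled, batched training dataset.
    train_dataset = (
        tf.data.Dataset.from_tensor_slices((images, labels))
        .shuffle(buffer_size=100)
        .batch(32)
    )
    ```

    In a real training script, the images and labels would come from files or a dataset loader rather than random arrays, but the resulting object can be passed directly to model.fit() as shown in the later steps.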
  2. (Optional) Evaluate the Float Model

    It is suggested to evaluate the float model before doing quantize finetuning. This checks the correctness of the scripts and the dataset, and the accuracy and loss values of the float checkpoint also serve as a baseline for the quantize finetuning.
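    A minimal sketch of this baseline evaluation follows. The tiny inline model and random evaluation data are stand-ins so the example is self-contained; in the real flow, the model would be loaded with tf.keras.models.load_model("float_model.h5") and evaluated on the actual labeled evaluation set:

    ```python
    import numpy as np
    import tensorflow as tf

    # Stand-in for: model = tf.keras.models.load_model("float_model.h5")
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(8, 8)),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(10, activation="softmax"),
    ])
    model.compile(
        loss=tf.keras.losses.SparseCategoricalCrossentropy(),
        metrics=[tf.keras.metrics.SparseCategoricalAccuracy()],
    )

    # Hypothetical labeled evaluation dataset.
    eval_images = np.random.rand(16, 8, 8).astype("float32")
    eval_labels = np.random.randint(0, 10, size=(16,))
    eval_dataset = tf.data.Dataset.from_tensor_slices(
        (eval_images, eval_labels)).batch(8)

    # Record the loss and accuracy as the baseline for quantize finetuning.
    baseline_loss, baseline_acc = model.evaluate(eval_dataset)
    ```

    The baseline_loss and baseline_acc values recorded here are what the quantized model's evaluation results in step 5 are compared against.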

  3. Modify the Training Scripts

    Use the vai_q_tensorflow2 API, VitisQuantizer.get_qat_model, to do the quantization. The following is an example:

    import tensorflow as tf

    model = tf.keras.models.load_model('float_model.h5')

    # Call the vai_q_tensorflow2 API to create the quantize training model
    from tensorflow_model_optimization.quantization.keras import vitis_quantize
    quantizer = vitis_quantize.VitisQuantizer(model)
    model = quantizer.get_qat_model()

    # Compile the model
    model.compile(
        optimizer=tf.keras.optimizers.RMSprop(learning_rate=lr_schedule),
        loss=tf.keras.losses.SparseCategoricalCrossentropy(),
        metrics=[tf.keras.metrics.SparseTopKCategoricalAccuracy()])

    # Start the training/finetuning
    model.fit(train_dataset)
    
  4. Save the Model

    Call model.save() to save the trained model or use callbacks in model.fit() to save the model periodically. For example:

    # Save the model manually
    model.save('trained_model.h5')

    # Save the model periodically during fit using callbacks
    model.fit(
        train_dataset,
        callbacks=[
            tf.keras.callbacks.ModelCheckpoint(
                filepath='./quantize_train/',
                save_best_only=True,
                monitor='sparse_top_k_categorical_accuracy',
                verbose=1,
            )])
    
  5. (Optional) Evaluate the Quantized Model

    Call model.evaluate() on the eval_dataset to evaluate the quantized model, just as for the float model.

    Note: Quantize finetuning works like float finetuning, so prior experience with float model training and finetuning is very helpful, for example, in choosing hyper-parameters such as the optimizer type and learning rate.
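
    As one example of such a hyper-parameter, the lr_schedule passed to RMSprop in the step 3 code can be any Keras learning-rate schedule. A minimal sketch using ExponentialDecay (the specific values here are illustrative, not recommendations):

    ```python
    import tensorflow as tf

    # Illustrative hyper-parameter values; tune them for your own model.
    lr_schedule = tf.keras.optimizers.schedules.ExponentialDecay(
        initial_learning_rate=1e-3,
        decay_steps=1000,
        decay_rate=0.96,
    )
    optimizer = tf.keras.optimizers.RMSprop(learning_rate=lr_schedule)

    # The schedule can be queried at a given training step.
    lr_at_start = float(lr_schedule(0))
    ```

    Passing the schedule object (rather than a fixed float) lets the learning rate decay automatically as training progresses, which is a common choice when finetuning from a pretrained checkpoint.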