vai_q_pytorch Usage - 1.3 English

Vitis AI User Guide (UG1414)
Document ID: UG1414
Release Date: 2021-02-03
Version: 1.3 English

This section describes how to use the vai_q_pytorch execution tools and APIs to quantize a model and generate a model that can be deployed on the target hardware. The APIs in the module pytorch_binding/pytorch_nndct/apis/quant_api.py are as follows:

class torch_quantizer(): 
  def __init__(self,
               quant_mode: str, # ['calib', 'test']
               module: torch.nn.Module,
               input_args: Union[torch.Tensor, Sequence[Any]] = None,
               state_dict_file: Optional[str] = None,
               output_dir: str = "quantize_result",
               bitwidth: int = 8,
               device: torch.device = torch.device("cuda"),
               qat_proc: bool = False): 

The torch_quantizer class creates a quantizer object.

Arguments:

quant_mode
A string that indicates which quantization mode the process is using: "calib" for calibration of quantization, and "test" for evaluation of the quantized model.
module
Float module to be quantized.
input_args
Input tensor with the same shape as the real input of the float module to be quantized; the values can be random numbers.
state_dict_file
Pretrained parameters file of the float module. If the float module has already loaded its parameters, this argument does not need to be set.
output_dir
Directory for quantization results and intermediate files. Default is "quantize_result".
bitwidth
Global quantization bit width. Default is 8.
device
Device on which to run the model, GPU or CPU. Default is torch.device("cuda").
qat_proc
Turns on quantization-aware training (QAT), also called quantize finetuning.
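
For illustration, the following minimal sketch shows how a quantizer could be constructed in calibration mode. The float model, input shape, and device are placeholders, and the quantized model is retrieved through the quantizer's quant_model attribute:

import torch
from pytorch_nndct.apis import torch_quantizer

# Placeholder float model; in practice, load your own pretrained module.
float_model = torch.nn.Sequential(
    torch.nn.Conv2d(3, 8, kernel_size=3, padding=1),
    torch.nn.ReLU(),
)

# Dummy input with the same shape as the real input; values may be random.
dummy_input = torch.randn(1, 3, 224, 224)

# Create a quantizer in calibration mode and obtain the quantized model.
quantizer = torch_quantizer(
    quant_mode="calib",
    module=float_model,
    input_args=(dummy_input,),
    output_dir="quantize_result",
    bitwidth=8,
    device=torch.device("cpu"),
)
quant_model = quantizer.quant_model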
def export_quant_config(self):

This function exports quantization step information.
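
Continuing the sketch above, calibration typically consists of running forward passes of the quantized model over representative data before exporting the quantization step information; the single dummy batch below is a placeholder for real calibration data:

# Run forward passes of the quantized model over calibration data.
calib_batches = [dummy_input]  # placeholder; use representative inputs in practice

quant_model.eval()
with torch.no_grad():
    for batch in calib_batches:
        quant_model(batch)

# Export the quantization step information gathered during calibration.
quantizer.export_quant_config()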

def export_xmodel(self, output_dir="quantize_result", deploy_check=False):

This function exports the xmodel and dumps the operators' output data for detailed data comparison.

Arguments:

output_dir
Directory for quantization results and intermediate files. Default is "quantize_result".
deploy_check
Flag that controls dumping of data for detailed data comparison. Default is False. If set to True, binary-format data is dumped to output_dir/deploy_check_data_int/.