Suppose there is a pre-trained float model and Python scripts that evaluate its accuracy/mAP before quantization. In that case, the Quantizer API substitutes the floating-point module with a quantized module, and the existing evaluate function drives the forward pass of the quantized module. By setting the quant_mode flag to calib, you determine the quantization steps of tensors during the evaluation process; this is the calibration pass of post-training quantization. After calibration is complete, evaluate the quantized model by setting quant_mode to test.
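In practice this means the same evaluation script is run twice, once per quant_mode. A hypothetical command-line invocation (the script name and flag spellings are illustrative, not part of the API) might look like:

```shell
# Pass 1: run evaluation in calibration mode to record quantization steps
python quantize_model.py --quant_mode calib

# Pass 2: evaluate the quantized model and export deployment artifacts
python quantize_model.py --quant_mode test --deploy
```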
- Import the vai_q_pytorch module:
  from pytorch_nndct.apis import torch_quantizer, dump_xmodel
- Generate a quantizer with the input needed for quantization, and get the converted model:
  input = torch.randn([batch_size, 3, 224, 224])
  quantizer = torch_quantizer(quant_mode, model, (input))
  quant_model = quantizer.quant_model
- Forward the neural network with the converted model:
  acc1_gen, acc5_gen, loss_gen = evaluate(quant_model, val_loader, loss_fn)
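The evaluate function here is user-supplied, not part of vai_q_pytorch; it only needs to run the model forward over the validation set and report accuracy. A minimal, framework-agnostic sketch of the top-1/top-5 bookkeeping such a function performs (all names are illustrative, and a real implementation would iterate batched tensors):

```python
def topk_correct(scores, label, ks=(1, 5)):
    """For each k in ks, return 1 if the true label is among the k highest scores."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return [1 if label in ranked[:k] else 0 for k in ks]

def evaluate_sketch(samples):
    """samples: iterable of (scores, label) pairs; returns (top1_acc, top5_acc)."""
    top1 = top5 = total = 0
    for scores, label in samples:
        c1, c5 = topk_correct(scores, label)
        top1 += c1
        top5 += c5
        total += 1
    return top1 / total, top5 / total
```

The same function works unchanged on the quantized model, which is the point of the API: only the module passed in differs between the float and quantized runs.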
- Output the quantization result and deploy the model:
  if quant_mode == 'calib':
      quantizer.export_quant_config()
  if deploy:
      quantizer.export_torch_script()
      quantizer.export_onnx_model()
      quantizer.export_xmodel(deploy_check=False)
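Putting the steps together, a quantization script might be shaped as follows. This is a sketch under assumptions, not a verbatim toolkit example: the model, dataloader, loss function, deploy flag, and evaluate function are supplied by the caller, and the imports are deferred so the skeleton can be read without the libraries installed.

```python
def quantize_and_evaluate(model, val_loader, loss_fn, evaluate,
                          quant_mode, deploy=False, batch_size=32):
    """Sketch of the calib/test flow described above (names are illustrative)."""
    # Deferred imports: torch and pytorch_nndct are only needed when this runs.
    import torch
    from pytorch_nndct.apis import torch_quantizer

    # 1. Build a quantizer from a dummy input and get the converted model.
    dummy_input = torch.randn([batch_size, 3, 224, 224])
    quantizer = torch_quantizer(quant_mode, model, (dummy_input))
    quant_model = quantizer.quant_model

    # 2. Forward the converted model over the validation set.
    acc1, acc5, loss = evaluate(quant_model, val_loader, loss_fn)

    # 3. Export calibration results, then deployment artifacts on the test pass.
    if quant_mode == 'calib':
        quantizer.export_quant_config()
    if deploy:
        quantizer.export_torch_script()
        quantizer.export_onnx_model()
        quantizer.export_xmodel(deploy_check=False)
    return acc1, acc5, loss
```

The function is called twice: first with quant_mode='calib' to record quantization steps, then with quant_mode='test' (and deploy=True) to evaluate the quantized model and export the deployable artifacts.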