If a trained float model and some Python scripts to evaluate its accuracy/mAP are
available before quantization, the Quantizer API replaces the float module with a
quantized module. The normal evaluate function can then forward the quantized module.
Quantize calibration determines the quantization steps of tensors during the evaluation
process if the flag quant_mode is set to "calib". After calibration, evaluate the
quantized model by setting quant_mode to "test".
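To make the "calib" pass concrete, the following is an illustrative sketch, in plain Python, of what determining a quantization step means for symmetric power-of-two quantization. The function names (`calibrate_step`, `quantize`) are hypothetical and are not part of the vai_q_pytorch API; the real calibration happens inside the quantizer.

```python
import math

def calibrate_step(values, bit_width=8):
    # Hypothetical helper: pick the smallest power-of-two step whose
    # representable range still covers the largest observed magnitude.
    max_abs = max(abs(v) for v in values)
    q_max = 2 ** (bit_width - 1) - 1          # e.g. 127 for int8
    return 2.0 ** math.ceil(math.log2(max_abs / q_max))

def quantize(v, step, bit_width=8):
    # Round to the nearest step, clamp to the signed integer range,
    # then return the dequantized value.
    q_max = 2 ** (bit_width - 1) - 1
    q = max(-q_max - 1, min(q_max, round(v / step)))
    return q * step

observed = [0.3, -1.7, 0.9, 2.4]     # tensor values seen during calibration
step = calibrate_step(observed)
print(step)            # -> 0.03125 (2**-5 covers |2.4| with int8 range)
print(quantize(2.4, step))   # -> 2.40625
```

Once such a step has been fixed for every tensor during the "calib" pass, the "test" pass evaluates the model using those fixed steps.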
- Import the vai_q_pytorch module.
from pytorch_nndct.apis import torch_quantizer, dump_xmodel
- Generate a quantizer with the input required for quantization and get the converted model.
input = torch.randn([batch_size, 3, 224, 224])
quantizer = torch_quantizer(quant_mode, model, (input))
quant_model = quantizer.quant_model
- Run a forward pass of the neural network with the converted model.
acc1_gen, acc5_gen, loss_gen = evaluate(quant_model, val_loader, loss_fn)
- Output the quantization result and deploy the model.
if quant_mode == 'calib':
    quantizer.export_quant_config()
if deploy:
    quantizer.export_xmodel()