vai_q_pytorch QAT Requirements - 3.5 English

Vitis AI User Guide (UG1414)

Quantization generally causes only a slight loss of accuracy, but for certain networks, such as MobileNets, the loss can be more significant. In such cases, try fast fine-tuning first. If fast fine-tuning does not produce satisfactory results, Quantization-Aware Training (QAT) can further improve the accuracy of the quantized model.

However, a model must meet specific requirements before it can be trained with the QAT APIs. Every operation to be quantized must be an instance of torch.nn.Module rather than a torch function or a Python operator. For example, using '+' to add two tensors is common in PyTorch but is not supported in QAT; replace it with pytorch_nndct.nn.modules.functional.Add. The operations that require replacement are listed in the following table.

Table 1. Operation Replacement Mapping

Operation    Replacement
+            pytorch_nndct.nn.modules.functional.Add
-            pytorch_nndct.nn.modules.functional.Sub
torch.add    pytorch_nndct.nn.modules.functional.Add
torch.sub    pytorch_nndct.nn.modules.functional.Sub
Important: A module to be quantized cannot be called more than once in the forward pass, because repeated calls would produce conflicting quantization information. An example rewrite is sketched below.
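As an illustration only, the following minimal sketch shows a residual block rewritten for QAT. It assumes the vai_q_pytorch package (pytorch_nndct) is installed; the functional.Add name comes from the table above, while the block itself and its layer names are hypothetical. Each '+' is replaced with its own functional.Add instance, so no quantized module is called twice in the forward pass.

import torch
from pytorch_nndct.nn.modules import functional

class ResidualBlock(torch.nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.conv0 = torch.nn.Conv2d(channels, channels, 3, padding=1)
        self.conv1 = torch.nn.Conv2d(channels, channels, 3, padding=1)
        # One Add instance per addition site; a single shared instance
        # would be called twice in forward(), which QAT does not allow.
        self.skip_add0 = functional.Add()
        self.skip_add1 = functional.Add()

    def forward(self, x):
        y = self.conv0(x)
        # Originally: y = y + x
        y = self.skip_add0(y, x)
        z = self.conv1(y)
        # Originally: z = z + y
        z = self.skip_add1(z, y)
        return z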
Use pytorch_nndct.nn.QuantStub and pytorch_nndct.nn.DeQuantStub at the beginning and end of the network to be quantized. The network can be a complete network or a partial sub-network.
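The following sketch shows where the stubs fit, again assuming the pytorch_nndct environment; the model and its layers are hypothetical. QuantStub marks the entry of the region to be quantized and DeQuantStub marks its exit.

import torch
from pytorch_nndct import nn as nndct_nn

class QuantizedNet(torch.nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        self.quant_stub = nndct_nn.QuantStub()
        self.backbone = torch.nn.Sequential(
            torch.nn.Conv2d(3, 16, 3, padding=1),
            torch.nn.ReLU(inplace=True),
            torch.nn.AdaptiveAvgPool2d(1),
            torch.nn.Flatten(),
            torch.nn.Linear(16, num_classes),
        )
        self.dequant_stub = nndct_nn.DeQuantStub()

    def forward(self, x):
        x = self.quant_stub(x)    # beginning of the quantized sub-network
        x = self.backbone(x)
        x = self.dequant_stub(x)  # end of the quantized sub-network
        return x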