vai_q_pytorch QAT Requirements

Vitis AI User Guide (UG1414)

Document ID: UG1414
Release Date: 2022-01-20
Version: 2.0 English

Quantization generally causes only a small accuracy loss, but for some networks, such as MobileNets, the loss can be large. In this situation, first try fast finetune. If fast finetune does not yield satisfactory results, quantization-aware training (QAT) can be used to further improve the accuracy of the quantized model.

The QAT APIs impose the following requirements on the model to be trained.

All operations to be quantized must be instances of torch.nn.Module, rather than Torch functions or Python operators. For example, it is common to use ‘+’ to add two tensors in PyTorch. However, this is not supported in QAT, so replace ‘+’ with pytorch_nndct.nn.modules.functional.Add. Operations that need replacement are listed in the following table.

Table 1. Operation-Replacement Mapping
Operation	Replacement
`+`	`pytorch_nndct.nn.modules.functional.Add`
`-`	`pytorch_nndct.nn.modules.functional.Sub`
`torch.add`	`pytorch_nndct.nn.modules.functional.Add`
`torch.sub`	`pytorch_nndct.nn.modules.functional.Sub`
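
The replacement in Table 1 can be sketched as follows for a residual connection. This is a minimal illustration, not the guide's own example: the ResidualBlock class, its layer sizes, and the attribute name skip_add are hypothetical, and the fallback Add class is a stand-in used only so the sketch runs when pytorch_nndct is not installed.

```python
import torch
from torch import nn

try:
    # Module named in Table 1; requires a Vitis AI installation.
    from pytorch_nndct.nn.modules.functional import Add
except ImportError:
    # Hypothetical stand-in so this sketch runs without Vitis AI installed.
    class Add(nn.Module):
        def forward(self, x, y):
            return torch.add(x, y)


class ResidualBlock(nn.Module):
    """Illustrative block: conv -> ReLU, followed by a skip connection."""

    def __init__(self, channels: int):
        super().__init__()
        self.conv = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.relu = nn.ReLU()
        # The addition is held as a module instead of using '+' or torch.add.
        self.skip_add = Add()

    def forward(self, x):
        out = self.relu(self.conv(x))
        # Not supported for QAT: return out + x
        return self.skip_add(out, x)


block = ResidualBlock(channels=4)
x = torch.randn(1, 4, 8, 8)
y = block(x)
```

The same pattern would apply to subtraction, with pytorch_nndct.nn.modules.functional.Sub in place of Add.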

Important: A module to be quantized cannot be called multiple times in the forward pass.
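
One way to satisfy this restriction is to give every call site its own module instance. The sketch below contrasts a model that reuses one ReLU with a model that uses two, and counts calls with standard PyTorch forward hooks. The class names, layer sizes, and the count_calls helper are hypothetical and are not part of the vai_q_pytorch API.

```python
import torch
from torch import nn


class ReusedReLU(nn.Module):
    """Violates the restriction: self.relu is called twice in forward."""

    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(8, 8)
        self.fc2 = nn.Linear(8, 8)
        self.relu = nn.ReLU()

    def forward(self, x):
        x = self.relu(self.fc1(x))
        return self.relu(self.fc2(x))


class SeparateReLU(nn.Module):
    """Each activation has its own instance, so every module runs once."""

    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(8, 8)
        self.fc2 = nn.Linear(8, 8)
        self.relu1 = nn.ReLU()
        self.relu2 = nn.ReLU()

    def forward(self, x):
        x = self.relu1(self.fc1(x))
        return self.relu2(self.fc2(x))


def count_calls(model, example):
    """Hypothetical helper: run one forward pass and count calls per submodule."""
    counts = {}
    handles = []
    for name, module in model.named_modules():
        if name == "":
            continue  # skip the top-level model itself

        def hook(mod, inputs, output, name=name):
            counts[name] = counts.get(name, 0) + 1

        handles.append(module.register_forward_hook(hook))
    model(example)
    for handle in handles:
        handle.remove()
    return counts


example = torch.randn(2, 8)
reused_counts = count_calls(ReusedReLU(), example)
separate_counts = count_calls(SeparateReLU(), example)
```

A check like this can be run on a model before QAT to locate submodules that appear more than once in the forward pass.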

Use pytorch_nndct.nn.QuantStub and pytorch_nndct.nn.DeQuantStub at the beginning and end of the network to be quantized. The network can be the complete network or a partial sub-network.
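
The stub placement can be sketched as below for a complete network. The SmallNet class, its layers, and the attribute names are hypothetical. The fallback stub classes are identity stand-ins used only so the sketch runs when pytorch_nndct is not installed, and they do not reproduce the behavior of the real stubs during QAT.

```python
import torch
from torch import nn

try:
    # Stubs named in the text above; require a Vitis AI installation.
    from pytorch_nndct.nn import QuantStub, DeQuantStub
except ImportError:
    # Hypothetical identity stand-ins so this sketch runs without Vitis AI.
    class QuantStub(nn.Module):
        def forward(self, x):
            return x

    class DeQuantStub(nn.Module):
        def forward(self, x):
            return x


class SmallNet(nn.Module):
    """Illustrative network with the quantized region marked by stubs."""

    def __init__(self):
        super().__init__()
        self.quant_stub = QuantStub()      # start of the region to quantize
        self.conv = nn.Conv2d(3, 8, kernel_size=3, padding=1)
        self.relu = nn.ReLU()
        self.dequant_stub = DeQuantStub()  # end of the region to quantize

    def forward(self, x):
        x = self.quant_stub(x)
        x = self.relu(self.conv(x))
        return self.dequant_stub(x)


model = SmallNet()
out = model(torch.randn(1, 3, 16, 16))
```

To quantize only a sub-network, the two stub calls would instead wrap just that portion of the forward pass, leaving the layers outside the stubs untouched.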