Use partial quantization when not all the sub-modules in a model need to be
quantized. Besides the general vai_q_pytorch APIs, you can use the
QuantStub/DeQuantStub operator pair to achieve this.
The following example demonstrates how to quantize subm0 and subm2 but not
subm1.
import torch
from pytorch_nndct.nn import QuantStub, DeQuantStub

class WholeModule(torch.nn.Module):
    def __init__(self,...):
        super().__init__()  # required before assigning submodules
        self.subm0 = ...
        self.subm1 = ...
        self.subm2 = ...
        # define QuantStub/DeQuantStub submodules
        self.quant = QuantStub()
        self.dequant = DeQuantStub()

    def forward(self, input):
        input = self.quant(input)  # begin of part to be quantized
        output0 = self.subm0(input)
        output0 = self.dequant(output0)  # end of part to be quantized

        output1 = self.subm1(output0)  # subm1 runs without quantization

        output1 = self.quant(output1)  # begin of part to be quantized
        output2 = self.subm2(output1)
        output2 = self.dequant(output2)  # end of part to be quantized
        return output2
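
To make the region boundaries in the example above easier to see, here is a minimal pure-Python sketch that traces which sub-modules run between a quant/dequant pair. It does not use torch or pytorch_nndct and does not reflect how vai_q_pytorch implements its stubs. Every name in it (Region, ToyQuantStub, ToyDeQuantStub, ToySubmodule, ToyWholeModule) is hypothetical and exists only for illustration. The stubs simply flip a flag, and each sub-module records whether the flag was set when it ran.

```python
# Toy stand-in for the QuantStub/DeQuantStub pattern (assumption: illustration
# only; the real stubs in pytorch_nndct mark regions for the quantizer).

class Region:
    """Tracks whether execution is currently inside a quantized region."""
    def __init__(self):
        self.quantized = False
        self.log = []  # (sub-module name, ran inside a quantized region?)

class ToyQuantStub:
    """Hypothetical stub: marks the beginning of a quantized region."""
    def __init__(self, region):
        self.region = region
    def __call__(self, x):
        self.region.quantized = True
        return x

class ToyDeQuantStub:
    """Hypothetical stub: marks the end of a quantized region."""
    def __init__(self, region):
        self.region = region
    def __call__(self, x):
        self.region.quantized = False
        return x

class ToySubmodule:
    """Hypothetical sub-module: records the region state, then adds 1."""
    def __init__(self, name, region):
        self.name = name
        self.region = region
    def __call__(self, x):
        self.region.log.append((self.name, self.region.quantized))
        return x + 1

class ToyWholeModule:
    def __init__(self):
        self.region = Region()
        self.subm0 = ToySubmodule("subm0", self.region)
        self.subm1 = ToySubmodule("subm1", self.region)
        self.subm2 = ToySubmodule("subm2", self.region)
        self.quant = ToyQuantStub(self.region)
        self.dequant = ToyDeQuantStub(self.region)

    def forward(self, x):
        x = self.quant(x)            # begin of part to be quantized
        output0 = self.subm0(x)
        output0 = self.dequant(output0)  # end of part to be quantized
        output1 = self.subm1(output0)    # outside any quantized region
        output1 = self.quant(output1)    # begin of part to be quantized
        output2 = self.subm2(output1)
        output2 = self.dequant(output2)  # end of part to be quantized
        return output2

model = ToyWholeModule()
result = model.forward(0)
print(model.region.log)
# [('subm0', True), ('subm1', False), ('subm2', True)]
```

The trace shows the same pattern as the example: a single quant/dequant pair can be reused to bracket more than one region, and any sub-module called between a dequant and the next quant stays outside the quantized part.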