Guidelines for Better Training Results - 3.5 English

Vitis AI User Guide (UG1414)

The following are some tips to improve training results:

  • If possible, load the pre-trained floating-point weights as initial values to start the quantization aware training. It is possible to train from scratch with random initial values, but it makes training more complex and lengthy.
  • If pre-trained floating-point weights are loaded, use different initial learning rates and learning rate decay strategies for the network parameters and the quantizer parameters. In general, set a small learning rate for the network parameters and a larger one for the quantizer parameters, as in the following example:
    import torch

    model = qat_processor.trainable_model()
    # Separate parameter groups so that each can use its own learning rate.
    param_groups = [{
        'params': model.quantizer_parameters(),
        'lr': 1e-2,
        'name': 'quantizer'
    }, {
        'params': model.non_quantizer_parameters(),
        'lr': 1e-5,
        'name': 'weight'
    }]
    optimizer = torch.optim.Adam(param_groups)
  • When choosing an optimizer, avoid torch.optim.SGD, which can prevent the training from converging. AMD recommends torch.optim.Adam, torch.optim.RMSprop, or their variants.
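
The second tip also calls for different learning rate decrease strategies for the two parameter groups. One possible way to do this in PyTorch is to pass one decay function per group to torch.optim.lr_scheduler.LambdaLR. The following is a minimal sketch, not taken from the guide: the stand-in parameter tensors replace model.quantizer_parameters() and model.non_quantizer_parameters() so that the snippet runs on its own, and the decay factors and epoch count are assumed values chosen only for illustration.

```python
import torch

# Stand-in parameters (assumption). In a real run these would come from
# model.quantizer_parameters() and model.non_quantizer_parameters().
quantizer_params = [torch.nn.Parameter(torch.zeros(1))]
weight_params = [torch.nn.Parameter(torch.zeros(4, 4))]

optimizer = torch.optim.Adam([
    {'params': quantizer_params, 'lr': 1e-2, 'name': 'quantizer'},
    {'params': weight_params, 'lr': 1e-5, 'name': 'weight'},
])

# One decay function per parameter group, in the same order as the groups.
# Both schedules are illustrative assumptions, not recommended values.
scheduler = torch.optim.lr_scheduler.LambdaLR(
    optimizer,
    lr_lambda=[
        lambda epoch: 0.5 ** (epoch // 2),  # quantizer: halve every 2 epochs
        lambda epoch: 0.9 ** epoch,         # weights: multiply by 0.9 each epoch
    ],
)

for epoch in range(4):
    # The forward pass, loss computation, and loss.backward() would go here.
    optimizer.step()
    scheduler.step()

# Learning rate of each group after 4 epochs, keyed by group name.
lrs = {group['name']: group['lr'] for group in optimizer.param_groups}
```

Each function receives the epoch index and returns a multiplier applied to that group's initial learning rate, so the two groups keep their separate starting values and decay independently.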