Iterative Pruning

Vitis AI Optimizer User Guide (UG1333)
Document ID: UG1333 · Release Date: 2023-01-12 · Version: 3.0 English

The pruner is designed to reduce the number of model parameters while minimizing the accuracy loss. This is done iteratively as shown in the following figure. Pruning results in accuracy loss while retraining recovers accuracy. Pruning, followed by retraining, forms one iteration. In the first iteration of pruning, the input model is the baseline model, and it is pruned and fine-tuned. In subsequent iterations, the fine-tuned model obtained from the previous iteration becomes the new baseline. This process is usually repeated several times until a desired sparse model is obtained. The iterative approach is required because a model cannot be pruned in a single pass while maintaining accuracy. When too many parameters are removed in one iteration, the accuracy loss may become too steep and recovery may not be possible.

Important: Parameters are reduced progressively at each iteration so that accuracy can be recovered during the fine-tuning stage.

Iterative pruning makes it possible to reach higher pruning rates without significant loss of model performance.
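
The following is a minimal sketch of the prune-then-fine-tune loop described above. It uses torch.nn.utils.prune purely for illustration; it is not the Vitis AI Optimizer API. The helpers `train_one_epoch` and `evaluate` are hypothetical stand-ins for your own training and validation code.

```python
import torch
import torch.nn.utils.prune as prune

def iterative_prune(model, target_sparsity=0.5, num_iterations=5, epochs_per_iter=2):
    # Spread the target sparsity over several iterations so each pruning step
    # removes only a small fraction of the remaining weights and fine-tuning
    # can recover the accuracy lost at that step.
    per_step = 1.0 - (1.0 - target_sparsity) ** (1.0 / num_iterations)
    for it in range(num_iterations):
        # Pruning: remove the smallest-magnitude weights in every Conv2d layer.
        # PyTorch applies the new amount to the weights that are still unpruned,
        # so repeated calls compound toward the overall target sparsity.
        for module in model.modules():
            if isinstance(module, torch.nn.Conv2d):
                prune.l1_unstructured(module, name="weight", amount=per_step)
        # Fine-tuning: retrain the pruned model to recover accuracy.
        for _ in range(epochs_per_iter):
            train_one_epoch(model)                                # hypothetical helper
        print(f"iteration {it}: accuracy = {evaluate(model):.4f}")  # hypothetical helper
    return model
```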

Figure 1. Iterative Pruning

The four primary stages in iterative pruning are as follows:

Analysis
Perform a sensitivity analysis on the model to determine the optimal pruning strategy; a sketch of this stage appears after the list.
Pruning
Reduce the number of computations in the input model.
Fine-tuning
Retrain the pruned model to recover accuracy.
Transformation
Generate a dense model with reduced weights.
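
The sketch below illustrates the analysis stage only. It assumes a `model` and a hypothetical `evaluate(model)` function that returns validation accuracy, and it uses structured channel pruning from torch.nn.utils.prune for illustration; the Vitis AI Optimizer performs this analysis internally with its own mechanisms.

```python
import copy
import torch
import torch.nn.utils.prune as prune

def sensitivity_analysis(model, ratios=(0.1, 0.3, 0.5, 0.7)):
    """Prune each Conv2d layer in isolation at several ratios and record the
    accuracy drop, revealing which layers tolerate pruning and which do not."""
    baseline = evaluate(model)                      # hypothetical validation helper
    sensitivity = {}
    for name, module in model.named_modules():
        if not isinstance(module, torch.nn.Conv2d):
            continue
        sensitivity[name] = {}
        for ratio in ratios:
            trial = copy.deepcopy(model)            # prune a copy, keep the original intact
            trial_module = dict(trial.named_modules())[name]
            # Remove whole output channels by L2 norm (structured pruning).
            prune.ln_structured(trial_module, name="weight", amount=ratio, n=2, dim=0)
            sensitivity[name][ratio] = baseline - evaluate(trial)
    return sensitivity
```

Layers whose accuracy drops sharply even at small ratios should be pruned conservatively, while insensitive layers can absorb higher ratios; a per-layer strategy built this way is what the subsequent pruning stage applies.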