The parameters that are set to zero in the sparse model are removed to produce the final pruned (slim) model. There are two ways to generate the final pruned model.
Using a Pruning API
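import torch
from pytorch_nndct import get_pruning_runner
# model is the baseline model; input_signature is a sample input tensor for it
method = 'iterative'  # or 'one_step'
runner = get_pruning_runner(model, input_signature, method)
slim_model = runner.prune(removal_ratio=0.2, mode='slim')
# Load the weights that were saved after pruning
slim_model.load_state_dict(torch.load('model_pruned.pth'))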
Without Using a Pruning API
This approach is typically used when quantizing a pruned model, because the pruning API cannot always be called at that stage.
import torch
from pytorch_nndct.utils import slim
# Create the baseline model, then load the pruned weights into it
model = create_your_baseline_model()
slim_model = slim.load_state_dict(model, torch.load('model_pruned.pth'))
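To make the idea of removing zeroed parameters concrete, the following is a minimal conceptual sketch. It is not the library's implementation. It uses plain Python lists in place of tensors, and the helper names (`find_kept_channels`, `slim_layer`, `slim_next_layer`) are hypothetical. It assumes a simple two-layer chain in which an all-zero row in the first layer's weight matrix marks a pruned output channel. A real tool would have to track such dependencies across the whole model graph.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def find_kept_channels(weight: Matrix) -> List[int]:
    """Return indices of output channels (rows) that have a nonzero weight."""
    return [i for i, row in enumerate(weight) if any(w != 0.0 for w in row)]


def slim_layer(weight: Matrix, bias: List[float]) -> Tuple[Matrix, List[float], List[int]]:
    """Drop the all-zero output channels of a layer and the matching biases."""
    kept = find_kept_channels(weight)
    return [weight[i] for i in kept], [bias[i] for i in kept], kept


def slim_next_layer(weight: Matrix, kept_inputs: List[int]) -> Matrix:
    """Drop the input columns that were fed by removed channels."""
    return [[row[j] for j in kept_inputs] for row in weight]


# Sparse model: channel 1 of the first layer was zeroed by pruning.
sparse_w1 = [[0.5, -1.0], [0.0, 0.0], [0.2, 0.3]]
sparse_b1 = [0.1, 0.0, -0.2]
sparse_w2 = [[1.0, 2.0, 3.0]]

slim_w1, slim_b1, kept = slim_layer(sparse_w1, sparse_b1)
slim_w2 = slim_next_layer(sparse_w2, kept)

print(kept)     # indices of the surviving channels
print(slim_w1)  # first-layer weights without the zeroed row
print(slim_w2)  # second-layer weights without the matching column
```

The sparse model keeps the original shapes with zeros in place, while the slim model has smaller weight matrices. This is why slim weights cannot be loaded directly into the unmodified baseline model.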