The PyTorch pruning tool, vai_p_pytorch, is a Python package rather than an executable program. It provides three methods of model pruning: iterative pruning, one-step pruning, and once-for-all (OFA).
Iterative and one-step pruning are suitable for networks built from common convolution layers, but they do not work well on networks based on depthwise convolutions, such as MobileNet-v2. Most convolutional neural networks (CNNs) contain BatchNormalization layers, and one-step pruning is preferred for these networks because it is faster and gives better results. If the network has no BatchNormalization layers, as in VGGNet, use iterative pruning instead.
OFA is applicable to both depthwise and common convolutions. In theory, OFA is the best of the three methods, but good pruning results are not easy to obtain: the outcome depends on how well the supernetwork is trained, which requires a long training time and strong training skills.
To summarize: if the network contains BatchNormalization layers, use one-step pruning; otherwise, use iterative pruning. If you are not satisfied with the pruning results, use OFA to obtain a better pruned model.
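The selection guidance above can be written down as a small decision function. This is only a sketch of the rule of thumb: `recommend_pruning_method` and its parameters are hypothetical names introduced for illustration and are not part of the vai_p_pytorch API. Routing depthwise-convolution networks directly to OFA is an assumption drawn from the statement that the other two methods do not work well on such networks.

```python
def recommend_pruning_method(has_batchnorm: bool,
                             has_depthwise_conv: bool = False,
                             satisfied_with_result: bool = True) -> str:
    """Return a suggested pruning method name.

    Hypothetical helper, not part of vai_p_pytorch. It only encodes the
    rule of thumb from the text and does not inspect a real model.
    """
    # Assumption: depthwise-based networks (for example MobileNet-v2) go
    # straight to OFA, because iterative and one-step pruning are described
    # as working poorly on them.
    if has_depthwise_conv:
        return "ofa"

    # If an earlier pruning attempt was unsatisfactory, fall back to OFA.
    if not satisfied_with_result:
        return "ofa"

    # Networks with BatchNormalization layers: one-step pruning is preferred.
    if has_batchnorm:
        return "one_step"

    # No BatchNormalization layers (for example VGGNet): iterative pruning.
    return "iterative"


if __name__ == "__main__":
    # A CNN with BatchNormalization layers and common convolutions.
    print(recommend_pruning_method(has_batchnorm=True))
    # A VGGNet-style network without BatchNormalization layers.
    print(recommend_pruning_method(has_batchnorm=False))
    # A MobileNet-v2-style network built on depthwise convolutions.
    print(recommend_pruning_method(has_batchnorm=True, has_depthwise_conv=True))
```

The function takes plain booleans rather than a model object so that the decision logic stays separate from any model inspection, which depends on how the network is defined.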