Generating an OFA Model

Vitis AI Optimizer User Guide (UG1333)

Document ID
UG1333
Release Date
2022-06-15
Version
2.5 English

Call ofa_model() to get an OFA model. This method finds all the nn.Conv2d / nn.ConvTranspose2d and nn.BatchNorm2d modules, then replaces them with DynamicConv2d / DynamicConvTranspose2d and DynamicBatchNorm2d.

A list of pruning ratios is required to specify the OFA model.

Each convolution layer in the OFA model can use any pruning ratio from this list for its output channels. The maximum and minimum values in the list represent the maximum and minimum compression rates of the model. The other values in the list represent the subnetworks to be optimized. By default, the pruning ratio list is [0.5, 0.75, 1].

For a subnetwork sampled from the OFA model, the number of output channels of a convolution layer is its original number of output channels multiplied by one of the values in the pruning ratio list. For example, for a pruning ratio list of [0.5, 0.75, 1] and a convolution layer nn.Conv2d(16, 32, 5), the number of output channels of this layer in a sampled subnetwork is one of [0.5*32, 0.75*32, 1*32], that is, 16, 24, or 32.
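
The channel arithmetic in the example above can be sketched in plain Python. This is only an illustration: candidate_out_channels is a hypothetical helper, not part of the Vitis AI API, and truncating with int() is an assumption. The library may round channel counts differently.

```python
def candidate_out_channels(original_out_channels, pruning_ratios):
    """Return the possible output channel counts for one convolution layer.

    Hypothetical helper for illustration only. Truncation with int() is an
    assumption; the actual OFA implementation may round differently.
    """
    return [int(original_out_channels * ratio) for ratio in pruning_ratios]


# nn.Conv2d(16, 32, 5) has 32 output channels.
candidates = candidate_out_channels(32, [0.5, 0.75, 1])
print(candidates)  # [16, 24, 32]
```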

Because the first and last layers have a significant impact on network performance, they are commonly excluded from pruning. By default, this method automatically identifies the first and last convolution layers and adds them to the list of excludes. Set auto_add_excludes to False to disable this behavior.

ofa_model = ofa_pruner.ofa_model([0.5, 0.75, 1], excludes=None, auto_add_excludes=True)
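
The effect of excludes and auto_add_excludes described above can be approximated with a small standalone sketch. It assumes the convolution layers are given as an ordered list of names. build_excludes and the layer names are hypothetical and only mirror the documented behavior; they do not show how ofa_model() is implemented internally.

```python
def build_excludes(conv_layer_names, excludes=None, auto_add_excludes=True):
    """Return the list of layers to leave unpruned.

    Hypothetical helper for illustration only. conv_layer_names is assumed
    to list the convolution layers in network order.
    """
    excluded = list(excludes or [])
    if auto_add_excludes and conv_layer_names:
        # Add the first and last convolution layers if not already listed.
        for name in (conv_layer_names[0], conv_layer_names[-1]):
            if name not in excluded:
                excluded.append(name)
    return excluded


# Hypothetical layer names.
layers = ["conv1", "layer1.conv", "layer2.conv", "head.conv"]
default_excludes = build_excludes(layers)
manual_excludes = build_excludes(layers, excludes=["layer1.conv"],
                                 auto_add_excludes=False)
print(default_excludes)  # ['conv1', 'head.conv']
print(manual_excludes)   # ['layer1.conv']
```

With auto_add_excludes set to False, only the layers passed explicitly in excludes are protected from pruning.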