Generally there is a small accuracy loss after quantization, but for some networks, such as MobileNets, the loss can be large. In such cases, quantize finetuning can be used to further improve the accuracy of the quantized model.
APIs
The Python package tf.contrib.decent_q provides three APIs for quantize finetuning.
tf.contrib.decent_q.CreateQuantizeTrainingGraph(config)
Converts the float training graph to a quantize training graph by rewriting the default graph in place.
Arguments:
- config: A tf.contrib.decent_q.QuantizeConfig object containing the configurations for quantization.
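A minimal sketch of how this call might fit into a training script. The QuantizeConfig parameter names used here (input_nodes, output_nodes, input_shapes) and the node names are assumptions for illustration and should be checked against the installed version of the package:

```python
import tensorflow as tf
from tensorflow.contrib import decent_q

# Build the float training graph first (model, loss, optimizer), then
# rewrite the default graph in place for quantize finetuning.
config = decent_q.QuantizeConfig(
    input_nodes=["input"],            # assumed input node name
    output_nodes=["logits"],          # assumed output node name
    input_shapes=[[-1, 224, 224, 3]]) # assumed input shape
decent_q.CreateQuantizeTrainingGraph(config=config)

# Continue the normal training loop; the rewritten default graph now
# carries the quantization nodes, so finetuning adapts the weights
# to the quantized forward pass.
```

Because the rewrite is in place, no new graph object is returned; subsequent ops and sessions built on the default graph see the quantized version.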
tf.contrib.decent_q.CreateQuantizeEvaluationGraph(config)
Converts the float evaluation graph to a quantize evaluation graph by rewriting the default graph in place.
Arguments:
- config: A tf.contrib.decent_q.QuantizeConfig object containing the configurations for quantization.
tf.contrib.decent_q.CreateQuantizeDeployGraph(checkpoint, config)
Freezes the checkpoint into the quantize evaluation graph and converts the quantize evaluation graph to a deploy graph.
Arguments:
- checkpoint: A string, the path to a checkpoint folder or file.
- config: A tf.contrib.decent_q.QuantizeConfig object containing the configurations for quantization.
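The evaluation and deploy steps can be sketched under the same assumptions (node names, shapes, and the checkpoint path below are placeholders, not values prescribed by the package):

```python
import tensorflow as tf
from tensorflow.contrib import decent_q

config = decent_q.QuantizeConfig(
    input_nodes=["input"],            # assumed input node name
    output_nodes=["logits"],          # assumed output node name
    input_shapes=[[-1, 224, 224, 3]]) # assumed input shape

# Build the float evaluation graph (same model, inference mode), then
# rewrite the default graph in place to its quantize evaluation form.
decent_q.CreateQuantizeEvaluationGraph(config=config)

# Once finetuning has produced a satisfactory checkpoint, freeze it into
# the quantize evaluation graph and convert the result to a deploy graph.
decent_q.CreateQuantizeDeployGraph(
    checkpoint="path/to/checkpoint_dir",  # a checkpoint folder or file
    config=config)
```

Using one QuantizeConfig for all three calls keeps the training, evaluation, and deploy graphs consistent, since they must agree on the input and output nodes being quantized.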