The following are some tips for quantization-aware training (QAT).
- Keras Model

  For Keras models, call `backend.set_learning_phase(1)` before creating the float training graph, and `backend.set_learning_phase(0)` before creating the float evaluation graph. Moreover, `backend.set_learning_phase()` should be called after `backend.clear_session()`. The TensorFlow 1.x QAT APIs are designed for the TensorFlow native training APIs; using the Keras `model.fit()` API with them may leave some nodes unexecuted. It is recommended to use the QAT APIs in the TensorFlow 2 quantization tool together with the Keras APIs.

- Dropout

  Experiments show that QAT works better without dropout ops. This tool does not support fine-tuning with dropout, so dropout ops should be removed or disabled before running QAT. This can be achieved by setting `is_training=False` when using `tf.layers`, or by calling `tf.keras.backend.set_learning_phase(0)` when using `tf.keras.layers`.

- Hyper-parameter

  QAT is similar to floating-point model training or fine-tuning, so the techniques used there are also required in QAT. The optimizer type and the learning-rate schedule are important parameters to tune.