When you use TensorFlow 1.x to train a model, the training process creates a folder containing a GraphDef file (typically with a .pb or .pbtxt extension) and a set of checkpoint files. For mobile or embedded deployment, you need a single GraphDef file that has been frozen, that is, had its variables converted into inline constants, so that everything is contained in one file. To handle the conversion, TensorFlow provides freeze_graph.py, which is installed automatically with the vai_q_tensorflow quantizer.
The following is an example of command-line usage:
[docker] $ freeze_graph \
--input_graph /tmp/inception_v1_inf_graph.pb \
--input_checkpoint /tmp/checkpoints/model.ckpt-1000 \
--input_binary true \
--output_graph /tmp/frozen_graph.pb \
--output_node_names InceptionV1/Predictions/Reshape_1
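Freezing replaces each Variable node in the graph with a Const node that holds the trained value; freeze_graph is essentially a command-line wrapper around tf.graph_util.convert_variables_to_constants. The following self-contained toy example (unrelated to the Inception model above) illustrates the conversion directly in Python:

import tensorflow as tf

# Toy graph: a placeholder feeding a matmul with one trainable variable.
x = tf.placeholder(tf.float32, [None, 4], name="input")
w = tf.Variable(tf.ones([4, 2]), name="weights")
y = tf.matmul(x, w, name="output")

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    # Replace Variable ops with Const ops that hold the current values,
    # keeping only the subgraph needed to compute "output".
    frozen_graph_def = tf.graph_util.convert_variables_to_constants(
        sess, sess.graph_def, ["output"])

# The resulting GraphDef contains no VariableV2 nodes.
print(sorted({node.op for node in frozen_graph_def.node}))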
The --input_graph should be an inference graph rather than the training graph. Because data pre-processing and loss operations are not required for inference and deployment, frozen_graph.pb should include only the essential parts of the model. In particular, the input_fn should perform the data pre-processing operations so that it generates correct input data for post-training quantization. Set is_training=false when using tf.layers.dropout or tf.layers.batch_normalization. For models that use tf.keras, call tf.keras.backend.set_learning_phase(0) before building the graph. Run freeze_graph --help for more options.
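As a minimal sketch of these guidelines, the following TF 1.x snippet builds an inference-only graph and writes the GraphDef that is later passed as --input_graph. The tiny network, tensor names, and file paths are placeholders, not part of the Vitis AI flow; the trained weights are still supplied by the checkpoint when you run freeze_graph.

import tensorflow as tf

def build_inference_model(x, is_training):
    # Tiny stand-in network; replace with your real model definition.
    x = tf.layers.conv2d(x, 8, 3, activation=tf.nn.relu)
    x = tf.layers.batch_normalization(x, training=is_training)
    x = tf.layers.dropout(x, rate=0.5, training=is_training)
    x = tf.layers.flatten(x)
    return tf.layers.dense(x, 10)

tf.reset_default_graph()
# Put tf.keras layers into inference mode before any layers are built.
tf.keras.backend.set_learning_phase(0)

# Input placeholder; the shape and name are only illustrative.
inputs = tf.placeholder(tf.float32, [None, 224, 224, 3], name="input")

# is_training=False keeps dropout and batch normalization in inference mode.
logits = build_inference_model(inputs, is_training=False)
predictions = tf.nn.softmax(logits, name="predictions")

with tf.Session() as sess:
    # Write the (unfrozen) inference GraphDef; this file becomes the
    # --input_graph argument of freeze_graph.
    tf.train.write_graph(sess.graph_def, "/tmp", "inference_graph.pb", as_text=False)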
The input and output node names vary depending on the model, but you can inspect and estimate them with the vai_q_tensorflow quantizer. See the following example code snippet:
[docker] $ vai_q_tensorflow inspect --input_frozen_graph=/tmp/inception_v1_inf_graph.pb
The estimated input and output nodes cannot be used for quantization if the graph has in-graph pre- and post-processing. This is because some operations cannot be quantized and can cause errors when you compile the model with the Vitis AI compiler and deploy it to the DPU.
Another way to get the input and output names of the graph is by visualizing the graph. Both TensorBoard and Netron can do this. See the following example that uses Netron:
[docker] $ pip install netron
[docker] $ netron /tmp/inception_v3_inf_graph.pb
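If you prefer a quick programmatic check over a visualizer, the following sketch (an illustration only, not part of the vai_q_tensorflow tooling) loads the GraphDef and lists likely candidates: Placeholder ops are usually inputs, and nodes that no other node consumes are usually outputs.

import tensorflow as tf

# Load the GraphDef; the path matches the Netron example above.
graph_def = tf.GraphDef()
with tf.gfile.GFile("/tmp/inception_v3_inf_graph.pb", "rb") as f:
    graph_def.ParseFromString(f.read())

# Collect every node name that is consumed as an input somewhere.
consumed = set()
for node in graph_def.node:
    for inp in node.input:
        consumed.add(inp.lstrip("^").split(":")[0])

inputs = [n.name for n in graph_def.node if n.op == "Placeholder"]
outputs = [n.name for n in graph_def.node if n.name not in consumed]
print("possible inputs :", inputs)
print("possible outputs:", outputs)

Treat the printed names as candidates only, and confirm them against the model definition or the output of vai_q_tensorflow inspect.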