TensorFlow Version (vai_q_tensorflow) - 1.1 English

Vitis AI User Guide (UG1414)

Use the following steps to run vai_q_tensorflow.
  1. Prepare the Float Model: Before running vai_q_tensorflow, prepare the frozen inference TensorFlow model in floating-point format and a calibration set, including the files listed in the following table.
    Table 1. Input Files for vai_q_tensorflow
    No. Name Description
    1 frozen_graph.pb Floating-point frozen inference graph. Ensure that the graph is the inference graph rather than the training graph.
    2 calibration dataset A subset of the training dataset containing 100 to 1000 images.
    3 input_fn An input function to convert the calibration dataset to the input data of the frozen_graph during quantize calibration. Usually performs data preprocessing and augmentation.

    For more information, see Getting the Frozen Inference Graph, Getting the Calibration Dataset and Input Function, and Custom Input Function.
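    As a rough sketch, an input_fn is a Python function that takes the calibration iteration index and returns a dict mapping input node names to numpy arrays. The node name "input", the 224x224x3 shape, and the preprocessing below are illustrative assumptions; real code would load and preprocess images from the calibration dataset.

    ```python
    import numpy as np

    # Hypothetical batch size; the quantizer calls this function once per
    # calibration iteration.
    CALIB_BATCH_SIZE = 32

    def calib_input(iter):
        """Return a dict mapping each input node name to a numpy array
        shaped like --input_shapes. Random data is used only to keep the
        sketch self-contained; real code would read calibration images."""
        images = np.random.rand(CALIB_BATCH_SIZE, 224, 224, 3).astype(np.float32)
        # Typical preprocessing: scale pixels into the range the model
        # was trained on (here, [-1, 1]).
        images = (images - 0.5) * 2.0
        return {"input": images}
    ```

    The function is referenced on the command line in module.function form, for example --input_fn calib_input.calib_input if it is saved in calib_input.py.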

  2. Run vai_q_tensorflow: Use the following command to quantize the model:
    $vai_q_tensorflow quantize \
        --input_frozen_graph frozen_graph.pb \
        --input_nodes ${input_nodes} \
        --input_shapes ${input_shapes} \
        --output_nodes ${output_nodes} \
        --input_fn input_fn \
        --output_dir ${output_dir}

    For more information, see Setting the --input_nodes and --output_nodes and Setting the Options.

  3. After successful execution of the above command, two files are generated in ${output_dir}:
    • quantize_eval_model.pb is used for evaluation on the CPU/GPU and can be used to simulate the results on hardware. You need to run import tensorflow.contrib.decent_q explicitly to register the custom quantize operation, because tensorflow.contrib is lazily loaded.
    • deploy_model.pb is the quantized model used as input to the Vitis AI compiler, which compiles it for deployment on the DPU.
    Table 2. vai_q_tensorflow Output Files
    No. Name Description
    1 deploy_model.pb Quantized model for the Vitis AI compiler (extended TensorFlow format)
    2 quantize_eval_model.pb Quantized model for evaluation
  4. After deployment of the quantized model, sometimes it is necessary to compare the simulation results on the CPU/GPU and the output values on the DPU. vai_q_tensorflow supports dumping the simulation results with the quantize_eval_model.pb generated in step 3.

    Run the following commands to dump the quantize simulation results:

    $vai_q_tensorflow dump \
        --input_frozen_graph quantize_results/quantize_eval_model.pb \
        --input_fn dump_input_fn \
        --max_dump_batches 1 \
        --dump_float 0 \
        --output_dir quantize_results

    The input_fn for dumping is similar to the input_fn for quantize calibration, but the batch size is often set to 1 to be consistent with the DPU results.
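    A minimal dump input_fn might look like the sketch below. The input node name "input" and the shape are the same illustrative assumptions as for calibration; the key difference is the batch size of 1.

    ```python
    import numpy as np

    def dump_input_fn(iter):
        """Like the calibration input_fn, but with batch size 1 so the
        dumped tensors align one-to-one with DPU results. Random data
        keeps the sketch self-contained; real code would feed the same
        image that was given to the DPU."""
        image = np.random.rand(1, 224, 224, 3).astype(np.float32)
        image = (image - 0.5) * 2.0
        return {"input": image}
    ```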

    After successful execution of the above command, dump results are generated in ${output_dir}. ${output_dir} contains one folder per batch of input data, and within each folder the results for each node are saved separately. For each quantized node, results are saved in *_int8.bin and *_int8.txt format. If dump_float is set to 1, the results for unquantized nodes are also dumped. In file names, the / symbol in node names is replaced by _ for simplicity. Examples of dump results are shown in the following table.

    Table 3. Examples for Dump Results
    Batch No. Quant Node Name Saved files
    1 Yes resnet_v1_50/conv1/biases/wquant {output_dir}/dump_results_1/resnet_v1_50_conv1_biases_wquant_int8.bin
    2 No resnet_v1_50/conv1/biases {output_dir}/dump_results_2/resnet_v1_50_conv1_biases.bin
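    For comparison against DPU output, the dumped *_int8.bin files can be read back with numpy; each file is a flat array of raw int8 values, one per tensor element. The helper names below are hypothetical, not part of the vai_q_tensorflow tool.

    ```python
    import numpy as np

    def load_int8_dump(path):
        # A *_int8.bin dump is a flat array of raw int8 values; the tensor
        # shape, if needed, must be recovered from the model graph.
        return np.fromfile(path, dtype=np.int8)

    def max_abs_diff(cpu_dump_path, dpu_values):
        """Largest per-element difference between a CPU/GPU simulation
        dump and the corresponding DPU output (an int8 numpy array).
        Values are widened to int32 to avoid int8 overflow."""
        sim = load_int8_dump(cpu_dump_path).astype(np.int32)
        return int(np.max(np.abs(sim - dpu_values.astype(np.int32))))
    ```

    A result of 0 means the simulation and the DPU agree exactly on that tensor.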