VitisQuantizer.dump_model

Use the `dump_model` API of vai_q_tensorflow2 to dump the simulation results with the quantized model.

```python
from tensorflow_model_optimization.quantization.keras import vitis_quantize

quantized_model = keras.models.load_model('./quantized_model.h5')
vitis_quantize.VitisQuantizer.dump_model(model=quantized_model,
                                         dataset=dump_dataset,
                                         output_dir='./dump_results')
```
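The `dump_dataset` argument above is not defined in this excerpt; a minimal sketch of one way to prepare it, assuming preprocessed images are available as a NumPy array (the array name and shape here are hypothetical stand-ins):

```python
import numpy as np

# Hypothetical example: slice one batch of preprocessed images to use as
# dump_dataset. For DPU debugging, the batch size should match the batch
# size used on the target device (assumed to be 1 here).
batch_size = 1  # assumption: target-device batch size
calib_images = np.random.rand(100, 224, 224, 3).astype(np.float32)  # stand-in data
dump_dataset = calib_images[:batch_size]
print(dump_dataset.shape)  # (1, 224, 224, 3)
```

Any input accepted by the model's `predict` path (NumPy arrays, a `tf.data.Dataset`, etc.) can serve here, as long as its batch size matches the target configuration.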
Note: The `batch_size` of the `dump_dataset` should be set to the same batch size used on the target device for DPU debugging. CPU simulation results are recommended for DPU debugging because GPU floating-point computation can be non-deterministic and produce slightly different results.

Dump results are generated in `${dump_output_dir}` after the command executes successfully. Results for the weights and activations of each layer are saved separately in the folder. For each quantized layer, results are saved in `*.bin` and `*.txt` formats. If the output of a layer is not quantized (such as for the softmax layer), the float activation results are saved in `*_float.bin` and `*_float.txt` files. The `/` symbol in layer names is replaced by `_` for simplicity. Examples of dump results are shown in the following table.
Batch No. | Quantized | Layer Name | Saved Files: Weights | Saved Files: Biases | Saved Files: Activation
---|---|---|---|---|---
1 | Yes | resnet_v1_50/conv1 | {output_dir}/dump_results_weights/quant_resnet_v1_50_conv1_kernel.bin<br>{output_dir}/dump_results_weights/quant_resnet_v1_50_conv1_kernel.txt | {output_dir}/dump_results_weights/quant_resnet_v1_50_conv1_bias.bin<br>{output_dir}/dump_results_weights/quant_resnet_v1_50_conv1_bias.txt | {output_dir}/dump_results_0/quant_resnet_v1_50_conv1.bin<br>{output_dir}/dump_results_0/quant_resnet_v1_50_conv1.txt
2 | No | resnet_v1_50/softmax | N/A | N/A | {output_dir}/dump_results_0/quant_resnet_v1_50_softmax_float.bin<br>{output_dir}/dump_results_0/quant_resnet_v1_50_softmax_float.txt
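The `*.bin` files above are raw binary tensor dumps, so they can be inspected with NumPy. A minimal sketch, assuming quantized dumps are int8 (the dtype, file name, and reshape are assumptions for illustration, not documented behavior; the snippet writes a stand-in file so it runs end to end):

```python
import numpy as np
import os
import tempfile

def load_dump(path, dtype, shape=None):
    """Read a raw *.bin dump file into a NumPy array.
    Assumption: quantized dumps are int8, float dumps are float32;
    adjust dtype and shape to match the actual layer."""
    arr = np.fromfile(path, dtype=dtype)
    return arr.reshape(shape) if shape is not None else arr

# Stand-in for a real dump file, to demonstrate the round trip.
tmp_dir = tempfile.mkdtemp()
fake_kernel = np.arange(-8, 8, dtype=np.int8)
bin_path = os.path.join(tmp_dir, 'quant_resnet_v1_50_conv1_kernel.bin')
fake_kernel.tofile(bin_path)

kernel = load_dump(bin_path, np.int8)
print(kernel.min(), kernel.max())  # -8 7
```

Comparing such arrays against the corresponding values captured on the target device is one way to localize layers whose DPU results diverge from the simulation.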