After deploying the quantized model, it is sometimes necessary to compare the simulation results on the CPU or GPU with the output values on the DPU.
You can use the dump_model API of vai_q_onnx to dump the simulation results of the quantized model:
```python
import vai_q_onnx

# This function dumps the simulation results of the quantized model,
# including weights and activation results.
vai_q_onnx.dump_model(
    model,
    dump_data_reader=None,
    output_dir='./dump_results',
    dump_float=False)
```
- model
  - String. File path of the quantized model.
- dump_data_reader
  - A data reader used for the dump. It generates inputs for the original model.
- output_dir
  - String. The directory in which to save the dump results. The results are generated in output_dir after the function executes successfully.
- dump_float
  - Boolean. Determines whether to dump the float values of weights and activation results.
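As a rough sketch of what a dump_data_reader might look like, the following assumes the reader follows the calibration data-reader pattern used by onnxruntime quantization: a get_next() method that returns a dict mapping input names to arrays for each batch, then None when exhausted. The class name RandomDataReader, the input name "input", and the input shape are illustrative assumptions, not part of the vai_q_onnx API; match them to your model's actual inputs.

```python
import numpy as np

class RandomDataReader:
    """Minimal data-reader sketch: yields random batches, then None.

    Assumption: dump_data_reader follows the CalibrationDataReader
    protocol (get_next() returning {input_name: ndarray}, then None).
    Replace the random data with real preprocessed inputs in practice.
    """

    def __init__(self, input_name="input", shape=(1, 3, 224, 224), num_batches=1):
        self.input_name = input_name
        self.batches = iter(
            np.random.rand(*shape).astype(np.float32) for _ in range(num_batches)
        )

    def get_next(self):
        batch = next(self.batches, None)
        if batch is None:
            return None  # signals that all batches have been consumed
        return {self.input_name: batch}

# Hypothetical usage with the dump API (model path is illustrative):
# vai_q_onnx.dump_model(
#     "resnet_quantized.onnx",
#     dump_data_reader=RandomDataReader(),
#     output_dir="./dump_results",
#     dump_float=True)
```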
After the function executes successfully, the dump results are generated in output_dir. The weights and activation results of each quantized node are saved separately in *.bin and *.txt formats.
In cases where a node's output is not quantized, such as a softmax node, the float activation results are saved in *_float.bin and *_float.txt formats if dump_float is set to True.
The following table shows examples of the dump results.
| Batch Number | Quantized | Node Name | Saved Files |
|---|---|---|---|
| 1 | Yes | resnet_v1_50_conv1 | {output_dir}/dump_results/quant_resnet_v1_50_conv1.bin<br>{output_dir}/dump_results/quant_resnet_v1_50_conv1.txt |
| 2 | Yes | resnet_v1_50_conv1_weights | {output_dir}/dump_results/quant_resnet_v1_50_conv1_weights.bin<br>{output_dir}/dump_results/quant_resnet_v1_50_conv1_weights.txt |
| 2 | No | resnet_v1_50_softmax | {output_dir}/dump_results/quant_resnet_v1_50_softmax_float.bin<br>{output_dir}/dump_results/quant_resnet_v1_50_softmax_float.txt |