vai_q_caffe Usage - 1.2 English

Vitis AI User Guide (UG1414)

Document ID: UG1414

Release Date: 2020-07-21

Version: 1.2 English

The vai_q_caffe quantizer takes a floating-point model as input and uses a calibration dataset to generate a quantized model. In the following command line, [options] stands for optional parameters.

vai_q_caffe quantize -model float.prototxt -weights float.caffemodel [options]

The options supported by vai_q_caffe are shown in the following table. The three most commonly used options are weights_bit, data_bit, and method.

Table 1. vai_q_caffe Options List
Name	Type	Optional	Default	Description
model	String	Required	-	Floating-point prototxt file (such as float.prototxt).
weights	String	Required	-	The pre-trained floating-point weights (such as float.caffemodel).
weights_bit	Int32	Optional	8	Bit width for quantized weight and bias.
data_bit	Int32	Optional	8	Bit width for quantized activation.
method	Int32	Optional	1	Quantization method: 0 for non-overflow, 1 for min-diffs. The non-overflow method ensures that no values are saturated during quantization, which makes it sensitive to outliers. The min-diffs method allows saturation in order to achieve a lower quantization difference. It is more robust to outliers and usually results in a narrower range than the non-overflow method.
calib_iter	Int32	Optional	100	Maximum iterations for calibration.
auto_test	-	Optional	Absent	When present, runs a test after calibration using the test dataset specified in the prototxt file.
test_iter	Int32	Optional	50	Maximum iterations for testing.
output_dir	String	Optional	quantize_results	Output directory for the quantized results.
gpu	String	Optional	0	GPU device ID for calibration and test.
ignore_layers	String	Optional	none	List of layers to ignore during quantization.
ignore_layers_file	String	Optional	none	Protobuf file that defines the layers to ignore during quantization, starting with ignore_layers.
sigmoided_layers	String	Optional	none	List of layers that precede a sigmoid operation; these are quantized with optimization for sigmoid accuracy.
input_blob	String	Optional	data	Name of the input data blob.
keep_fixed_neuron	Bool	Optional	FALSE	Retain FixedNeuron layers in the deployed model. Set this flag if your target hardware platform is DPUCAHX8H.
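
The difference between the two values of the method option can be illustrated with a small numeric sketch. The code below is not the vai_q_caffe implementation. It assumes signed fixed-point values with a power-of-two scale, and the helper names (quantize_pow2, frac_len_non_overflow, frac_len_min_diffs) and the sample data are hypothetical. It only shows how a rule that forbids saturation and a rule that minimizes the quantization difference can choose different ranges when the data contains an outlier.

```python
import math

def quantize_pow2(x, bits, frac_len):
    """Round x to signed fixed point with `frac_len` fractional bits, saturating at the ends."""
    scale = 2.0 ** frac_len
    qmin, qmax = -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    q = min(max(round(x * scale), qmin), qmax)
    return q / scale

def frac_len_non_overflow(values, bits):
    """Largest fractional length for which no value saturates (assumed non-overflow rule)."""
    qmax = 2 ** (bits - 1) - 1
    max_abs = max(abs(v) for v in values)
    if max_abs == 0:
        return bits - 1
    return math.floor(math.log2(qmax / max_abs))

def squared_error(values, bits, frac_len):
    """Summed squared difference between the float values and their quantized versions."""
    return sum((v - quantize_pow2(v, bits, frac_len)) ** 2 for v in values)

def frac_len_min_diffs(values, bits, candidates=range(-8, 17)):
    """Fractional length with the smallest squared error; saturation is allowed (assumed min-diffs rule)."""
    return min(candidates, key=lambda f: squared_error(values, bits, f))

# Hypothetical activations: 10001 values spread over [-1, 1] plus one outlier at 5.0.
values = [i / 5000.0 - 1.0 for i in range(10001)] + [5.0]

f_no = frac_len_non_overflow(values, 8)
f_md = frac_len_min_diffs(values, 8)

for name, f in (("non-overflow", f_no), ("min-diffs", f_md)):
    print(name, "fractional bits:", f,
          "max representable:", 127 / 2.0 ** f,
          "squared error:", squared_error(values, 8, f))
```

With this sample data, the non-overflow rule keeps the outlier exactly representable at the cost of a coarser step for every other value, while the error-minimizing search saturates the outlier and picks a narrower range with a finer step.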

Example:

  1. quantize:                           vai_q_caffe quantize -model float.prototxt -weights float.caffemodel -gpu 0
  2. quantize with auto test:            vai_q_caffe quantize -model float.prototxt -weights float.caffemodel -gpu 0 -auto_test -test_iter 50
  3. quantize with non-overflow method:  vai_q_caffe quantize -model float.prototxt -weights float.caffemodel -gpu 0 -method 0
  4. finetune quantized model:           vai_q_caffe finetune -solver solver.prototxt -weights quantize_results/float_train_test.caffemodel -gpu 0
  5. deploy quantized model:             vai_q_caffe deploy -model quantize_results/quantize_train_test.prototxt -weights quantize_results/float_train_test.caffemodel -gpu 0
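
When the quantize command is driven from a script, the option names from Table 1 can be assembled programmatically. The following is a minimal sketch, not part of the tool: build_quantize_cmd is a hypothetical helper, and it assumes that valued options are passed as "-name value" and that a flag such as auto_test is passed bare, as in the examples above. It only builds and prints the command line and does not run vai_q_caffe.

```python
import shlex

def build_quantize_cmd(model, weights, **options):
    """Build an argument list for `vai_q_caffe quantize` (hypothetical helper)."""
    cmd = ["vai_q_caffe", "quantize", "-model", model, "-weights", weights]
    for name, value in options.items():
        if value is True:
            # Bare flag, for example -auto_test (assumed form for switches).
            cmd.append(f"-{name}")
        elif value is False or value is None:
            # Omit the option so the tool's default applies.
            continue
        else:
            cmd += [f"-{name}", str(value)]
    return cmd

cmd = build_quantize_cmd(
    "float.prototxt", "float.caffemodel",
    gpu=0, auto_test=True, test_iter=50,
)
print(shlex.join(cmd))
# To execute it, pass the list to subprocess.run(cmd, check=True).
```

Building the command as a list rather than a single string avoids shell quoting problems when paths contain spaces.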