Running Quantized Models

ZenDNN User Guide (57300)

Document ID: 57300
Release Date: 2026-04-13
Revision: 5.2.1 English

Quantized models can be run with vLLM using the zentorch backend. No additional loading code (such as zentorch.load_quantized_model) is required; simply point vLLM to the quantized model directory. Refer to vLLM-zentorch Plugin for detailed instructions.
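As an illustration of pointing vLLM at a quantized model directory, the following minimal sketch builds a `vllm serve` command line for a local path. It assumes that the vLLM-zentorch plugin is already installed and picked up by vLLM, and that the quantized export is a Hugging Face-style directory containing a config.json file. The helper name build_vllm_serve_command and the default port are hypothetical and used only for this example; any plugin-specific settings are described in vLLM-zentorch Plugin and are not shown here.

```python
from pathlib import Path


def build_vllm_serve_command(model_dir: str, port: int = 8000) -> list[str]:
    """Return a `vllm serve` command line for a local quantized model directory.

    Hypothetical helper for illustration; it only validates the path and
    assembles the arguments, it does not start vLLM itself.
    """
    path = Path(model_dir).expanduser().resolve()
    if not path.is_dir():
        raise FileNotFoundError(f"Model directory not found: {path}")
    # Assumption: the quantized export is a Hugging Face-style checkpoint,
    # so a config.json file is expected at the top level of the directory.
    if not (path / "config.json").is_file():
        raise FileNotFoundError(f"No config.json found in {path}")
    # The directory is passed as the model argument; no zentorch loading
    # call (such as zentorch.load_quantized_model) appears anywhere.
    return ["vllm", "serve", str(path), "--port", str(port)]


if __name__ == "__main__":
    import shlex
    import sys

    # Usage: python build_command.py /path/to/quantized-model
    if len(sys.argv) > 1:
        print(shlex.join(build_vllm_serve_command(sys.argv[1])))
```

The resulting list can be passed to subprocess.run, or printed and run from a shell. Checking for config.json before launching is a design choice of this sketch, intended to surface a wrong path early rather than during model loading.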