Quantized models can be run using vLLM with the zentorch backend. No
additional loading code (such as zentorch.load_quantized_model) is required; simply point vLLM to the
quantized model directory. Refer to the vLLM-zentorch Plugin
for detailed instructions.
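
As a rough illustration of "pointing vLLM to the quantized model directory", the sketch below passes a local directory path as the `model` argument of vLLM's offline `LLM` class. It assumes that vLLM and the vLLM-zentorch plugin are already installed and enabled as described in the plugin's instructions; the helper names `build_engine_args` and `generate`, the directory path, and the sampling settings are hypothetical and chosen only for this example.

```python
from pathlib import Path


def build_engine_args(model_dir: str) -> dict:
    """Build keyword arguments for vllm.LLM from a local model directory.

    Hypothetical helper: it only normalizes the path and passes it as `model`.
    """
    path = Path(model_dir).expanduser()
    return {"model": str(path)}


def generate(model_dir: str, prompts: list) -> list:
    """Generate one completion per prompt from the quantized model."""
    # Imported lazily so the helper above can be used without vLLM installed.
    from vllm import LLM, SamplingParams

    # No zentorch-specific loading call here: the directory path is the only
    # model-related input (assumes the plugin is set up per its instructions).
    llm = LLM(**build_engine_args(model_dir))
    params = SamplingParams(temperature=0.0, max_tokens=64)  # illustrative values
    outputs = llm.generate(prompts, params)
    return [out.outputs[0].text for out in outputs]


# Example call (hypothetical path; requires vLLM and the zentorch plugin):
# print(generate("/path/to/quantized-model", ["What is quantization?"]))
```

The same directory path could presumably be given to vLLM's server entry point instead of the offline API; which options the zentorch backend needs, if any, is covered by the plugin instructions rather than by this sketch.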