Recommendations

ZenDNN User Guide (57300)

Document ID: 57300
Release Date: 2026-03-13
Revision: 5.2 English

For optimal inference performance with zentorch, run the model inside a torch.no_grad() context.
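As a minimal sketch of what torch.no_grad() changes, the following uses plain PyTorch only (no zentorch required). The nn.Linear model and the tensor shapes are hypothetical stand-ins chosen for illustration; they do not come from this guide.

```python
import torch
import torch.nn as nn

# Hypothetical stand-in model; any nn.Module behaves the same way here.
model = nn.Linear(4, 2)
model.eval()  # put layers such as dropout and batch norm into inference behavior

x = torch.randn(3, 4)

# Default mode: autograd records the forward pass for a later backward().
tracked = model(x)

# Inference under no_grad: no autograd graph is recorded for the output.
with torch.no_grad():
    untracked = model(x)

print(tracked.requires_grad)    # True
print(untracked.requires_grad)  # False
```

Both calls return the same values; the difference is that the second skips autograd bookkeeping, which inference does not need.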

CNN

For torchvision CNN models, set dynamic=False when calling torch.compile as follows:

import torch
import zentorch  # importing zentorch makes the 'zentorch' backend available

model = torch.compile(model, backend='zentorch', dynamic=False)
with torch.no_grad():
    output = model(input)
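The snippet above assumes model and input already exist. A self-contained sketch of the same pattern might look like the following. Two things here are assumptions and not part of this guide: the tiny nn.Sequential network is a hypothetical stand-in for a torchvision model, and the fallback to PyTorch's built-in 'eager' debug backend exists only so the sketch still runs on a machine without zentorch installed.

```python
import torch
import torch.nn as nn

try:
    import zentorch  # noqa: F401  (importing makes the 'zentorch' backend available)
    backend = "zentorch"
except ImportError:
    # Assumption for portability only: use PyTorch's built-in 'eager'
    # debug backend when zentorch is not installed.
    backend = "eager"

# Hypothetical tiny CNN standing in for a torchvision model.
model = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.Linear(8, 10),
).eval()

# dynamic=False disables dynamic-shape tracing, so the compiled graph is
# specialized to the input shapes it is called with.
compiled = torch.compile(model, backend=backend, dynamic=False)

batch = torch.randn(2, 3, 32, 32)  # fixed shape: batch 2, 3 channels, 32x32
with torch.no_grad():
    output = compiled(batch)

print(output.shape)  # torch.Size([2, 10])
```

Because dynamic=False specializes on shape, keeping the input shape constant across calls avoids recompilation.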

NLP & RecSys

Optimize Hugging Face NLP models as follows:

import torch
import zentorch  # importing zentorch makes the 'zentorch' backend available

model = torch.compile(model, backend='zentorch')
with torch.no_grad():
    output = model(input)

Hugging Face Generative LLM Models

The zentorch.llm.optimize API has been deprecated. You can still run generative models using torch.compile(model, backend="zentorch"), but for optimal performance we recommend vLLM. See vLLM-zentorch Plugin for more details.