Architecture - 57300

ZenDNN User Guide (57300)

Document ID: 57300
Release Date: 2025-08-18
Revision: 5.1 English

When both vLLM and the zentorch package are installed, vLLM automatically detects the zentorch platform and replaces its default attention implementation with the optimized zentorch PagedAttention kernel. This kernel uses AVX-512 intrinsics to accelerate computation on AMD EPYC™ CPUs. The plugin may also work on other x86 CPUs that support the required ISA.
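
As a quick pre-flight check, the sketch below tests the two preconditions described above: that both packages are importable and that the CPU reports AVX-512 support. It is a minimal illustration, not part of zentorch or vLLM. The helper names (`cpu_flags`, `has_avx512`, `packages_installed`) are hypothetical, the check reads `/proc/cpuinfo` and so assumes Linux, and treating `avx512f` as the baseline flag is an assumption, since the exact AVX-512 subsets the kernel requires are not listed here.

```python
import importlib.util
from pathlib import Path

# Assumption: AVX-512 Foundation is used as the baseline indicator.
# The precise AVX-512 subsets required by the kernel are not stated in this guide.
AVX512_FLAG = "avx512f"


def cpu_flags(cpuinfo_text: str) -> set[str]:
    """Return the CPU feature flags from the first 'flags' line of /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            return set(value.split())
    return set()


def has_avx512(cpuinfo_text: str) -> bool:
    """True if the given /proc/cpuinfo text reports the AVX-512 Foundation flag."""
    return AVX512_FLAG in cpu_flags(cpuinfo_text)


def packages_installed(names=("vllm", "zentorch")) -> dict[str, bool]:
    """Map each top-level package name to whether it can be found for import."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    cpuinfo = Path("/proc/cpuinfo")
    text = cpuinfo.read_text() if cpuinfo.exists() else ""
    print("AVX-512F reported:", has_avx512(text))
    print("Packages found:", packages_installed())
```

If either package is reported as missing, or the flag is absent, the automatic kernel replacement described above should not be expected to take effect.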

In addition, the plugin uses zentorch to compile the LLM with torch.compile, replacing native PyTorch ops with zentorch's optimized ops.
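
Outside of vLLM, the same compilation step can be sketched directly with `torch.compile`. The example below is an illustration under stated assumptions, not the plugin's actual code path: it assumes that importing `zentorch` registers a `torch.compile` backend named `"zentorch"`, and the helpers `pick_backend` and `compile_model` are hypothetical. Falling back to PyTorch's default `"inductor"` backend when zentorch is missing is a design choice made for this sketch only.

```python
import importlib.util


def pick_backend() -> str:
    """Return "zentorch" if the package is importable, else PyTorch's default "inductor"."""
    return "zentorch" if importlib.util.find_spec("zentorch") else "inductor"


def compile_model(model):
    """Wrap a torch.nn.Module with torch.compile using the selected backend."""
    import torch  # imported lazily so this module loads without PyTorch installed

    backend = pick_backend()
    if backend == "zentorch":
        # Assumption: importing zentorch registers the "zentorch" backend.
        import zentorch  # noqa: F401

    return torch.compile(model, backend=backend)
```

Calling `compile_model(model)` returns a wrapped module. Compilation itself is deferred until the first forward pass, so any backend errors surface at that point rather than at the call to `compile_model`.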