Memory Allocators - 57300

ZenDNN User Guide (57300)

Document ID: 57300
Release Date: 2025-08-18
Revision: 5.1 English

If a model performs many dynamic memory allocations, you can select one of the available memory allocators to get the best performance from that model. These allocators override the system-provided dynamic memory allocation routines with a custom implementation. They also let you override the tunable parameters specific to dynamic memory management (for example, logical page size and per-thread or per-CPU cache sizes) and the related environment variables. The default configuration of these allocators generally works well in practice. However, you should verify empirically which settings work best for a particular model after analyzing its dynamic memory requirements.
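
One way to carry out the empirical check described above is to time the same command under each allocator. The following is a minimal sketch, not a reference procedure: it assumes a Linux system where LD_PRELOAD selects the allocator (as described under Usage), and the helper names build_env and time_command are hypothetical.

```python
import os
import subprocess
import sys
import time


def build_env(preload_path, base_env=None):
    """Return a copy of the environment with LD_PRELOAD set, or removed if None."""
    env = dict(os.environ if base_env is None else base_env)
    if preload_path:
        env["LD_PRELOAD"] = preload_path
    else:
        # No preload requested: fall back to the system allocator.
        env.pop("LD_PRELOAD", None)
    return env


def time_command(cmd, env, repeats=3):
    """Run cmd `repeats` times and return the best wall-clock time in seconds."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        subprocess.run(cmd, env=env, check=True)
        best = min(best, time.perf_counter() - start)
    return best
```

For example, time_command([sys.executable, "benchmark.py"], build_env("/path/to/jemallocLib/")) would time a hypothetical benchmark.py under jemalloc, and passing build_env(None) gives the system-allocator baseline. Taking the best of several runs reduces noise, but a real comparison should also look at memory footprint, not only run time.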

The most commonly used allocators are TCMalloc and jemalloc.

TCMalloc

TCMalloc is a fast memory allocator that performs uncontended allocation and deallocation for most objects. Objects are cached either per thread or per logical CPU, depending on the mode. Most allocations do not need to take locks, so there is low contention and good scaling for multi-threaded applications. It uses memory flexibly: freed memory can be reused for different object sizes or returned to the operating system. It also provides a variety of user-accessible controls that can be tuned based on the memory requirements of the workload.
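
As an illustration of such controls, the sketch below prepares an environment for a TCMalloc run. It assumes the gperftools distribution of TCMalloc, which reads the environment variables TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES and TCMALLOC_RELEASE_RATE. Other TCMalloc builds may expose their tunables differently, so check the documentation of the library you install. The helper name tcmalloc_env and the values shown are hypothetical.

```python
import os


def tcmalloc_env(preload_path, max_thread_cache_bytes=None,
                 release_rate=None, base_env=None):
    """Build an environment that preloads TCMalloc with optional tunables.

    Assumes gperftools TCMalloc; the variable names below come from
    that distribution and may not apply to other builds.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["LD_PRELOAD"] = preload_path
    if max_thread_cache_bytes is not None:
        # Upper bound on the total memory held in thread caches.
        env["TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES"] = str(int(max_thread_cache_bytes))
    if release_rate is not None:
        # Higher values return freed memory to the operating system faster.
        env["TCMALLOC_RELEASE_RATE"] = str(float(release_rate))
    return env
```

The returned dictionary can be passed as the env argument of subprocess.run when launching the benchmarking command, for example tcmalloc_env("/path/to/TCMallocLib/", max_thread_cache_bytes=64 * 1024 * 1024).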

jemalloc

jemalloc is a memory allocator that emphasizes fragmentation avoidance and scalable concurrency support. It is designed for multi-core, multi-threaded allocation, and its allocation throughput scales well as the number of cores and program threads grows. jemalloc classifies memory allocations at a finer granularity, which leads to less lock contention. It provides various tunable runtime options, such as enabling background threads to purge unused memory and allowing jemalloc to use transparent huge pages (THPs) for its internal metadata.
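
The two runtime options mentioned above can be set through jemalloc's MALLOC_CONF environment variable, which takes comma-separated key:value pairs. The sketch below composes such a string. It assumes jemalloc 5.x built without a symbol prefix (so the variable is named MALLOC_CONF), and the helper name malloc_conf is hypothetical.

```python
def malloc_conf(options):
    """Format a dict of jemalloc options as a MALLOC_CONF string."""
    parts = []
    for key, value in options.items():
        if isinstance(value, bool):
            # jemalloc expects lowercase true/false for boolean options.
            value = "true" if value else "false"
        parts.append(f"{key}:{value}")
    return ",".join(parts)


# background_thread enables asynchronous purging of unused memory;
# metadata_thp lets jemalloc back its internal metadata with THPs.
conf = malloc_conf({"background_thread": True, "metadata_thp": "auto"})
```

Here conf evaluates to background_thread:true,metadata_thp:auto, which can be exported as MALLOC_CONF alongside LD_PRELOAD before starting the benchmarking command.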

Usage

You can install the TCMalloc and jemalloc dynamic libraries and use the LD_PRELOAD environment variable as follows:

Table 1. LD_PRELOAD environment variables in case of TCMalloc and jemalloc
Use this command:

Before you begin
    TCMalloc: export LD_PRELOAD=/path/to/TCMallocLib/
    jemalloc: export LD_PRELOAD=/path/to/jemallocLib/

For benchmarking
    TCMalloc: LD_PRELOAD=/path/to/TCMallocLib/ <python benchmarking command>
    jemalloc: LD_PRELOAD=/path/to/jemallocLib/ <python benchmarking command>

To verify if the TCMalloc or jemalloc memory allocator is in use
    TCMalloc: lsof -p <pid_of_benchmarking_command> | grep tcmalloc
    jemalloc: lsof -p <pid_of_benchmarking_command> | grep jemalloc
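
When lsof is not available, a similar check can be made by reading the process memory map. The following sketch assumes a Linux system that exposes /proc/<pid>/maps. The helper names find_allocators and allocators_for_pid are hypothetical, and the match on library file names is a heuristic, in the same spirit as the grep commands above.

```python
import re


def find_allocators(maps_text):
    """Return the allocator names found among mapped library file names."""
    found = set()
    for line in maps_text.splitlines():
        # Fields: address, perms, offset, dev, inode, pathname (optional).
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue  # anonymous mapping, no backing file
        filename = fields[5].rsplit("/", 1)[-1]
        match = re.search(r"tcmalloc|jemalloc", filename)
        if match:
            found.add(match.group(0))
    return sorted(found)


def allocators_for_pid(pid="self"):
    """Inspect /proc/<pid>/maps (Linux only); defaults to the current process."""
    with open(f"/proc/{pid}/maps") as maps_file:
        return find_allocators(maps_file.read())
```

Calling allocators_for_pid() from inside the benchmarking script reports which of the two allocators, if any, was preloaded into that process. An empty list suggests that the system allocator is still in use.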