Running Bench Framework

AOCL User Guide (57404)

Document ID: 57404
Release Date: 2025-12-29
Version: 5.2 English

Usage:

$ ./bench.py <benchmark_name> <common_options> <benchmark_specific_options>

Parameters:

<benchmark_name>  = {gbm,tbm,fbm}
                    gbm          Googlebench
                    tbm          TinyMembench
                    fbm          Fleetbench

<common_options>  = -x<core_id> -r [start] [end] -t "<iterator_value>" <LibMem_function> -perf [l,g,c,d] -bestperf

                  -x <core_id> : CPU core on which to run the benchmark.
                  -r [start] [end] : Start and end of the size range in bytes (not applicable to Fleetbench).
                                     Format: NUMBER[UNIT]
                                     where UNIT can be B, KB, MB, or GB (case-insensitive).
                                     The default unit is bytes.
                  -t "iter_value"  : Increments the size by "iter_value" at each step.
                                     (0 doubles the size at each step (size<<1); any other positive integer is added to the size at each step.)
                  LibMem_function  : Memory or string function to benchmark
                                    (memcpy, memmove, memset, memcmp, memchr, mempcpy,
                                    strcpy, strncpy, strcmp, strncmp, strlen, strcat, strncat, strspn, strstr, strchr)
                  -perf            : Performance report type
                                    l - Performance analysis for LibMem
                                    g - Performance analysis for Glibc
                                    c - Comparison report between LibMem old and new
                                    d - Default report Glibc vs. LibMem
                  -bestperf        : Runs the benchmark three times and keeps the best throughput
                                    for each size across those runs (GBM and TBM only)
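
The per-size selection described for -bestperf can be sketched as follows. This is an illustration only, not code taken from bench.py: the function name best_throughput, the dictionary layout, and the throughput numbers are all hypothetical, and "best" is assumed to mean the highest throughput.

```python
# Hypothetical illustration of per-size best-of-N selection; not bench.py code.
def best_throughput(runs):
    """Return the highest throughput seen for each size across all runs."""
    best = {}
    for run in runs:
        for size, throughput in run.items():
            if size not in best or throughput > best[size]:
                best[size] = throughput
    return best

# Three made-up runs mapping size in bytes -> throughput (arbitrary units).
runs = [
    {8: 1.0, 16: 2.5},
    {8: 1.2, 16: 2.1},
    {8: 0.9, 16: 2.6},
]
print(best_throughput(runs))  # {8: 1.2, 16: 2.6}
```

Each size is judged independently, so the reported figures may come from different runs.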

<GBM_specific_option> = -m <mode> -a <align> -s <cache_spill> -p <page_option> -o <overlap> -preload <y,n> -i<repetitions> -w<warm_up time>

                      -m <c, u>    : Cached (c) or uncached (u) behaviour
                      -a <a, u, d> : a = aligned (src and dst alignments are equal)
                                     u = unaligned (src and dst alignments are NOT equal)
                                     d = default (random alignment)
                      -s <l, m>    : Less spill (l) or more spill (m); applicable in align mode only
                      -p <x, t>    : Page-cross (x) or page-tail (t) scenario
                      -o <f, b, d> : [memmove only] Forward overlap (f), backward overlap (b), or default (d)
                                     (the default, 'd', covers both forward and backward overlaps)
                    -preload <y,n> : y = run with LD_PRELOAD; n = run with static binaries
                    -i<repetitions>: Number of repetitions for consistent performance runs
                  -w<warm_up time> : Minimum warm-up time in seconds
                  NOTE: -a and -p are mutually exclusive options.
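
The mutual exclusion of -a and -p noted above can be pictured with a small stand-alone parser. This is a sketch under assumptions: it uses Python's argparse and is not how bench.py itself is known to parse its arguments; the helper names make_gbm_parser and accepts are hypothetical.

```python
import argparse

def make_gbm_parser():
    """Build a toy parser holding only the -a and -p options (hypothetical)."""
    parser = argparse.ArgumentParser(prog="gbm-options-sketch")
    group = parser.add_mutually_exclusive_group()
    group.add_argument("-a", choices=["a", "u", "d"], help="alignment mode")
    group.add_argument("-p", choices=["x", "t"], help="page-cross or page-tail")
    return parser

def accepts(argv):
    """Return True if the toy parser accepts argv, False if it rejects it."""
    try:
        make_gbm_parser().parse_args(argv)
    except SystemExit:  # argparse exits after printing a usage error
        return False
    return True

print(accepts(["-a", "a"]))             # True
print(accepts(["-p", "x"]))             # True
print(accepts(["-a", "a", "-p", "x"]))  # False
```

When the combination is rejected, argparse also prints a usage message to standard error.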

<FBM_specific_option> = -mem_alloc <tcmalloc, glibc> -i<repetitions>
                        -mem_alloc : Specify the memory allocator (default = glibc)
                    -i<repetitions>: Number of repetitions for consistent performance runs (default = 100)

<TBM_specific_option> = None

Examples:

Benchmark Help option:

$ ./bench.py -h

Running Google Benchmark:

$ ./bench.py gbm memcpy -r 8B 16B -m u -t "1" -x 16

Runs Google Benchmark for uncached-mode memcpy for sizes [8, 9, .., 16] on core 16.

$ ./bench.py gbm memcpy -r 8B 32KB -s m -x 16

Runs GBM for cached memcpy with more cache spill.

Running TinyMembench:

$ ./bench.py tbm strcpy -r 8B 4KB -x 47

Runs TinyMembench for the strcpy function for sizes [8, 16, 32, .., 4096B] on core 47.
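
The size lists quoted in these examples follow from the -r and -t rules. The sketch below reproduces them under stated assumptions: 1 KB is taken as 1024 bytes (consistent with 4KB giving 4096B), the end of the range is inclusive, and omitting -t is treated like an iterator value of 0. The helper names parse_size and size_sequence are hypothetical and not part of bench.py.

```python
import re

# Assumption: binary units, consistent with 4KB -> 4096B in the example above.
_UNITS = {"B": 1, "KB": 1024, "MB": 1024**2, "GB": 1024**3}

def parse_size(text):
    """Convert NUMBER[UNIT] (for example '8', '8B', '4kb') to a byte count."""
    match = re.fullmatch(r"(\d+)\s*([A-Za-z]*)", text.strip())
    if match is None:
        raise ValueError(f"invalid size: {text!r}")
    number, unit = match.groups()
    unit = unit.upper() or "B"  # the default unit is bytes
    if unit not in _UNITS:
        raise ValueError(f"unknown unit in size: {text!r}")
    return int(number) * _UNITS[unit]

def size_sequence(start, end, iter_value=0):
    """List sizes from start to end inclusive.

    iter_value 0 doubles the size at each step; a positive value is added.
    """
    if start <= 0 or end < start or iter_value < 0:
        raise ValueError("expected 0 < start <= end and iter_value >= 0")
    sizes = []
    size = start
    while size <= end:
        sizes.append(size)
        size = size << 1 if iter_value == 0 else size + iter_value
    return sizes

print(size_sequence(parse_size("8B"), parse_size("16B"), 1))
# [8, 9, 10, 11, 12, 13, 14, 15, 16]
print(size_sequence(parse_size("8B"), parse_size("4KB")))
# [8, 16, 32, 64, 128, 256, 512, 1024, 2048, 4096]
```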

Running Fleetbench:

$ ./bench.py fbm memset -x 47 -i 100

Runs Fleetbench for memset on core 47 with 100 repetitions.
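
For scripted runs, the command lines above can be assembled programmatically. The following is a minimal sketch, assuming bench.py sits in the current directory; build_bench_command is a hypothetical helper, and it uses only options documented on this page, in the order shown in the examples.

```python
import shlex

def build_bench_command(benchmark, function, core, size_range=None, extra=()):
    """Assemble an argument list for ./bench.py from documented options."""
    if benchmark not in ("gbm", "tbm", "fbm"):
        raise ValueError(f"unknown benchmark: {benchmark!r}")
    if benchmark == "fbm" and size_range is not None:
        raise ValueError("-r is not applicable to Fleetbench")
    cmd = ["./bench.py", benchmark, function]
    if size_range is not None:
        start, end = size_range
        cmd += ["-r", start, end]
    cmd += list(extra)          # benchmark-specific options, passed through as-is
    cmd += ["-x", str(core)]    # CPU core to run on
    return cmd

cmd = build_bench_command("gbm", "memcpy", 16, ("8B", "16B"), ["-m", "u", "-t", "1"])
print(shlex.join(cmd))  # ./bench.py gbm memcpy -r 8B 16B -m u -t 1 -x 16
```

The resulting list can be passed to subprocess.run(cmd, check=True) to launch the benchmark.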