8.4.2. Performance Benchmarking

AOCL User Guide (57404)

Document ID: 57404
Release Date: 2025-12-29
Version: 5.2 English

AOCL-DLP includes benchmarking tools for evaluating and tuning performance. The benchmarks are built with CMake and driven by YAML configuration files.

Building Benchmarks:

# Configure the build with benchmarks enabled (run from the build/ directory)
cmake -DBUILD_BENCHMARKS=ON ..

# Build benchmarks
make -j$(nproc)

Running Benchmarks:

# Run the GEMM benchmarks with a configuration file (run from the source root)
./build/bench/bench_gemm -f bench/configs/gemm_bench_f32_basic_config.yaml

# Run with verbose output
./build/bench/bench_gemm -f bench/configs/gemm_bench_f32_exhaustive_config.yaml --verbose
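
Each command above runs a single configuration. To run every configuration in a directory and keep the output of each run, a small wrapper can help. The following is a minimal sketch, not part of AOCL-DLP: the function name run_all_configs and the log directory are hypothetical, and it relies only on the -f option shown above.

```shell
# Run every YAML configuration in a directory through a benchmark binary
# and keep one log file per configuration.
# Arguments (all paths are assumptions; adjust them to your checkout):
#   $1  benchmark executable, e.g. ./build/bench/bench_gemm
#   $2  directory containing *.yaml configurations, e.g. bench/configs
#   $3  directory that receives the log files (created if missing)
run_all_configs() {
    rac_bin=$1
    rac_config_dir=$2
    rac_log_dir=$3
    rac_failures=0

    mkdir -p "$rac_log_dir" || return 1

    for rac_cfg in "$rac_config_dir"/*.yaml; do
        # An unmatched glob stays literal; skip it instead of running it.
        [ -e "$rac_cfg" ] || continue

        rac_log="$rac_log_dir/$(basename "$rac_cfg" .yaml).log"
        echo "Running $rac_cfg"

        # -f selects the configuration file, as in the commands above.
        if ! "$rac_bin" -f "$rac_cfg" > "$rac_log" 2>&1; then
            echo "FAILED: $rac_cfg (see $rac_log)" >&2
            rac_failures=$((rac_failures + 1))
        fi
    done

    # Return non-zero if any run failed.
    [ "$rac_failures" -eq 0 ]
}
```

From the source root, a call might look like: run_all_configs ./build/bench/bench_gemm bench/configs bench_logs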

Benchmark Configuration with YAML:

AOCL-DLP benchmarks are configured with YAML files that follow the same flexible parameter specification as the testing framework's YAML files.

Basic Benchmark YAML Structure:

gemm_tests:
  - name: "small_matrix"
    a_type: "f32"
    b_type: "f32"
    c_type: "f32"
    acc_type: "f32"
    storage_format: "row-major"
    transA: false
    transB: true
    # Matrix dimensions - all three input methods are supported; the list method is shown here
    m: [1, 10, 64]              # List method
    n: [1, 10, 64]              # List method
    k: [1, 10, 64]              # List method
    alpha: [2.5, 0, -2.5]       # List method
    beta: [2.5, 0, -2.5]        # List method
    # Optional parameters
    lda: [10, 10, 64]
    ldb: [10, 10, 64]
    ldc: [10, 10, 64]
    mtagA: ["none", "pack", "pack"]
    mtagB: ["reorder", "pack", "none"]
    product_type: "simple"

Benchmark-Specific Parameters:

  • product_type: Specifies the benchmark product type (“simple”, “batch”, etc.)

  • mtagA, mtagB: Matrix optimization tags for performance comparison

  • tolerances: Accuracy thresholds for different data types
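
A custom configuration can be assembled from the same keys as the example structure above. The sketch below writes a single-case file for a square problem (m = n = k). It is an illustration rather than an official tool: the function name write_gemm_config and the output file name are hypothetical, and it assumes that one-element lists are accepted wherever the example uses three-element lists.

```shell
# Write a one-case GEMM benchmark configuration for a square problem.
# Arguments:
#   $1  output file (hypothetical name, e.g. my_gemm_config.yaml)
#   $2  case name
#   $3  matrix size used for m, n, and k
write_gemm_config() {
    wgc_out=$1
    wgc_name=$2
    wgc_size=$3

    # Keys and fixed values mirror the documented example. With m = n = k,
    # a leading dimension equal to the size fits each row-major matrix.
    # Assumption: one-element lists are valid for every list parameter.
    cat > "$wgc_out" <<EOF
gemm_tests:
  - name: "$wgc_name"
    a_type: "f32"
    b_type: "f32"
    c_type: "f32"
    acc_type: "f32"
    storage_format: "row-major"
    transA: false
    transB: true
    m: [$wgc_size]
    n: [$wgc_size]
    k: [$wgc_size]
    alpha: [2.5]
    beta: [0]
    lda: [$wgc_size]
    ldb: [$wgc_size]
    ldc: [$wgc_size]
    mtagA: ["none"]
    mtagB: ["none"]
    product_type: "simple"
EOF
}
```

For example, write_gemm_config my_gemm_config.yaml square_128 128 produces a file that can then be passed to the benchmark with ./build/bench/bench_gemm -f my_gemm_config.yaml.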

Configuration Files:

Benchmark configurations are stored in the bench/configs/ directory:

  • gemm_bench_f32_basic_config.yaml: Basic float32 benchmarks

  • gemm_bench_f32_exhaustive_config.yaml: Comprehensive performance testing

For more on benchmarking, including performance mode, configuration options, and optimization strategies, see the DLP-Benching Wiki.