panurus

Go Benchmark Runner

This package provides a robust, high-precision benchmark runner for Go applications. It goes beyond standard Go testing benchmarks by offering detailed statistical analysis, memory profiling, latency distribution visualization, and automated health checks.

Overview

The benchmark package is designed to measure the performance of a specific operation (the “unit of work”) under concurrent load. It orchestrates the execution of this work across multiple goroutines, captures precise latency data, and generates a comprehensive report.

Key Features

Usage

The core entry point is the generic RunBenchmark function, which accepts a Config object to control execution parameters.

Function Signature

// Config controls the benchmark execution.
type Config struct {
    Workers        int           // Number of concurrent goroutines
    Duration       time.Duration // Time to record execution
    WarmupDuration time.Duration // Time to run before recording
    RateLimit      float64       // Total Ops/Sec limit (0 = Unlimited/Closed-Loop)
}

// RunBenchmark executes the benchmark.
// T is the type of data created by setup() and passed to work().
func RunBenchmark[T any](
    cfg Config,         // Configuration object
    setup func() T,     // Function to prepare data for each op
    work func(T) error, // The function to benchmark
) Result
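RateLimit is a total across all workers, so a rate-limited run has to divide that budget between goroutines somehow. The sketch below shows one possible split, using a hypothetical helper named perWorkerInterval that is not part of the package API; how the runner actually paces operations is not described in this README.

```go
package main

import (
	"fmt"
	"time"
)

// perWorkerInterval is a hypothetical helper (not part of the package API).
// It returns the gap each worker would leave between operation starts so that
// all workers together approach totalRate ops/sec. A rate of 0 means no pacing.
func perWorkerInterval(totalRate float64, workers int) time.Duration {
	if totalRate <= 0 || workers <= 0 {
		return 0
	}
	return time.Duration(float64(workers) * float64(time.Second) / totalRate)
}

func main() {
	fmt.Println(perWorkerInterval(1000, 10)) // 1000 ops/sec shared by 10 workers
	fmt.Println(perWorkerInterval(0, 10))    // unlimited: no pacing interval
}
```

With this split, each worker starts one operation per interval; a worker whose operation takes longer than the interval simply falls behind the target rate.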

Example

package main

import (
    "fmt"
    "time"

    "your/package/benchmark" // Import the runner (placeholder path)
)

// process is a placeholder for the operation under test; replace it with real work.
func process(input int) error {
    _ = input
    return nil
}

func main() {
    // Define the configuration
    cfg := benchmark.Config{
        Workers:        10,              // 10 concurrent workers
        Duration:       5 * time.Second, // Run for 5 seconds
        WarmupDuration: 1 * time.Second, // Warmup to stabilize pools and caches
        RateLimit:      0,               // 0 = Full Speed (Closed-Loop)
    }

    // Run the benchmark
    result := benchmark.RunBenchmark(
        cfg,
        func() int { // Setup: Prepare data (not timed)
            // Example: Create a payload or connection
            return 42
        },
        func(input int) error { // Work: The operation to measure (timed)
            // Simulate work
            if err := process(input); err != nil {
                return fmt.Errorf("process failed: %w", err)
            }
            return nil
        },
    )

    // Print the detailed report to stdout
    result.Print()
}

Output Breakdown

The Result.Print() method outputs a structured report divided into several sections:

1. Main Metrics

Basic throughput and volume statistics.

2. Latency Distribution

A detailed look at how long operations took.
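To make the percentile figures concrete, here is a minimal sketch of the nearest-rank method over a sorted sample slice. The names percentile and sortedMillis are illustrative, and the runner may use a different interpolation rule.

```go
package main

import (
	"fmt"
	"math"
	"sort"
	"time"
)

// percentile returns the nearest-rank percentile (0 < p <= 100) of samples
// that are already sorted in ascending order. It assumes sorted is non-empty.
func percentile(sorted []time.Duration, p float64) time.Duration {
	rank := int(math.Ceil(p * float64(len(sorted)) / 100))
	if rank < 1 {
		rank = 1
	}
	if rank > len(sorted) {
		rank = len(sorted)
	}
	return sorted[rank-1]
}

// sortedMillis builds demo data: 1ms, 2ms, ..., n ms in ascending order.
func sortedMillis(n int) []time.Duration {
	out := make([]time.Duration, n)
	for i := range out {
		out[i] = time.Duration(i+1) * time.Millisecond
	}
	return out
}

func main() {
	samples := []time.Duration{
		900 * time.Microsecond, 120 * time.Microsecond, 300 * time.Microsecond,
		250 * time.Microsecond, 4 * time.Millisecond,
	}
	sort.Slice(samples, func(i, j int) bool { return samples[i] < samples[j] })
	for _, p := range []float64{50, 90, 99} {
		fmt.Printf("P%g = %v\n", p, percentile(samples, p))
	}
	fmt.Println("P99 of 1..100ms:", percentile(sortedMillis(100), 99))
}
```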

3. Stability Metrics

Measures how consistent the system is.
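This README does not list the exact stability metrics. As an assumption, a common choice is the standard deviation together with the coefficient of variation (standard deviation divided by mean), sketched below with illustrative function names.

```go
package main

import (
	"fmt"
	"math"
)

// meanStdDev returns the mean and population standard deviation of xs.
func meanStdDev(xs []float64) (mean, std float64) {
	if len(xs) == 0 {
		return 0, 0
	}
	for _, x := range xs {
		mean += x
	}
	mean /= float64(len(xs))
	var sumSq float64
	for _, x := range xs {
		d := x - mean
		sumSq += d * d
	}
	return mean, math.Sqrt(sumSq / float64(len(xs)))
}

// coefficientOfVariation returns std/mean, or 0 when the mean is 0.
// Lower values indicate more consistent latencies.
func coefficientOfVariation(xs []float64) float64 {
	mean, std := meanStdDev(xs)
	if mean == 0 {
		return 0
	}
	return std / mean
}

func main() {
	latenciesMs := []float64{2, 4, 4, 4, 5, 5, 7, 9}
	mean, std := meanStdDev(latenciesMs)
	fmt.Printf("mean=%.2fms std=%.2fms cv=%.2f\n", mean, std, coefficientOfVariation(latenciesMs))
}
```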

4. Memory & GC
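The fields reported in this section are not listed here. A minimal sketch of how allocation and GC activity can be captured around a measured region with the standard runtime.ReadMemStats call follows; the memDelta type and measureMem helper are assumptions for illustration, not the runner's own types.

```go
package main

import (
	"fmt"
	"runtime"
)

// memDelta summarizes allocator and GC activity across a measured region.
type memDelta struct {
	Mallocs    uint64 // heap objects allocated
	AllocBytes uint64 // cumulative bytes allocated
	NumGC      uint32 // completed GC cycles
}

// sink keeps allocations reachable so the compiler cannot elide them.
var sink []byte

// measureMem snapshots runtime.MemStats before and after fn and returns the difference.
func measureMem(fn func()) memDelta {
	var before, after runtime.MemStats
	runtime.ReadMemStats(&before)
	fn()
	runtime.ReadMemStats(&after)
	return memDelta{
		Mallocs:    after.Mallocs - before.Mallocs,
		AllocBytes: after.TotalAlloc - before.TotalAlloc,
		NumGC:      after.NumGC - before.NumGC,
	}
}

// allocate performs n heap allocations of 1 KiB each (demo workload).
func allocate(n int) {
	for i := 0; i < n; i++ {
		sink = make([]byte, 1024)
	}
}

func main() {
	d := measureMem(func() { allocate(1000) })
	fmt.Printf("mallocs=%d bytes=%d gc_cycles=%d\n", d.Mallocs, d.AllocBytes, d.NumGC)
}
```

Note that runtime.ReadMemStats briefly stops the world, so it belongs outside the timed region.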

5. Latency Heatmap

An ASCII bar chart visualizing the distribution of latencies. It uses color coding (Green/Yellow/Red) to indicate frequency density.

Range           Freq    Distribution Graph
100µs-200µs     500     ██████ (5.0%)
200µs-400µs     8000    ██████████████████████ (80.0%)
...
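The rows above could be rendered along the following lines. This is a sketch only: bar length here is linear in the bucket count relative to the largest bucket, which is an assumption (the sample output may use a different scale), and color coding is omitted.

```go
package main

import (
	"fmt"
	"strings"
)

// bar returns a run of block characters whose length is proportional to
// count/maxCount, scaled to width. Non-zero counts always get one block.
func bar(count, maxCount, width int) string {
	if count <= 0 || maxCount <= 0 {
		return ""
	}
	n := count * width / maxCount
	if n < 1 {
		n = 1
	}
	return strings.Repeat("█", n)
}

// row formats one histogram line: range label, frequency, bar, and share of total.
func row(label string, count, total, maxCount, width int) string {
	pct := 100 * float64(count) / float64(total)
	return fmt.Sprintf("%-15s %-7d %s (%.1f%%)", label, count, bar(count, maxCount, width), pct)
}

func main() {
	fmt.Printf("%-15s %-7s %s\n", "Range", "Freq", "Distribution Graph")
	fmt.Println(row("100µs-200µs", 500, 10000, 8000, 22))
	fmt.Println(row("200µs-400µs", 8000, 10000, 8000, 22))
}
```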

6. Throughput Timeline (Sparkline)

A condensed graph showing throughput over time in 1-second buckets. It helps identify gradual degradation and cold starts.

Timeline: [ ▅▇██▆▄ ] (Max: 5000 ops/s)
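A sparkline like the one above can be produced by mapping each 1-second bucket to one of eight block characters, scaled against the busiest second. The sparkline function below is an illustrative sketch, not the runner's code.

```go
package main

import (
	"fmt"
	"strings"
)

// ticks are the eight block heights used to draw the sparkline, lowest first.
var ticks = []rune("▁▂▃▄▅▆▇█")

// sparkline maps per-second operation counts to block characters scaled
// against the peak count, and returns the line together with that peak.
func sparkline(counts []int) (string, int) {
	peak := 0
	for _, c := range counts {
		if c > peak {
			peak = c
		}
	}
	var b strings.Builder
	for _, c := range counts {
		idx := 0
		if peak > 0 && c > 0 {
			idx = c * (len(ticks) - 1) / peak
		}
		b.WriteRune(ticks[idx])
	}
	return b.String(), peak
}

func main() {
	line, peak := sparkline([]int{800, 3100, 4400, 5000, 5000, 3900, 2600})
	fmt.Printf("Timeline: [ %s ] (Max: %d ops/s)\n", line, peak)
}
```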

7. Analysis & Recommendations

The runner automatically evaluates the results and prints warnings or pass/fail statuses:

Limitations & Considerations

While this runner is robust for general application testing, users should be aware of the following constraints:

1. Memory Consumption at Scale

The runner retains every latency sample in memory to provide precise percentiles (P99, P99.9) without approximation errors. Memory use therefore grows linearly with the number of recorded operations.
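A rough way to size this up front, assuming each sample is stored as an 8-byte time.Duration and ignoring slice growth overhead (both assumptions, since the storage layout is not documented here):

```go
package main

import (
	"fmt"
	"time"
)

// bytesPerSample assumes one time.Duration (int64) per recorded operation.
const bytesPerSample = 8

// estimateSampleBytes estimates the memory needed to retain every latency
// sample for a run at the given throughput and recorded duration.
func estimateSampleBytes(opsPerSec float64, d time.Duration) float64 {
	return opsPerSec * d.Seconds() * bytesPerSample
}

func main() {
	// 100k ops/sec recorded for 60 seconds.
	b := estimateSampleBytes(100_000, 60*time.Second)
	fmt.Printf("approx. %.0f MB of samples\n", b/1e6)
}
```

At 100,000 ops/sec for 60 seconds this works out to 6 million samples, or about 48 MB under the stated assumption.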

2. “Setup” Function Blocking

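The setup() function runs sequentially inside the worker loop, before each call to work(). Setup time is not counted in latency, but a slow setup() still occupies the worker and lowers the number of operations it can issue.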
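The effect is easier to see in a simplified single-worker loop. The runOps function below is a hypothetical stand-in for the runner's worker loop, with slowSetup and fastWork as demo callbacks.

```go
package main

import (
	"fmt"
	"time"
)

// runOps runs n iterations of setup followed by work, timing only work.
// It returns the per-operation latencies and the total wall-clock time.
func runOps[T any](n int, setup func() T, work func(T) error) ([]time.Duration, time.Duration) {
	lat := make([]time.Duration, 0, n)
	begin := time.Now()
	for i := 0; i < n; i++ {
		input := setup() // untimed, but the worker is busy while it runs
		start := time.Now()
		_ = work(input) // error handling omitted in this sketch
		lat = append(lat, time.Since(start))
	}
	return lat, time.Since(begin)
}

// slowSetup simulates expensive preparation (demo only).
func slowSetup() int {
	time.Sleep(2 * time.Millisecond)
	return 42
}

// fastWork simulates a very cheap operation (demo only).
func fastWork(int) error { return nil }

func main() {
	lat, wall := runOps(5, slowSetup, fastWork)
	var timed time.Duration
	for _, l := range lat {
		timed += l
	}
	fmt.Printf("timed work: %v, wall clock: %v\n", timed, wall)
}
```

The reported latencies stay tiny while the wall-clock time is dominated by setup, so throughput drops even though latency looks healthy.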

3. Nanosecond Precision

The runner uses standard time.Now() and time.Since().
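Each measurement therefore includes the cost of the clock calls themselves, which matters when the operation under test takes only nanoseconds. A minimal sketch for estimating that cost on a given machine (timerOverhead is an illustrative name):

```go
package main

import (
	"fmt"
	"time"
)

// timerOverhead estimates the average cost of one time.Now() + time.Since()
// pair by timing a loop of empty measurements.
func timerOverhead(iters int) time.Duration {
	if iters <= 0 {
		return 0
	}
	var acc time.Duration
	begin := time.Now()
	for i := 0; i < iters; i++ {
		start := time.Now()
		acc += time.Since(start)
	}
	total := time.Since(begin)
	_ = acc // keep the measured values live
	return total / time.Duration(iters)
}

func main() {
	fmt.Println("approx. cost per measurement:", timerOverhead(1_000_000))
}
```

If the printed value is a significant fraction of the operation's latency, batching several operations inside one work() call gives more meaningful numbers.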

4. Coarse Histogram

The visualization is hardcoded to 20 buckets using an exponential scale.
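To show why resolution is limited, here is a sketch of exponential bucketing with 20 buckets. The doubling factor and the 100µs lower bound are assumptions chosen to match the sample heatmap rows (100µs-200µs, 200µs-400µs); bucketIndex is not the runner's function.

```go
package main

import (
	"fmt"
	"math/bits"
	"time"
)

const numBuckets = 20 // bucket count stated in this README

// bucketIndex maps a latency to an exponential bucket where bucket i covers
// [lo*2^i, lo*2^(i+1)). Values below lo fall into bucket 0 and values past
// the last boundary are clamped into the final bucket.
func bucketIndex(d, lo time.Duration) int {
	if lo <= 0 || d < lo {
		return 0
	}
	idx := bits.Len64(uint64(d/lo)) - 1 // floor(log2(d/lo))
	if idx >= numBuckets {
		idx = numBuckets - 1
	}
	return idx
}

func main() {
	lo := 100 * time.Microsecond
	for _, d := range []time.Duration{150 * time.Microsecond, 250 * time.Microsecond, 3 * time.Millisecond} {
		fmt.Printf("%v -> bucket %d\n", d, bucketIndex(d, lo))
	}
}
```

With doubling boundaries, 210µs and 390µs land in the same bucket, so differences within a factor of two are invisible in the chart.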

5. Stop-Time Latency

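The runner signals workers to stop using an atomic flag, but it waits for the current operation to finish. A run can therefore overshoot the configured Duration by up to the length of the slowest in-flight operation.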
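The overshoot follows from where the flag is checked. The sketch below is a simplified model of that pattern using sync/atomic (atomic.Bool requires Go 1.19 or later); runUntilStopped and slowOp are hypothetical names, not the runner's internals.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

// runUntilStopped starts workers that repeat op until a stop flag is set
// after d, then waits for in-flight operations to finish. It returns the
// number of completed operations and the total elapsed time.
func runUntilStopped(workers int, d time.Duration, op func()) (int64, time.Duration) {
	var stop atomic.Bool
	var count atomic.Int64
	var wg sync.WaitGroup
	begin := time.Now()
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for !stop.Load() { // the flag is only checked between operations
				op()
				count.Add(1)
			}
		}()
	}
	time.Sleep(d)
	stop.Store(true)
	wg.Wait() // blocks until every in-flight operation returns
	return count.Load(), time.Since(begin)
}

// slowOp simulates an operation that outlasts the stop signal (demo only).
func slowOp() { time.Sleep(30 * time.Millisecond) }

func main() {
	ops, elapsed := runUntilStopped(2, 20*time.Millisecond, slowOp)
	fmt.Printf("requested 20ms, ran %v, completed %d ops\n", elapsed, ops)
}
```

Interrupting an operation early would require passing a context.Context into work(), which the signature shown in this README does not do.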