This package provides a robust, high-precision benchmark runner for Go applications. It goes beyond standard Go testing benchmarks by offering detailed statistical analysis, memory profiling, latency distribution visualization, and automated health checks.
The benchmark package is designed to measure the performance of a specific operation (the “unit of work”) under concurrent load.
It orchestrates the execution of this work across multiple goroutines, captures precise latency data, and generates a comprehensive report.
The core entry point is the generic RunBenchmark function, which accepts a Config object to control execution parameters.
// Config controls the benchmark execution.
type Config struct {
	Workers        int           // Number of concurrent goroutines
	Duration       time.Duration // Time to record execution
	WarmupDuration time.Duration // Time to run before recording
	RateLimit      float64       // Total ops/sec limit (0 = unlimited, closed-loop)
}

// RunBenchmark executes the benchmark.
// T is the type of data created by setup() and passed to work().
func RunBenchmark[T any](
	cfg Config,         // Configuration object
	setup func() T,     // Function to prepare data for each op
	work func(T) error, // The function to benchmark
) Result
package main

import (
	"fmt"
	"time"

	"your/package/benchmark" // Import the runner
)

// process is a placeholder for the operation under test (hypothetical);
// replace it with the real work you want to measure.
func process(input int) error {
	if input < 0 {
		return fmt.Errorf("negative input: %d", input)
	}
	return nil
}

func main() {
	// Define the configuration
	cfg := benchmark.Config{
		Workers:        10,              // 10 concurrent workers
		Duration:       5 * time.Second, // Run for 5 seconds
		WarmupDuration: 1 * time.Second, // Warm up caches and pools before recording
		RateLimit:      0,               // 0 = full speed (closed-loop)
	}

	// Run the benchmark
	result := benchmark.RunBenchmark(
		cfg,
		func() int { // Setup: prepare data (not timed)
			// Example: create a payload or connection
			return 42
		},
		func(input int) error { // Work: the operation to measure (timed)
			if err := process(input); err != nil {
				return fmt.Errorf("process failed: %w", err)
			}
			return nil
		},
	)

	// Print the detailed report to stdout
	result.Print()
}
The Result.Print() method outputs a structured report divided into several sections:
Basic throughput and volume statistics.
A detailed look at how long operations took.
Measures how consistent the system is.
Coefficient of variation: StdDev / Mean. Used to grade stability (e.g., <5% is “Excellent”).
An ASCII bar chart visualizing the distribution of latencies. It uses color coding (Green/Yellow/Red) to indicate frequency density.
Range         Freq   Distribution Graph
100µs-200µs   500    ██████ (5.0%)
200µs-400µs   8000   ██████████████████████ (80.0%)
...
A condensed graph showing performance over time (1-second buckets). Helps identify degradation or cold starts.
Timeline: [ ▅▇██▆▄ ] (Max: 5000 ops/s)
The runner automatically evaluates the results and prints warnings or pass/fail statuses.
While this runner is robust for general application testing, users should be aware of the following constraints:
The runner retains every latency sample in memory to provide precise percentiles (P99, P99.9) without approximation errors.
The setup() function runs sequentially inside the worker loop.
While setup() time is excluded from Latency metrics, it is included in the Real Throughput calculation. If setup() is slow (e.g., creates a complex object), you may see confusing results: low Latency (fast work) but low Throughput (slow loop).
The runner uses standard time.Now() and time.Since().
The visualization is hardcoded to 20 buckets using an exponential scale.
The runner signals workers to stop using an atomic flag, but it waits for the current operation to finish.