Performance baselines
A performance target snapshots load-test metrics and compares them with Dungbeetle's numeric-tolerance engine — the same structured-snapshot-first approach applied to terminal and DOM output, now for performance. Three tool parsers ship today: k6 (the default), Apache Benchmark (ab), and autocannon; all normalize into the same metrics shape, so tolerances and diffs work identically.
Requirement
For k6 targets, k6 must be installed and on PATH (like the optional Playwright browser, it is not bundled), unless you snapshot a pre-exported summary with summary.
Configure a target
{
"kind": "performance",
"name": "api-load",
"script": "perf/script.js",
"metrics": ["http_req_duration", "http_reqs", "checks"]
}
Options:
- script: a k6 script. Dungbeetle runs k6 run --summary-export … and snapshots the result.
- summary: alternatively, point at an existing k6 --summary-export JSON file to snapshot without running k6 (useful in CI where k6 ran separately).
- metrics: restrict the snapshot to selected metric names (default: all).
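The choice between script and summary, and the effect of metrics, can be pictured with a short Python sketch. It is an illustration of the behavior described above, not Dungbeetle's implementation: load_k6_summary and filter_metrics are hypothetical helpers, and the export is assumed to be a JSON object with a top-level "metrics" map.

```python
import json
import subprocess
import tempfile
from pathlib import Path
from typing import List, Optional


def load_k6_summary(target: dict) -> dict:
    """Return the k6 summary for a target (hypothetical helper)."""
    if "summary" in target:
        # Pre-exported file: no k6 binary is needed.
        return json.loads(Path(target["summary"]).read_text())
    # Otherwise run k6 and have it export the end-of-test summary as JSON.
    with tempfile.TemporaryDirectory() as tmp:
        out = Path(tmp) / "summary.json"
        subprocess.run(
            ["k6", "run", f"--summary-export={out}", target["script"]],
            check=True,
        )
        return json.loads(out.read_text())


def filter_metrics(summary: dict, names: Optional[List[str]]) -> dict:
    """Keep only the requested metric names; None keeps everything."""
    metrics = summary.get("metrics", {})
    if names is None:
        return dict(metrics)
    return {name: metrics[name] for name in names if name in metrics}
```

With a summary path the k6 branch is never reached, which is why that mode suits CI jobs where k6 ran in an earlier step.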
Snapshot model
A performance snapshot normalizes a k6 summary into a stable, diffable shape:
{
"kind": "performance",
"tool": "k6",
"metrics": {
"http_req_duration": { "avg": 12.346, "min": 5.1, "med": 11, "max": 45.2, "p90": 20.1, "p95": 28.4 },
"http_reqs": { "count": 1500, "rate": 249.8 },
"checks": { "passes": 1500, "fails": 0, "value": 1 }
}
}
- k6's percentile keys (p(95)) are renamed to path-friendly names (p95).
- Values are rounded to 3 decimals for readable baselines.
- Non-numeric stats are dropped.
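The three rules above can be sketched in a few lines of Python. This is a rough illustration under stated assumptions, not Dungbeetle's code: normalize_k6_summary is a hypothetical name, the input is assumed to be a k6 summary export with a top-level "metrics" map, and only whole-number percentile keys are renamed.

```python
import re
from numbers import Real

# k6 names percentiles like "p(95)"; the snapshot uses "p95".
# Fractional percentiles such as "p(99.9)" are left untouched in this sketch.
_PERCENTILE = re.compile(r"^p\((\d+)\)$")


def normalize_k6_summary(summary: dict) -> dict:
    """Rename percentile keys, round to 3 decimals, drop non-numeric stats."""
    metrics = {}
    for name, stats in summary.get("metrics", {}).items():
        clean = {}
        for key, value in stats.items():
            # bool is excluded explicitly because it subclasses int.
            if isinstance(value, bool) or not isinstance(value, Real):
                continue
            match = _PERCENTILE.match(key)
            if match:
                key = "p" + match.group(1)
            clean[key] = round(value, 3)
        metrics[name] = clean
    return {"kind": "performance", "tool": "k6", "metrics": metrics}
```

Rounding before writing the baseline keeps committed snapshots short and avoids diffs that differ only in floating-point noise.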
Tolerances
Performance numbers vary run to run, so comparison relies on comparison.numericTolerance — set a relative tolerance rather than expecting exact equality:
{ "comparison": { "numericTolerance": { "absolute": 0, "relative": 0.2 } } }
A metric that moves within tolerance passes; one that regresses beyond it fails with a percentage-delta diff:
~ http_req_duration.p95: 28.4 → 60 (+111.3%)
Apache Benchmark (ab)
Set tool: "ab" and give the target the benchmark command to run (or a saved output file via summary):
{
"kind": "performance",
"name": "homepage-bench",
"tool": "ab",
"command": "ab -n 100 -c 5 http://127.0.0.1:8000/",
"metrics": ["requests", "total_ms", "percentiles_ms", "document"]
}
The plain-text summary normalizes into the same shape: requests (complete/failed/per_second), duration_ms, the connection-times table (connect_ms / processing_ms / waiting_ms / total_ms), percentiles_ms (p50–p100), and document.length_bytes. The last is kept on purpose, because a changed response size means the page itself changed:
~ document.length_bytes: 58 → 414 (+613.8%)
ab ships with macOS and comes in the httpd-tools package on most Linux distributions (apache2-utils on Debian and Ubuntu), so this is often the zero-install way to put a latency baseline on an endpoint.
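To make the ab normalization concrete, here is a minimal Python sketch that pulls a few of the groups named above out of ab's plain-text report. It covers only requests, document, and percentiles_ms; parse_ab is a hypothetical helper, the sample values are invented for illustration, and Dungbeetle's actual parser may work differently.

```python
import re

# Abbreviated sample of ab's plain-text report (values are illustrative).
SAMPLE = """\
Document Length:        58 bytes
Time taken for tests:   0.123 seconds
Complete requests:      100
Failed requests:        0
Requests per second:    813.01 [#/sec] (mean)

Percentage of the requests served within a certain time (ms)
  50%      6
  95%     10
 100%     12 (longest request)
"""


def _find(pattern: str, text: str) -> str:
    """Return the first capture group, or fail loudly if the line is absent."""
    match = re.search(pattern, text, re.MULTILINE)
    if match is None:
        raise ValueError(f"ab output is missing a line matching {pattern!r}")
    return match.group(1)


def parse_ab(text: str) -> dict:
    """Extract a subset of ab's report into the shared metrics shape."""
    percentiles = re.findall(r"^[ \t]*(\d+)%[ \t]+(\d+)", text, re.MULTILINE)
    metrics = {
        "requests": {
            "complete": int(_find(r"^Complete requests:\s+(\d+)", text)),
            "failed": int(_find(r"^Failed requests:\s+(\d+)", text)),
            "per_second": float(_find(r"^Requests per second:\s+([\d.]+)", text)),
        },
        "document": {
            "length_bytes": int(_find(r"^Document Length:\s+(\d+) bytes", text)),
        },
        "percentiles_ms": {f"p{pct}": int(ms) for pct, ms in percentiles},
    }
    return {"kind": "performance", "tool": "ab", "metrics": metrics}
```

Calling parse_ab(SAMPLE) yields a snapshot whose document.length_bytes is 58, the kind of value that would produce the diff line shown above if the page later grew.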
autocannon
Set tool: "autocannon" with a --json command (or a saved output via summary):
{
"kind": "performance",
"name": "home-load",
"tool": "autocannon",
"command": "npx autocannon -d 5 -c 10 --json http://127.0.0.1:8000/",
"metrics": ["latency", "counts"]
}
The JSON summary normalizes into latency / requests / throughput stat groups (mean, stddev, min/max, full percentiles), counts (errors, timeouts, status classes; a regression here is signal even when timing holds), and run (duration, connections, pipelining, so a changed benchmark shape is itself a named diff, not silently different numbers). Timestamps never land in the snapshot.
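Because all three parsers end in the same metrics shape, a single comparison routine can serve every tool. The Python sketch below shows one way such a comparison might work; it is not Dungbeetle's engine. It assumes a value passes when its change is within either the absolute or the relative bound, flags movement in both directions, and ignores keys present on only one side. The names flatten, within_tolerance, and diff_snapshots are hypothetical.

```python
def flatten(metrics: dict, prefix: str = "") -> dict:
    """Flatten nested stat groups into dotted paths such as 'latency.mean'."""
    flat = {}
    for key, value in metrics.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


def within_tolerance(old: float, new: float, absolute: float, relative: float) -> bool:
    # Assumption: passing either bound is enough.
    delta = abs(new - old)
    return delta <= absolute or delta <= relative * abs(old)


def diff_snapshots(baseline: dict, current: dict,
                   absolute: float = 0.0, relative: float = 0.2) -> list:
    """Return one '~ path: old → new (+x%)' line per out-of-tolerance value."""
    old, new = flatten(baseline), flatten(current)
    lines = []
    for path in sorted(old.keys() & new.keys()):
        a, b = old[path], new[path]
        if within_tolerance(a, b, absolute, relative):
            continue
        # A percentage is undefined when the baseline value is zero.
        pct = f" ({(b - a) / a * 100:+.1f}%)" if a else ""
        lines.append(f"~ {path}: {a} → {b}{pct}")
    return lines
```

Under these assumptions, a baseline of 0 for an error count fails on any nonzero value, since both bounds collapse to zero; that matches the idea that a regression in counts is signal even when timing holds.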
Try it
The examples/perf example uses a committed summary.json, so it needs no k6 install:
dungbeetle update --config examples/perf/dungbeetle.config.json # write the baseline
dungbeetle test --config examples/perf/dungbeetle.config.json # compare