Game snapshots

A game target scripts a walkthrough of your game — inputs, waits, assertions — and snapshots semantic game state at named markers, with optional screenshots at the same markers. The state diff is the gate, so a gameplay change reads like one:

```
~ $.markers.game-over.state.Player.position[0]: 398 → 438
~ $.markers.game-over.tick: 174 → 194
```

The CLI is engine-agnostic: it launches the engine and speaks a small versioned protocol (stdio JSON-lines) to a thin engine-side adapter. Godot 4.x is the first supported engine.

Configure a target

```json
{
  "kind": "game",
  "name": "godot-demo",
  "engine": "godot",
  "project": "examples/game/godot",
  "walkthrough": "examples/game/walkthrough.json"
}
```

Options:

engine (required) — adapter id; "godot" is the only engine today.
project (required) — the engine project directory (contains project.godot).
walkthrough (required) — path to the walkthrough script (JSON, below).
mode — "semantic" (default) runs fully headless and captures state only; "visual" adds one screenshot per marker and needs a display (a brief window locally, xvfb-run on Linux CI).
enginePath — engine binary. Falls back to DUNGBEETLE_GODOT_PATH, then godot on PATH.
seed — RNG seed applied before the run (default 0).
physicsFps — fixed physics tick rate (default 60).
screenshotMode — "advisory" (default): visual changes are reported but don't fail the run; "strict": visual changes gate like any other diff.
pixelTolerance — visual tolerance for this target (maxChangedRatio, perChannelThreshold). Defaults to the game kind's non-zero tolerance (0.002 / 3), which absorbs GPU/driver rasterization drift, not gameplay changes.
markers — per-marker overrides: { "game-over": { "pixelTolerance": { "maxChangedRatio": 0 } } }.
timeoutMs — whole-run watchdog (default: the lifecycle wait timeout). A broken game script can never hang CI: silence is fatal.
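To make the two pixelTolerance fields concrete, here is a minimal Python sketch of one plausible reading. This page does not spell out the comparison rule, so the semantics are assumptions: a pixel counts as changed when any channel differs by more than perChannelThreshold, and a marker fails when the changed fraction exceeds maxChangedRatio. The function names are hypothetical and not part of the CLI.

```python
from typing import Sequence, Tuple

Pixel = Tuple[int, int, int, int]  # RGBA, 0-255 per channel


def changed_ratio(baseline: Sequence[Pixel], current: Sequence[Pixel],
                  per_channel_threshold: int = 3) -> float:
    """Fraction of pixels whose largest channel delta exceeds the threshold.

    Assumed rule: the page only names the two knobs, not the exact math.
    """
    if len(baseline) != len(current):
        raise ValueError("images must have the same number of pixels")
    if not baseline:
        return 0.0
    changed = sum(
        1
        for a, b in zip(baseline, current)
        if max(abs(x - y) for x, y in zip(a, b)) > per_channel_threshold
    )
    return changed / len(baseline)


def exceeds_tolerance(baseline: Sequence[Pixel], current: Sequence[Pixel],
                      max_changed_ratio: float = 0.002,
                      per_channel_threshold: int = 3) -> bool:
    """True when the visual change is larger than the tolerance allows."""
    ratio = changed_ratio(baseline, current, per_channel_threshold)
    return ratio > max_changed_ratio
```

Under this reading, a uniform one- or two-level shift in every pixel (typical rasterization drift) passes with the defaults, while a handful of strongly changed pixels in a small image does not. Setting maxChangedRatio to 0 for a marker makes any pixel beyond the channel threshold count as a change.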

Walkthrough scripts

A walkthrough is a JSON file with a steps array. Five step types:

```json
{
  "description": "Menu → collect three squares → game over.",
  "steps": [
    { "wait": 10 },
    { "screenshot": "menu" },
    { "input": "ui_accept" },
    { "waitFor": "Main:started == true", "timeoutTicks": 60 },
    { "input": "move_right", "mode": "down" },
    { "waitFor": "Player:collected >= 3", "timeoutTicks": 600 },
    { "input": "move_right", "mode": "up" },
    { "screenshot": "game-over" },
    { "assert": "Player:collected >= 3" }
  ]
}
```

input — press an input-map action (not a raw key: action-level injection at physics-tick boundaries is measurably deterministic; raw event injection is not). mode is "tap" (default; ticks sets the hold duration), "down", or "up".
wait — advance N physics ticks.
waitFor — poll a predicate each tick until true; timeoutTicks is mandatory, so a walkthrough can never wait forever.
screenshot — a named marker: captures semantic state always, plus a screenshot in visual mode. Names are kebab-case and unique; they key the snapshot, the baseline PNGs, and the review UI sections.
assert — a predicate that must hold; fails the step (with its index) otherwise.
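The rules in the list above (five step types, a mandatory timeoutTicks on waitFor, three input modes, kebab-case and unique marker names) can be checked mechanically. The following Python sketch is an illustration, not the CLI's actual validator: the function name, the message wording, and the zero-based step indices are all assumptions.

```python
import re
from typing import Any, Dict, List

STEP_TYPES = ("input", "wait", "waitFor", "screenshot", "assert")
INPUT_MODES = ("tap", "down", "up")
KEBAB_CASE = re.compile(r"^[a-z0-9]+(-[a-z0-9]+)*$")


def validate_walkthrough(doc: Dict[str, Any]) -> List[str]:
    """Return a list of issues, each prefixed with the offending step index."""
    steps = doc.get("steps")
    if not isinstance(steps, list):
        return ["steps: expected an array"]

    issues: List[str] = []
    seen_markers = set()
    for index, step in enumerate(steps):
        kinds = [k for k in STEP_TYPES if k in step] if isinstance(step, dict) else []
        if len(kinds) != 1:
            issues.append(f"step {index}: expected exactly one of {', '.join(STEP_TYPES)}")
            continue
        kind = kinds[0]
        if kind == "waitFor" and not isinstance(step.get("timeoutTicks"), int):
            # timeoutTicks is mandatory so a walkthrough can never wait forever.
            issues.append(f"step {index}: waitFor requires timeoutTicks")
        elif kind == "input" and step.get("mode", "tap") not in INPUT_MODES:
            issues.append(f"step {index}: mode must be one of {', '.join(INPUT_MODES)}")
        elif kind == "screenshot":
            name = step["screenshot"]
            if not isinstance(name, str) or not KEBAB_CASE.match(name):
                issues.append(f"step {index}: marker name must be kebab-case")
            elif name in seen_markers:
                issues.append(f"step {index}: duplicate marker name '{name}'")
            else:
                seen_markers.add(name)
    return issues
```

The example walkthrough above yields an empty list with this sketch. For real projects, dungbeetle doctor is the authoritative check.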

Predicates read node properties by scene path: "NodePath:property <op> value" with ==, !=, >=, <=, >, <. "Main" (the scene root's name) and "." both address the root.
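As an illustration of the predicate grammar, this Python sketch parses and evaluates a predicate against a plain dictionary that stands in for the scene tree. The real evaluation happens inside the engine-side adapter. The value literal rules (true/false, integers, floats, otherwise a string) and the evaluate helper are assumptions made for this example.

```python
import operator
import re
from typing import Any, Dict

OPS = {
    "==": operator.eq, "!=": operator.ne,
    ">=": operator.ge, "<=": operator.le,
    ">": operator.gt, "<": operator.lt,
}

# Two-character operators come first so ">=" is not read as ">" followed by "=".
PREDICATE = re.compile(
    r"^\s*(?P<node>[^:\s]+):(?P<prop>\w+)\s*"
    r"(?P<op>==|!=|>=|<=|>|<)\s*(?P<value>.+?)\s*$"
)


def parse_value(text: str) -> Any:
    """Assumed literal rules: booleans, then int, then float, else string."""
    if text == "true":
        return True
    if text == "false":
        return False
    for cast in (int, float):
        try:
            return cast(text)
        except ValueError:
            pass
    return text.strip('"')


def evaluate(predicate: str, state: Dict[str, Dict[str, Any]],
             root_name: str = "Main") -> bool:
    """Evaluate "NodePath:property <op> value" against a dict of node properties."""
    match = PREDICATE.match(predicate)
    if match is None:
        raise ValueError(f"not a valid predicate: {predicate!r}")
    # "." and the root's own name both address the scene root.
    node = root_name if match["node"] == "." else match["node"]
    actual = state[node][match["prop"]]
    return OPS[match["op"]](actual, parse_value(match["value"]))
```

With state = {"Main": {"started": True}, "Player": {"collected": 3}}, both "Main:started == true" and ".:started == true" evaluate to true, as does "Player:collected >= 3".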

Exposing game state

Semantic state is an allowlist: only nodes in the engine-side dungbeetle group are captured (class + position), and a node can contribute game-specific fields via get_dungbeetle_state():

```gdscript
func _ready() -> void:
    add_to_group("dungbeetle")

func get_dungbeetle_state() -> Dictionary:
    return {"collected": collected, "lives": lives}
```

Snapshot model

```json
{
  "kind": "game",
  "engine": "godot",
  "markers": {
    "menu":      { "tick": 6,  "state": { "Player": { "position": [100, 160] } } },
    "game-over": { "tick": 174, "state": { "Player": { "position": [398, 160] } } }
  },
  "screenshots": { "menu": { "sha256": "…" }, "game-over": { "sha256": "…" } }
}
```

Engine and adapter versions are runtime metadata — an engine patch release never diffs a baseline. Version compatibility is doctor's job.
Screenshots are stored as digests in the baseline; the PNGs live alongside it as <name>.<marker>.png.
Marker tick is part of the snapshot: the run ending 20 ticks later is a gameplay change worth seeing.
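To show how a snapshot of this shape turns into path-style diff lines, here is a small recursive diff in Python. It is a sketch, not the CLI's diff engine: the "~" line format follows the example at the top of this page, while the "+" and "-" lines for added and removed keys are an assumption.

```python
from typing import Any, Iterator


def diff_paths(old: Any, new: Any, path: str = "$") -> Iterator[str]:
    """Yield one line per changed leaf, addressed by a JSONPath-like string."""
    if isinstance(old, dict) and isinstance(new, dict):
        # Sorted keys keep the output stable from run to run.
        for key in sorted(old.keys() | new.keys()):
            child = f"{path}.{key}"
            if key not in new:
                yield f"- {child}: {old[key]!r}"  # assumed format for removals
            elif key not in old:
                yield f"+ {child}: {new[key]!r}"  # assumed format for additions
            else:
                yield from diff_paths(old[key], new[key], child)
    elif isinstance(old, list) and isinstance(new, list) and len(old) == len(new):
        for index, (a, b) in enumerate(zip(old, new)):
            yield from diff_paths(a, b, f"{path}[{index}]")
    elif old != new:
        yield f"~ {path}: {old} → {new}"


baseline = {"markers": {"game-over": {
    "tick": 174, "state": {"Player": {"position": [398, 160]}}}}}
current = {"markers": {"game-over": {
    "tick": 194, "state": {"Player": {"position": [438, 160]}}}}}

for line in diff_paths(baseline, current):
    print(line)
```

Run against these two snapshots, the sketch prints the position change followed by the tick change, matching the two diff lines shown at the top of this page.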

Determinism

The adapter enforces determinism on every run, with no opt-out: seeded RNG, fixed physics timestep, vsync off, and action-level input injection. On identical hardware, repeated runs measure byte-identical in both state and pixels.

What that means across platforms:

| scope | semantic state | pixels |
| --- | --- | --- |
| same machine | byte-identical | byte-identical |
| same OS + GPU class | identical | default tolerance absorbs driver drift |
| across OS / GPU | identical (promised) | not promised: generate baselines where you compare; screenshots stay advisory |

Verify it empirically with the flake harness:

```
dungbeetle flake --config dungbeetle.config.json --repeat 5
```

which captures each target N times with no baseline and reports run-to-run divergence per marker (non-zero exit on any flake — wire it into CI).
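The idea behind a run-to-run divergence check can be sketched in a few lines of Python: hash each marker's state per run and flag any marker that produces more than one distinct digest, or that does not appear in every run. This is an illustration only. The flake command's internals are not described here, and state_digest and flaky_markers are hypothetical names.

```python
import hashlib
import json
from typing import Any, Dict, List, Set


def state_digest(marker: Dict[str, Any]) -> str:
    """Digest of a marker's data, using canonical JSON so key order is irrelevant."""
    canonical = json.dumps(marker, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def flaky_markers(runs: List[Dict[str, Dict[str, Any]]]) -> List[str]:
    """Names of markers whose captured data differs between runs.

    Each run maps marker name to that marker's data (tick and state).
    """
    digests: Dict[str, Set[str]] = {}
    appearances: Dict[str, int] = {}
    for run in runs:
        for name, marker in run.items():
            digests.setdefault(name, set()).add(state_digest(marker))
            appearances[name] = appearances.get(name, 0) + 1
    return sorted(
        name for name in digests
        if len(digests[name]) > 1 or appearances[name] != len(runs)
    )
```

A CI wrapper around a check like this would exit non-zero whenever the returned list is non-empty.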

Doctor checks

dungbeetle doctor reports, per target: project exists, walkthrough valid (with step indices on every issue), adapter installed + protocol compatibility (with an upgrade hint naming which side to update), engine version (runs --version), the enforced determinism knobs, marker-override typos, and a size warning for visual mode on anonymous pushes.

The adapter

The Godot addon lives at adapters/godot in the CLI repository: copy addons/dungbeetle/ into your project and enable the plugin (or register the Dungbeetle autoload by hand). It is inert outside CLI-launched runs — shipping it in your game costs nothing. Engine breakage is fixed in adapter patch releases, never as CLI churn; see the adapter README for the supported Godot versions and compatibility policy.

Try it

The Godot example walks the whole loop — green run, deliberate gameplay change, semantic diff — in about two minutes.
