Godot — snapshot-test your game flow

Script a walkthrough of your game, mark the moments that matter, and let Dungbeetle fail CI when the gameplay changes — with a diff you can read. This guide runs the example game that ships in the CLI repository: title menu → collect three squares → game over.

By the end you'll have: a green headless run in ~3 seconds, a deliberate gameplay change caught as a two-line semantic diff, and a flake harness proving the run is deterministic.

Screencast — the whole loop, live: doctor → baseline → green ci → a collectible moves 40px → ci fails with the gameplay diff → revert → flake proves 5/5 deterministic runs.

Flow video — the terminal session replayed, then the failing visual-mode run's HTML report: the gameplay diff plus before/after/diff screenshots at every marker.

Prerequisites

  • A Godot 4.x binary (the editor download is fine).
  • The Dungbeetle CLI repository (the example lives in examples/game/), or your own project with the adapter installed.

Point Dungbeetle at your Godot binary once (the path below is a typical macOS location; adjust it for your platform):

sh
export DUNGBEETLE_GODOT_PATH="/Applications/Godot.app/Contents/MacOS/Godot"

1. Look at the pieces

Three files make a game target:

The config (examples/game/dungbeetle.config.json) — a game target pointing at a Godot project and a walkthrough:

json
{
  "kind": "game",
  "name": "godot-demo",
  "engine": "godot",
  "project": "examples/game/godot",
  "walkthrough": "examples/game/walkthrough.json"
}

The walkthrough (examples/game/walkthrough.json) — the flow, with markers at menu, mid-run, and game over:

json
{
  "steps": [
    { "wait": 10 },
    { "screenshot": "menu" },
    { "input": "ui_accept" },
    { "waitFor": "Main:started == true", "timeoutTicks": 60 },
    { "input": "move_right", "mode": "down" },
    { "waitFor": "Main:game_over == true", "timeoutTicks": 600 },
    { "input": "move_right", "mode": "up" },
    { "wait": 10 },
    { "screenshot": "game-over" },
    { "assert": "Player:collected >= 3" }
  ]
}
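To make the waitFor and assert steps concrete, here is a minimal Python sketch of how a condition such as "Player:collected >= 3" could be checked against marker state. It is not Dungbeetle's evaluator: the grammar (Node:field, an operator, a literal) is inferred only from the three expressions in the walkthrough above, and the names evaluate and parse_literal are hypothetical.

```python
import operator
import re

# Assumed grammar: "<Node>:<field> <op> <literal>". The walkthrough only shows
# "==" and ">=" with true and integer literals; the other operators are guesses.
_OPS = {
    "==": operator.eq,
    "!=": operator.ne,
    ">=": operator.ge,
    "<=": operator.le,
    ">": operator.gt,
    "<": operator.lt,
}
_EXPR = re.compile(r"^(\w+):(\w+)\s*(==|!=|>=|<=|>|<)\s*(\S+)$")


def parse_literal(text):
    """Turn 'true', 'false', or a number into a Python value."""
    if text in ("true", "false"):
        return text == "true"
    try:
        return int(text)
    except ValueError:
        return float(text)


def evaluate(expr, state):
    """Evaluate one condition against {node_name: {field: value}}."""
    match = _EXPR.match(expr.strip())
    if match is None:
        raise ValueError(f"unsupported expression: {expr!r}")
    node, field, op, literal = match.groups()
    return _OPS[op](state[node][field], parse_literal(literal))


# Hypothetical state, shaped like what the opted-in nodes might expose.
state = {"Main": {"started": True, "game_over": False}, "Player": {"collected": 3}}
print(evaluate("Player:collected >= 3", state))
```

Under this reading, a waitFor step re-checks its condition each tick until it holds or timeoutTicks runs out, while an assert step checks it once.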

The game's opt-in — nodes join the dungbeetle group and expose the fields that matter (examples/game/godot/player.gd):

gdscript
func _ready() -> void:
    add_to_group("dungbeetle")

func get_dungbeetle_state() -> Dictionary:
    return {"collected": collected}

2. Validate the setup

sh
npm run game:doctor
PASS game-walkthrough:godot-demo - Walkthrough is valid …
PASS game-adapter:godot-demo - Adapter 0.1.0 installed (protocol v1–v1, CLI speaks v1).
PASS game-engine:godot-demo - Engine binary … reports version 4.7.stable…
PASS game-determinism:godot-demo - Deterministic run enforced: seed 0, 60 physics ticks/s …

3. Capture the baseline

sh
npm run game:update

The run is fully headless — no window, no Xvfb — because the default semantic mode snapshots game state, not pixels. It takes about 3 seconds: the engine boots, the adapter holds the scene behind a deterministic gate, the walkthrough drives it tick by tick, and each marker records an allowlisted scene-tree state.
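The tick-by-tick idea can be illustrated with a toy model in Python. Everything here is an assumption for illustration: ToyGame is a hypothetical stand-in for the Godot scene, the speed of 2 pixels per tick is made up, and waitFor is a Python callable rather than an expression string. The sketch only shows why a fixed-step loop with scripted input yields the same marker state on every run.

```python
class ToyGame:
    """Hypothetical stand-in for the scene; not the shipped example game."""

    def __init__(self, item_positions, speed=2.0):
        self.items = sorted(item_positions)
        self.speed = speed   # pixels per physics tick (assumed value)
        self.x = 0.0
        self.collected = 0
        self.held = set()    # actions currently held down

    def tick(self):
        """Advance exactly one fixed physics step."""
        if "move_right" in self.held:
            self.x += self.speed
        while self.items and self.x >= self.items[0]:
            self.items.pop(0)
            self.collected += 1

    def state(self):
        # Only allowlisted fields leave the game, like get_dungbeetle_state().
        return {"Player": {"collected": self.collected, "x": self.x}}


def run(game, steps):
    """Execute steps in order; return {marker: {"tick": ..., "state": ...}}."""
    tick, markers = 0, {}
    for step in steps:
        if "wait" in step:
            for _ in range(step["wait"]):
                game.tick()
                tick += 1
        elif "input" in step:
            mode = step.get("mode")
            if mode == "down":
                game.held.add(step["input"])
            elif mode == "up":
                game.held.discard(step["input"])
        elif "waitFor" in step:
            remaining = step["timeoutTicks"]
            while not step["waitFor"](game.state()):
                if remaining == 0:
                    raise TimeoutError("waitFor timed out")
                game.tick()
                tick += 1
                remaining -= 1
        elif "screenshot" in step:
            markers[step["screenshot"]] = {"tick": tick, "state": game.state()}
    return markers


# Simplification: waitFor is a Python callable here, not an expression string.
STEPS = [
    {"wait": 10},
    {"screenshot": "menu"},
    {"input": "move_right", "mode": "down"},
    {"waitFor": lambda s: s["Player"]["collected"] >= 3, "timeoutTicks": 600},
    {"input": "move_right", "mode": "up"},
    {"wait": 10},
    {"screenshot": "game-over"},
]

markers = run(ToyGame([220.0, 320.0, 420.0]), STEPS)
print(markers["game-over"])
```

Because time advances only through tick() and input comes only from the script, two runs of the toy produce identical markers. Wall-clock time never enters the loop.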

4. Re-run and compare — green

sh
npm run game:ci
✅ PASSED game:godot-demo

5. Break the game, read the diff

Move the third collectible 40px farther, in examples/game/godot/main.gd:

diff
-const ITEM_POSITIONS := [220.0, 320.0, 420.0]
+const ITEM_POSITIONS := [220.0, 320.0, 460.0]

sh
npm run game:ci
❌ FAILED game:godot-demo
  ~ $.markers.game-over.state.Player.position[0]: 398 → 438
  ~ $.markers.game-over.tick: 174 → 194

That's the point of semantic-first: the player travelled farther and the run ended 20 ticks later. A pixel tool would show you a red smear; this reads as a gameplay change. Revert the change and the run is green again.
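A path-based diff like the one above can be sketched in a few lines of Python. This is an illustration, not Dungbeetle's implementation: the snapshot layout is inferred from the two paths in the output, the y coordinate is made up, and the "-" and "+" prefixes for removed and added keys are assumptions (the guide only shows "~").

```python
def diff(before, after, path="$"):
    """Yield one line per changed leaf, using JSONPath-like locations."""
    if isinstance(before, dict) and isinstance(after, dict):
        for key in sorted(before.keys() | after.keys()):
            child = f"{path}.{key}"
            if key not in after:
                yield f"- {child}: {before[key]}"      # assumed notation
            elif key not in before:
                yield f"+ {child}: {after[key]}"       # assumed notation
            else:
                yield from diff(before[key], after[key], child)
    elif (isinstance(before, list) and isinstance(after, list)
          and len(before) == len(after)):
        for index, (old, new) in enumerate(zip(before, after)):
            yield from diff(old, new, f"{path}[{index}]")
    elif before != after:
        yield f"~ {path}: {before} → {after}"


# Snapshot shape inferred from the diff paths; the y value (300) is made up.
baseline = {"markers": {"game-over": {
    "state": {"Player": {"position": [398, 300]}}, "tick": 174}}}
current = {"markers": {"game-over": {
    "state": {"Player": {"position": [438, 300]}}, "tick": 194}}}

for line in diff(baseline, current):
    print(line)
```

Walking the structure key by key is what lets each line name the exact field that moved, so unchanged values such as the y coordinate produce no output.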

6. Prove it's deterministic

sh
npm run game:flake
✅ game:godot-demo — 5/5 runs identical

dungbeetle flake --repeat N captures the target N times with no baseline and reports any run-to-run divergence per marker — the canary to wire into CI before you trust a new walkthrough.
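The check itself is simple to picture. The Python sketch below assumes a hypothetical capture() callable that returns one run's markers as a dict; it compares every later run against the first and lists the markers that diverge. It mirrors the described behavior only in outline and is not the CLI's code.

```python
def flake(capture, repeat):
    """Capture `repeat` times; report markers that differ from run 1.

    Returns (identical_run_count, {run_number: [divergent marker names]}).
    """
    if repeat < 1:
        raise ValueError("repeat must be at least 1")
    runs = [capture() for _ in range(repeat)]
    reference = runs[0]
    divergent = {}
    for number, run in enumerate(runs[1:], start=2):
        changed = sorted(
            marker
            for marker in reference.keys() | run.keys()
            if reference.get(marker) != run.get(marker)
        )
        if changed:
            divergent[number] = changed
    return repeat - len(divergent), divergent


def stable_capture():
    # Hypothetical deterministic run: same markers every time.
    return {"menu": {"tick": 10}, "game-over": {"tick": 174}}


identical, divergent = flake(stable_capture, repeat=5)
print(f"{identical}/5 runs identical", divergent)
```

No baseline is involved: the first capture acts as the reference, so the check works before any snapshot has been approved.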

7. Add screenshots (optional)

Set "mode": "visual" on the target and re-baseline. Each marker now also captures a screenshot (locally a window flashes briefly; on Linux CI, run under xvfb-run). Visual changes are advisory by default: they are reported next to the semantic diff and rendered as per-marker side-by-side and onion-skin comparisons in the cloud review UI, but they gate the run only if you set "screenshotMode": "strict".
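The gating rule can be written down as a small decision function. This Python sketch is an assumption-laden illustration: run_passes is a hypothetical name, and "advisory" is used as a label for the default because the guide names only the "strict" value.

```python
def run_passes(semantic_changes, visual_changes, screenshot_mode="advisory"):
    """Decide whether a run is green.

    semantic_changes / visual_changes: lists of reported differences.
    screenshot_mode: "strict" makes visual changes gate the run; any other
    value (here the assumed label "advisory") only reports them.
    """
    if semantic_changes:
        return False  # semantic state always gates
    if visual_changes and screenshot_mode == "strict":
        return False  # pixels gate only when explicitly asked to
    return True


# A changed screenshot alone does not fail the run by default...
print(run_passes([], ["menu.png"]))
# ...but it does once strict mode is set.
print(run_passes([], ["menu.png"], screenshot_mode="strict"))
```

Keeping pixels advisory by default means that rendering noise cannot fail CI unless you opt in to strict mode.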

8. Review in the cloud

Push a run to a Dungbeetle server and the walkthrough's semantic gameplay diff (marker states, positions, the pickup tick) is reviewable in a browser. In visual mode each marker (menu, mid-run, game-over) also carries a screenshot, so the review pairs the state diff with per-marker before/after/diff images. The images are advisory by default; semantic state stays the gate.

Screencast: sign in → open the failing run → the marker-state diff plus per-marker screenshot comparisons → approve and promote the new baselines.

Next steps

  • The full config, step, and determinism reference: game snapshots.
  • Push runs to the cloud for per-marker visual review — see cloud.
  • Installing the adapter into your own project: the adapter.

Source-available: CLI under FSL-1.1-ALv2, cloud server under BUSL-1.1. See Licensing.