A card-game rules engine in Lean4

In progress

A formally-specified rules engine for a Magic: the Gathering–like card game, built in Lean4. Two long-term ambitions: machine-checked correctness of the rules engine, and a provably-correct shortcut system for game-state loops. Medium-term target: a roguelike playable on top of it, where decks are built by acquiring cards and emblems across a game map.

The two ambitions

The whole project orients around two things, and they're what justify the cost of doing this in Lean4 at all:

Machine-checked correctness of the rules engine. Strong, formally-verified statements about how the engine behaves, including termination of effect resolution cycles, deterministic state transitions, and freedom from major bug classes.
A provably-correct shortcut system. Magic's rules accept that some game states are only reachable via demonstrable loops; the standard answer is "describe the loop, both players agree, fast-forward to its end." Doing this only when an engine-level theorem certifies it's safe turns a dangerous but powerful rule into a verified behaviour rather than a heuristic.

Both are eased substantially by Lean4's dependent types, and neither is realistic in a language with less reach at the type level.

Why Lean4

Building a card game engine as complicated as this requires symbolic interpretation of a domain-specific language (DSL). I investigated multiple candidate languages to build this system. Rust's type system did not have the expressiveness I wanted when building the DSL. Haskell was better in this area, but had a lot of strange design decisions I wasn't ready to accept and was still missing some expressiveness. Lean was a language I was already comfortable in due to previous projects and offered a full dependent type system. It also offers me the ability to create strong formal correctness guarantees of the engine.

Choosing Lean was also an excuse to push it into a domain it isn't normally used for, which is a great stress test for my abilities as well as a powerful learning experience.

Scope

"Magic: the Gathering-like" here means the engine has to support:

A stack with priority passing and resolution
Replacement effects and triggered abilities
Continuous effects with layered application
Per-object state distinct from per-card state
Demonstrable deterministic loops, resolved by the shortcut system above

Beyond ordinary Magic, the project is shaped from the start around custom cards, a roguelike mode where the deck is built by acquiring cards and emblems across a game map, and other non-standard formats. The engine has to be flexible at the card-text level and support dynamic run-time rewriting to fully achieve these goals.

Two design patterns under the hood

Two design patterns carry most of the conceptual weight, and they're worth naming because the next section is about how they're being applied.

HKD over the result data type

The core action types (EditHKD, SemActionHKD, ActionHKD, ActionGroupHKD) are parameterized by a type family f : ElemTy → Type, and the same inductive structure gets instantiated multiple ways:

Concrete form — sub-terms are fully-evaluated runtime values. This is the action that actually happened, the SemAction that's about to be applied, the Edit that's been resolved.
Description form — sub-terms are typed DSL expressions. This is what a card's text compiles to: an ActionExpr that describes an action subject to the binders, choices, and targets in scope.

The reuse is the point. Same conceptual structure, multiple readings; the evaluator bridges them. The choice of leaf representation is one place the pattern keeps earning its keep — the same HKD machinery accommodates leaves that carry their proof of well-typedness with them (Expr Γ t) and leaves that don't (RExpr t, the phantom-indexed version), so a description-form action type can be either fully proof-encoded or stripped down for serialization. A more recent experiment uses ActionHKD with a MatchExpr instantiation to represent conditional filtering of those same actions, reapplying the same scaffolding to a third role rather than building a parallel one.

This was an early decision and isn't heavily documented inside the code — it's mostly carried by the shape of the type signatures.

Extrinsic typing: data + `WT` predicate

The other pattern was a later decision. The DSL expression layer initially baked typing information directly into the inductive — fully intrinsic, types as indices. That hit a wall of type-universe issues. The way out was a pattern already used inside the Lean standard library: keep the data plain (an untyped AST, UExpr), put well-typedness in a separate Prop-valued predicate (WTExpr : CtxEnv → UExpr → Ty → Prop), and bundle them in a convenience structure (Expr Γ t := { raw : UExpr, wt : WTExpr Γ raw t }) for the smart-constructor surface.

This split is what makes the universe problems go away and what makes serialization, rewriting, and substitution tractable — the data type carries no proofs, and "what term is this?" and "is it well-typed here?" become separable concerns. The current rewrite extends the same idea up into Action / Plan / Choice / Target, which is the substance of the next section.

Current state

The codebase is mid-refactor and is not currently expected to compile. Foundational pieces — the DSL expression layer, the core card and game-state types, the state machine — are in shape; the effect engine has been rewritten twice, and the current rewrite is the substantial in-flight work.

The driving design decision of the current rewrite is around how the Action / Plan / Choice / Target layer encodes well-typedness, given that these constructs both bind variables and contain typed sub-expressions. The chosen direction:

Extrinsic typing. Leaves carry their result type as a cheap phantom index but no scoping context (Γ); the binder-scoping fact lives in a single per-structure WTScope : CtxEnv → … → Prop rather than threaded through constructor signatures. This makes serialization, rewriting, and substitution work without existential transport.
HKD only at the Action tier. The description/concrete duality is real and exercised only at Edit / SemAction / Action / ActionGroup. Plan, Choice, and Target are expression-space only and drop the HKD entirely.
Three tiers. Actions (HKD), Plan/Choice/Target (no HKD, RExpr leaves, structure-level WT), Expressions (unchanged — UExpr / WTExpr / Expr stay).

The one genuinely hard problem left is how the well-typedness predicate for plan_seq should thread bindings through a sequence: an earlier step can bind variables (via a choice or target) that later steps refer to, so later steps live in a context extended by what earlier steps introduced. The committed clause for this case is an explicit placeholder; deciding the right formulation here is the open question that's gating the rest of the rewrite.

Future plans

The medium-term goal is a fully functional Magic-like engine that supports the creation of new cards with novel abilities at runtime, backed by the verification guarantees described above. Beyond that, ambitions for the project include:

Support for compressing patterns across many objects into a single "chunk", e.g. storing 100 creatures each with one more health than the last as a single description rather than 100 cloned records.
A mid-game rewind feature to revert accidental misclicks, or to support a loose, timer-based player-priority system.
Translation between natural language and the internal card DSL, so players can write card effects in English and have them automatically work in the engine.

Stack

Lean 4 Dependent types HKD Formal methods Game engines Roguelike

Source

Repository is local-only at the moment; happy to share on request.