A multiplayer poker stack, built with AI agents

Complete

A Rust Cargo workspace — Texas Hold'em rules engine, MCCFR trainer, async TCP server with SQLite persistence, and a Tauri / SolidJS desktop client — written almost entirely through AI coding agents as a deliberate study of where they help and where they fall apart.

Why

I wanted a project complex enough to stress current AI programming agents in realistic ways: persistent state, concurrent clients, real game rules, network protocol design, and a long enough lifespan that architectural debt would actually start to bite. Poker fits — the rules are non-trivial, side pots are complicated, and the client/server split forces decisions about who owns what.

The constraint was that I would not write code by hand. I would direct, review, and course-correct, but agents would do the typing. The point was to learn what that workflow actually feels like on a full-sized project.

Architecture

A single Cargo workspace, six Rust crates, plus a Tauri 2 + SolidJS shell in a submodule:

  • poker-engine — pure-Rust rules library: cards, a ~50 ns / 7-card evaluator, side-pot-aware betting, the EngineEvent stream, and the wire types. No I/O.
  • poker-trainer — external-sampling MCCFR with a pausable GameTree stepper, blueprint persistence, and a five-persona bot pool for dataset generation.
  • poker-server — async tokio TCP server. SQLite via sqlx, Argon2id passwords, length-prefixed MessagePack framing, idle / rate / wire-size limits, and reconnect with mid-hand seat takeover.
  • poker-client-core — pure synchronous state machine: Intent in, Effect stream + ClientView projection out. No tokio, no I/O, no platform code. Every transition is unit-tested.
  • poker-client-transport-native — tokio TCP transport, atomic session-key persistence, and a NativeClient facade that the Tauri shell consumes as a sync API.
  • poker-client-headless — test harness with an InMemoryTransport and a scripted-scenario DSL, used to drive the full client end-to-end against a real (or in-memory) server without any UI. Added after the failed Flutter implementation showed that the agents needed far better tooling for writing unit and integration tests.
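
To make the "length-prefixed framing" in poker-server concrete, here is a minimal sketch using only the standard library. It is not the project's code: the 4-byte big-endian header, the MAX_FRAME_LEN value, and the function names write_frame / read_frame are all assumptions for illustration, and the payload is treated as opaque bytes (in the real server it would be a MessagePack-encoded message).

```rust
use std::io::{self, Read, Write};

/// Hypothetical wire-size cap; the real server's limit is not stated in this write-up.
pub const MAX_FRAME_LEN: usize = 64 * 1024;

/// Write one frame: a 4-byte big-endian length header followed by the payload.
pub fn write_frame<W: Write>(w: &mut W, payload: &[u8]) -> io::Result<()> {
    if payload.len() > MAX_FRAME_LEN {
        return Err(io::Error::new(io::ErrorKind::InvalidInput, "frame too large"));
    }
    w.write_all(&(payload.len() as u32).to_be_bytes())?;
    w.write_all(payload)
}

/// Read one frame, rejecting an oversized length before allocating the buffer.
pub fn read_frame<R: Read>(r: &mut R) -> io::Result<Vec<u8>> {
    let mut len_buf = [0u8; 4];
    r.read_exact(&mut len_buf)?;
    let len = u32::from_be_bytes(len_buf) as usize;
    if len > MAX_FRAME_LEN {
        return Err(io::Error::new(io::ErrorKind::InvalidData, "frame too large"));
    }
    let mut payload = vec![0u8; len];
    r.read_exact(&mut payload)?;
    Ok(payload)
}
```

Checking the declared length before allocating is what lets a wire-size limit protect the server from a client that announces a huge frame and never sends it.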

The principles that hold it together — strict correctness with full-flow tests, a UI that only projects state, and resilience to interruption via server-driven resync — are pinned in AGENTS.md and enforced by an openspec/specs/ directory carrying normative requirements per subsystem.
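
The "UI only projects state" principle and the Intent in / Effect out shape of poker-client-core can be sketched roughly as follows. Only the names Intent, Effect, and ClientView come from the project; every variant, field, and transition here is a hypothetical stand-in, and the server snapshot is modeled as just one boolean to keep the example short.

```rust
/// Inputs from the UI or the transport. Variants are illustrative, not the real set.
#[derive(Debug, Clone, PartialEq)]
pub enum Intent {
    Connect,
    Connected,
    /// Server-sent state that overwrites whatever the client believed.
    Snapshot { our_turn: bool },
    Bet(u32),
    Disconnected,
}

/// Work the shell must perform; the core itself never does I/O.
#[derive(Debug, Clone, PartialEq)]
pub enum Effect {
    OpenSocket,
    SendBet(u32),
}

/// Read-only projection that a UI would render.
#[derive(Debug, Clone, PartialEq)]
pub struct ClientView {
    pub status: &'static str,
    pub can_act: bool,
}

#[derive(Debug, Clone, Copy, PartialEq)]
enum Phase {
    Offline,
    Connecting,
    Online,
}

pub struct ClientCore {
    phase: Phase,
    our_turn: bool,
}

impl ClientCore {
    pub fn new() -> Self {
        Self { phase: Phase::Offline, our_turn: false }
    }

    /// Pure, synchronous transition: returns the effects to run, mutates only `self`.
    pub fn handle(&mut self, intent: Intent) -> Vec<Effect> {
        match (self.phase, intent) {
            (Phase::Offline, Intent::Connect) => {
                self.phase = Phase::Connecting;
                vec![Effect::OpenSocket]
            }
            (Phase::Connecting, Intent::Connected) => {
                self.phase = Phase::Online;
                Vec::new()
            }
            (Phase::Online, Intent::Snapshot { our_turn }) => {
                self.our_turn = our_turn;
                Vec::new()
            }
            (Phase::Online, Intent::Bet(amount)) if self.our_turn => {
                self.our_turn = false;
                vec![Effect::SendBet(amount)]
            }
            (_, Intent::Disconnected) => {
                self.phase = Phase::Offline;
                self.our_turn = false;
                Vec::new()
            }
            // Anything else is invalid in the current phase and is ignored.
            _ => Vec::new(),
        }
    }

    pub fn view(&self) -> ClientView {
        let status = match self.phase {
            Phase::Offline => "offline",
            Phase::Connecting => "connecting",
            Phase::Online => "online",
        };
        ClientView { status, can_act: self.phase == Phase::Online && self.our_turn }
    }
}
```

Because handle is a plain function over in-memory state, each transition can be asserted in a unit test without a socket, a runtime, or a rendered window, which is the property the headless harness relies on.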

What worked, what didn't

Patterns I'd carry into future agent-assisted work:

  • Where agents are a clear force-multiplier: converting ideas into simple proof-of-concept code, working together with a human in the loop to plan larger code structures, building out boilerplate, and implementing relatively small, easily testable portions of code.
  • Where they struggled: building a fully correct betting system that handles less common rules such as side pots, short all-ins, and dead money. Frontend work also went poorly wherever unit tests could not realistically catch the bugs.
  • Architecture is still my job. The first client (Flutter) was scrapped after the UI and state logic became intertwined, leading to stalled development when several breaking bugs had to be caught by human testing. The rewrite ("Rust state machine core + transport + headless harness, then a thin Tauri shell on top") came from manually solving the testing problem by minimizing the untestable frontend code. Once that decomposition existed, the agent re-implementation was fast and largely uneventful.
  • Introduce strict rules for specs carefully. Adding openspec as a dependency later on produced mixed results: pinning hard requirements ensured the code matched a specification exactly, but it led to increased token usage and cost the agent its ability to adjust plans as blockers came up.
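
For a sense of why the betting rules were hard to get right, here is a sketch of one piece of the problem: layering side pots from each seat's total contribution. It is a textbook approach, not the engine's actual implementation; the Pot struct and side_pots function are hypothetical names, and short all-in reopening rules are out of scope here.

```rust
/// One pot and the seat indices still able to win it. Hypothetical type.
#[derive(Debug, Clone, PartialEq)]
pub struct Pot {
    pub amount: u64,
    pub eligible: Vec<usize>,
}

/// Split total contributions into a main pot and side pots.
/// `contributions[i]` is seat i's total chips in for the hand; `folded[i]` marks dead hands.
pub fn side_pots(contributions: &[u64], folded: &[bool]) -> Vec<Pot> {
    assert_eq!(contributions.len(), folded.len());

    // Each distinct contribution of a live seat closes one pot layer.
    let mut levels: Vec<u64> = contributions
        .iter()
        .zip(folded)
        .filter(|(c, f)| !**f && **c > 0)
        .map(|(c, _)| *c)
        .collect();
    levels.sort_unstable();
    levels.dedup();

    let mut pots = Vec::new();
    let mut prev = 0u64;
    for &level in &levels {
        // Every seat, folded or not, pays into the slice between prev and level.
        let amount: u64 = contributions
            .iter()
            .map(|&c| c.min(level) - c.min(prev))
            .sum();
        // Only live seats that covered the full layer can win it.
        let eligible: Vec<usize> = (0..contributions.len())
            .filter(|&i| !folded[i] && contributions[i] >= level)
            .collect();
        pots.push(Pot { amount, eligible });
        prev = level;
    }

    // Dead money above the highest live contribution goes to the last pot.
    let leftover: u64 = contributions.iter().map(|&c| c - c.min(prev)).sum();
    if leftover > 0 {
        if let Some(last) = pots.last_mut() {
            last.amount += leftover;
        }
    }
    pots
}
```

With contributions of 50, 100, 100 and a folded 30, this yields a main pot of 180 contested by the first three seats and a side pot of 100 contested by the second and third. A top layer with a single eligible seat is effectively an uncalled bet returned to that seat. The sketch does not handle the case where every seat has folded.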

The broad takeaway: current agents are excellent at basic development, but they need oversight on the harder parts and steering on broad architectural decisions. The leverage is real, but only if you remain the one making structural decisions and manually verify that the most technical portions are designed correctly.

Stack

Rust · tokio · SQLite / sqlx · Argon2id · MessagePack · Tauri 2 · SolidJS · MCCFR · AI coding agents

Source

https://github.com/JoeFlet/vibe-poker