AI Agent Tools — Architecture Guardrails
The AI agent panel's tools live under backend/src/ai/tools/. They are the
LLM's entry points for read and write operations. The architectural rule
is simple:
Tools must not mutate the database directly. Every write goes through a canonical service in
backend/src/domain/agentOps/ (Layer-3, in progress) so business invariants are enforced once, regardless of whether the call comes from an HTTP route or the AI panel.
Two automated guardrails enforce this:
Layer 1 — Arch test
backend/tests/architecture/no-prisma-writes-in-ai-tools.test.ts
Scans every .ts file under src/ai/tools/ for:
- Prisma write methods (create, createMany, update, updateMany, upsert, delete, deleteMany).
- Prisma escape-hatches ($transaction, $queryRaw, $queryRawUnsafe, $executeRaw, $executeRawUnsafe).
Any hit in a file that is not in AMNESTY fails CI. Amnesty entries
represent files known to bypass the rule today, scheduled for Layer-3
extraction — the list should shrink to zero, not grow. A third assertion
emits an advisory log line when an amnesty file is now clean and ready
to be removed from the list.
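The scan-plus-amnesty logic described above can be sketched as follows. This is a minimal illustration, not the contents of the real test file: the names scanSource and checkTree, the regex, and the in-memory file list are assumptions made for the example, and the real test may match Prisma calls more precisely.

```typescript
// Illustrative sketch only; the real arch test may be structured differently.
const WRITE_METHODS = ["create", "createMany", "update", "updateMany", "upsert", "delete", "deleteMany"];
const ESCAPE_HATCHES = ["$transaction", "$queryRaw", "$queryRawUnsafe", "$executeRaw", "$executeRawUnsafe"];

// Escape regex metacharacters so the leading "$" is matched literally.
function escapeRe(s: string): string {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Matches method calls like .update( or .$transaction( and tagged-template raw queries.
// Assumption: any receiver counts, so someMap.delete(...) would also be flagged.
function forbiddenPattern(): RegExp {
  const names = WRITE_METHODS.concat(ESCAPE_HATCHES).map(escapeRe).join("|");
  return new RegExp("\\.(" + names + ")\\s*[(`]", "g");
}

interface Violation { file: string; method: string; }

// Return every forbidden call found in one file's source text.
function scanSource(file: string, source: string): Violation[] {
  const re = forbiddenPattern();
  const hits: Violation[] = [];
  let m: RegExpExecArray | null;
  while ((m = re.exec(source)) !== null) {
    hits.push({ file, method: m[1] });
  }
  return hits;
}

interface ScanResult { failures: Violation[]; readyToRemove: string[]; }

// Hits outside the amnesty list are failures; amnesty files with no hits
// are reported as ready to be removed from the list (advisory only).
function checkTree(files: { path: string; source: string }[], amnesty: string[]): ScanResult {
  const failures: Violation[] = [];
  const readyToRemove: string[] = [];
  for (const f of files) {
    const hits = scanSource(f.path, f.source);
    const exempt = amnesty.indexOf(f.path) !== -1;
    if (!exempt) failures.push(...hits);
    else if (hits.length === 0) readyToRemove.push(f.path);
  }
  return { failures, readyToRemove };
}
```

In a real test, a non-empty failures array would fail the run, while readyToRemove would only be logged.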
Run:
cd backend
npx vitest run tests/architecture/no-prisma-writes-in-ai-tools.test.ts
Layer 2 — Inventory script
backend/scripts/inventory-ai-tool-writes.ts
Statically walks the same tree and emits a markdown report at
docs/ai-agent-fab/AI_TOOL_DB_INVENTORY.md listing every write, read,
escape-hatch, and prisma-client import. The punch list at the top is
the Layer-3 migration backlog.
Re-run after any change under src/ai/tools/:
cd backend
npx tsx scripts/inventory-ai-tool-writes.ts
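As a rough illustration of what such an inventory pass does, the sketch below classifies each matched call as WRITE, READ, or ESCAPE and renders markdown table rows. The function names, the column layout, and the line-by-line regex are assumptions for this example; the actual script and the format of AI_TOOL_DB_INVENTORY.md may differ, and prisma-client import detection is omitted.

```typescript
// Illustrative sketch only; not the real inventory script.
type Kind = "WRITE" | "READ" | "ESCAPE";

const KINDS = new Map<string, Kind>();
for (const m of ["create", "createMany", "update", "updateMany", "upsert", "delete", "deleteMany"]) KINDS.set(m, "WRITE");
for (const m of ["findUnique", "findFirst", "findMany", "count", "aggregate", "groupBy"]) KINDS.set(m, "READ");
for (const m of ["$transaction", "$queryRaw", "$queryRawUnsafe", "$executeRaw", "$executeRawUnsafe"]) KINDS.set(m, "ESCAPE");

interface Row { file: string; line: number; method: string; kind: Kind; }

// Walk the source line by line and record every call whose method name is known.
function inventory(file: string, source: string): Row[] {
  const rows: Row[] = [];
  source.split("\n").forEach((text, i) => {
    const re = /\.(\$?[A-Za-z]+)\s*[(`]/g;
    let m: RegExpExecArray | null;
    while ((m = re.exec(text)) !== null) {
      const kind = KINDS.get(m[1]);
      if (kind !== undefined) rows.push({ file, line: i + 1, method: m[1], kind });
    }
  });
  return rows;
}

// Render rows as a markdown table (hypothetical column layout).
function toMarkdown(rows: Row[]): string {
  const header = "| File | Line | Method | Kind |\n| --- | --- | --- | --- |";
  const body = rows.map((r) => `| ${r.file} | ${r.line} | ${r.method} | ${r.kind} |`);
  return [header, ...body].join("\n");
}
```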
What about reads?
Reads (findUnique, findMany, aggregate, etc.) are NOT blocked by
Layer 1. They will be the focus of a later pass — the priority order is
writes (state-changing, invariant-laden) before reads (mostly downline
scope filters that already live in helper functions). Track read access
through the inventory's READ rows.
Layer 4 — Parity tests
backend/scripts/parity-layer3.mjs
For every extracted domain operation, the parity script runs the SAME logical scenario through two paths:
- Path A: direct domain call (the shape HTTP routes will use post-migration).
- Path B: through the AI tool executor (executeGivePoints, executeCreatePlayer, executeEditLimit).
Both must produce byte-identical DB state. Any divergence means the AI tool's translation layer (its argument unpacking, defaults, side effects) has drifted from the canonical service. Layer 1 prevents the tool from inventing its own write path; Layer 4 proves the tool's argument plumbing is correct.
Each test creates a throw-away fixture (a timestamp-suffixed user) so repeated runs don't clash and the script stays idempotent. It runs the scenario through Path A, captures the diff, restores the fixture, runs Path B, captures that diff, asserts that the two diffs are equal, and then cleans up.
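The snapshot, run, diff, restore cycle can be sketched against a toy in-memory store. Everything here is a simplified assumption for illustration: the Db type stands in for real database state, and the bodies of creditPlayer and executeGivePoints are hypothetical stand-ins that only borrow the names used above; the real parity script works against the actual database.

```typescript
// Toy stand-in for DB state: userId -> balance_points. Illustrative only.
type Db = Record<string, number>;

// Hypothetical canonical domain operation (credit branch only).
function creditPlayer(db: Db, userId: string, amount: number): void {
  if (!(amount > 0)) throw new Error("amount must be positive");
  db[userId] = (db[userId] ?? 0) + amount;
}

// Hypothetical AI tool executor: unpacks loosely typed arguments, then delegates.
function executeGivePoints(db: Db, args: { player: string; points: string | number }): void {
  creditPlayer(db, args.player, Number(args.points));
}

// Describe what changed between two states as a stable, comparable string.
function diff(before: Db, after: Db): string {
  const keys = Object.keys({ ...before, ...after }).sort();
  return keys
    .filter((k) => before[k] !== after[k])
    .map((k) => `${k}:${before[k]}->${after[k]}`)
    .join(",");
}

// Reset the store to the snapshot taken before a path ran.
function restore(db: Db, snapshot: Db): void {
  for (const k of Object.keys(db)) delete db[k];
  Object.assign(db, snapshot);
}

// Run Path A, capture its diff, restore, run Path B, capture its diff, restore, compare.
function checkParity(db: Db, pathA: (db: Db) => void, pathB: (db: Db) => void) {
  const snapshot = { ...db };
  pathA(db);
  const diffA = diff(snapshot, db);
  restore(db, snapshot);
  pathB(db);
  const diffB = diff(snapshot, db);
  restore(db, snapshot);
  return { diffA, diffB, equal: diffA === diffB };
}

// Timestamp-suffixed fixture name so repeated runs don't collide.
function fixtureUser(): string {
  return "parity_" + Date.now();
}
```

A credit case would call checkParity with `(d) => creditPlayer(d, user, 50)` as Path A and `(d) => executeGivePoints(d, { player: user, points: "50" })` as Path B, then assert that equal is true.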
Run inside the BE container:
docker exec -w /app strykr-dev1-backend node scripts/parity-layer3.mjs
Current coverage (6 cases, all passing on dev1):
- credit parity: creditPlayer ↔ executeGivePoints (positive amount)
- debit parity: debitPlayer ↔ executeGivePoints (negative amount)
- createPlayer parity: domain shape + allocation shape
- editPlayerLimit parity: creditLimit branch (composes creditPlayer)
- editPlayerLimit parity: minStake policy branch
The first version of these tests caught a real policy divergence:
checkUsernameForAi (AI side) rejects dashes; bare createPlayer
accepts them. That's by-design AI-tool conservatism, not a domain bug —
but the test forced us to notice and document it. That's exactly the
value Layer 4 provides.
True route↔tool parity (TODO)
The current parity tests compare "Path A direct domain call" vs "Path B
AI tool executor" — both converge on the domain service. They catch AI
tool drift. They do NOT yet exercise the HTTP route side, because the
routes still have their own inline Prisma writes. Migrating the routes
(POST /agent/allocate-to-sub-agent, PATCH /admin/players/:id/credit-limit,
POST /agent/invite-player, etc.) to call the same domain services is
the next step — at which point we can swap Path A to call the route
handler directly and the test becomes a true route↔tool comparison.
What's next
- Layer 5 (schema invariants): CHECK constraints (balance_points >= 0, credit_limit >= 0, etc.), partial unique indexes for idempotency.
- Layer 6 (reconciliation job): per-period verification that SUM(Transaction.amount) per user equals the current balance_points, double-entry balances to zero, no over-limit balances.
These are domain-agnostic — they apply identically to credit, settlement, betting, hedging, limits, and anything else added later. We never enumerate invariants by hand; we make it structurally impossible for a tool to bypass the canonical service.
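The Layer 6 balance check (the sum of transaction amounts per user must equal that user's balance_points) can be sketched in memory as follows. The record shapes, the userId field, and the reconcile function are assumptions for illustration, and the sketch assumes integer point amounts; a real job would presumably run as a database query over a period.

```typescript
// Illustrative sketch only; record shapes are assumptions, not the real schema.
interface TransactionRow { userId: string; amount: number; }
interface UserRow { id: string; balance_points: number; }
interface Mismatch { userId: string; expected: number; actual: number; }

// Sum transaction amounts per user and report users whose stored balance differs.
function reconcile(users: UserRow[], txs: TransactionRow[]): Mismatch[] {
  const sums = new Map<string, number>();
  for (const t of txs) {
    sums.set(t.userId, (sums.get(t.userId) ?? 0) + t.amount);
  }
  const mismatches: Mismatch[] = [];
  for (const u of users) {
    const expected = sums.get(u.id) ?? 0; // no transactions means an expected balance of 0
    if (expected !== u.balance_points) {
      mismatches.push({ userId: u.id, expected, actual: u.balance_points });
    }
  }
  return mismatches;
}
```

An empty result means every stored balance matches its ledger; any entry identifies a user whose balance was changed outside the transaction log.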