Agent AI Assistant — Design Doc

Architecture + handoff notes for the agent FAB (floating action button) shipped on devagentai. A future LLM (or human) should be able to read this and pick up the work without reinventing the wheel. It describes the design in prose and does not reproduce the source code.


1. Overview

The agent FAB is a chat-style assistant embedded in the agent panel that lets agents express what they want in natural language and have it executed against the platform. It's built on the existing AI orchestrator in backend/src/ai/ (which already powered a player-side assistant) and adds an agent-specific layer:

  • A set of agent-only tools that read or mutate the agent's downline data
  • A reports engine that turns those queries into downloadable artifacts
  • A confirmation flow for destructive actions
  • A frontend FAB mounted on /agent/* routes
  • UI cards that render the structured tool output instead of raw text
  • A proactive-notification layer that pushes events back to agents

The orchestrator (Claude Sonnet 4) sees the system prompt plus a tool catalogue filtered by the caller's role, then loops "ask tool → observe → decide" for up to ten iterations per user turn. Tools return structured data; the frontend renders typed cards from that data; the LLM never owns visual layout.
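
A minimal sketch of that bounded loop, to make the control flow concrete. The type names, the askModel / runTool callbacks and the fallback text are assumptions for illustration; the real loop in agentService.ts is asynchronous and calls Claude and the tool registry.

```typescript
// Hypothetical shapes; the real types live in backend/src/ai/types.ts.
type ToolCall = { name: string; input: unknown };
type ModelTurn =
  | { kind: "tool_use"; calls: ToolCall[] }
  | { kind: "final"; text: string };
type ToolResult = { tool: string; data: unknown };

const MAX_ITERATIONS = 10; // per-turn cap stated above

function runTurn(
  askModel: (observations: ToolResult[]) => ModelTurn, // stand-in for the Claude call
  runTool: (call: ToolCall) => ToolResult, // stand-in for registry dispatch
): { text: string; toolsUsed: string[]; iterations: number } {
  const observations: ToolResult[] = [];
  const toolsUsed: string[] = [];
  for (let i = 1; i <= MAX_ITERATIONS; i++) {
    const turn = askModel(observations); // decide
    if (turn.kind === "final") {
      return { text: turn.text, toolsUsed, iterations: i };
    }
    for (const call of turn.calls) {
      observations.push(runTool(call)); // ask tool, then observe
      toolsUsed.push(call.name);
    }
  }
  // Cap reached without a final answer: stop rather than loop forever.
  return { text: "Iteration limit reached.", toolsUsed, iterations: MAX_ITERATIONS };
}
```

The cap is what keeps a confused model from looping on tool calls indefinitely; what the real service returns when the cap is hit is not covered here.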


2. Repo layout (what lives where)

Backend — backend/src/

ai/
├── core/
│   ├── agentService.ts          Main orchestrator: build context, run Claude loop, dispatch tools.
│   ├── conversationStore.ts     In-Redis conversation history.
│   └── toolRegistry.ts          Singleton registry; role-gated dispatch.
├── tools/
│   ├── index.ts                 Register all tools + export `allTools`.
│   ├── matchTools.ts            Match/odds queries (player-side).
│   ├── betTools.ts              placeBet + executeBet (used by both player + agent).
│   ├── accountTools.ts          getBalance / getUser (player-side).
│   ├── adminTools.ts            Admin-only balance adjustments.
│   ├── navigationTools.ts       Route-change tool.
│   └── agentTools.ts            ★ All agent-FAB tools live here.
├── reports/
│   ├── templates.ts             4 canonical templates + xlsx/csv/pdf renderers.
│   ├── customReportSpec.ts      Spec schema + per-source allowlist + validator.
│   ├── customReportBuilder.ts   Validated spec → Prisma → in-memory groupBy → buffer.
│   └── artifactStore.ts         Disk + Redis artifact lifecycle.
├── notifications/
│   └── agentNotifier.ts         notifyLargeBet / notifyPlayerLogin / notifyMarketSettled.
├── ui/
│   └── components.ts            UI component-type registry.
└── types.ts                     Tool / UIComponent / data types.

routes/
├── ai.ts                        /api/ai/command, /command/confirm, /command/cancel,
│                                and (added in this work) /api/ai/reports/:artifactId.
├── orders.ts                    Post-success hook calls notifyLargeBet.
└── auth.ts                      /credential-login + /dev-login call notifyPlayerLogin.

services/
└── settlement.ts                processSettlement now fires notifyMarketSettled
                                 fire-and-forget after each per-market settle.

Frontend — strykr-fe/src/

app/agent/
└── layout.tsx                   Mounts AIPanel + AIMobileSheet in mode='command'.
                                 Also hosts the agent sidebar.
components/
├── ai-chat/
│   ├── AIFloatingButton.tsx     Shared FAB button (player + agent).
│   ├── AIChatPanel.tsx          Chat tab content (player-only).
│   ├── CommandChat.tsx          Command tab content; passes page-context metadata.
│   └── ui/
│       ├── index.tsx            Switch over UIComponentType to renderer.
│       ├── AgentCards.tsx       11 agent-specific card renderers.
│       └── (existing player-side cards)
├── layout/
│   ├── AIPanel.tsx              Desktop panel; mode='tabs'|'command' prop.
│   └── AIMobileSheet.tsx        Mobile bottom-sheet; same mode prop.
lib/
├── aiCommandApi.ts              REST client + UI/data type definitions.
└── features.ts                  AGENT_AI_FAB_ENABLED flag.
store/
├── aiPanelStore.ts              Panel open state + active tab.
└── aiCommandStore.ts            Conversation, pending-action state.

Infrastructure

docker-compose.dev1.yml          Named volume ai-reports:/tmp/ai-reports.
backend/Dockerfile               Pre-creates /tmp/ai-reports with hannibal owner
                                 so the volume mount inherits correct perms.

Tests

backend/tests/ai/tools/
└── agentTools.test.ts           50 unit tests covering role gating, scoping,
                                 destructive-tool guards, sharing.
backend/tests/ai/reports/
└── customReportSpec.test.ts     19 tests covering the spec security gate.

Docs

docs/ai-agent-fab/
├── PHASE0_SPEC.md               Original phase-by-phase design contract.
├── FEATURES.md                  One-pager (user-facing).
└── DESIGN.md                    This file.

3. Request lifecycle

A single "user asked X" round trip:

  1. User types in the FAB on /agent/*. CommandChat reads usePathname() + useSearchParams() and POSTs to /api/ai/command with { message, conversationId?, channel:'web', metadata: { currentPage, pageQuery } }.
  2. Route handler (routes/ai.ts) validates the body, extracts userId from the bearer token, and calls aiAgentService.processMessage(...).
  3. agentService.processMessage:
    • Loads conversation history from conversationStore (Redis).
    • Builds UserContext from the DB (role, agentId, balance).
    • Builds a system prompt (generateSystemPrompt) parameterised on role + metadata. Agent + admin prompts include the agent-tool routing rules and the template-first / custom-fallback report rule.
    • Filters toolRegistry to tools allowed for the user's role (getToolsForRole). For agents that's: matchTools, betTools, accountTools, navigationTools, and agentTools.
    • Calls Claude with the message + tools.
    • On tool_use blocks: dispatches via toolRegistry.executeTool, which re-verifies the role + auth, runs the tool, and returns a ToolResult.
    • Appends tool results to the message history; loops up to 10 iterations.
    • Aggregates UI components from every tool call into a list; sets requiresAction if the last UI is a confirm card (bet_slip, confirmation_dialog, diff_card, settlement_summary).
  4. Response is { text, ui[], conversationId, toolsUsed[], requiresAction, actionData }.
  5. Frontend appends the assistant message, renders each UI component via the switch in ai-chat/ui/index.tsx, and (if a confirm card has actions) wires the Confirm / Cancel buttons.
  6. Confirm sends { conversationId, actionType, actionData } to /api/ai/command/confirm. The actionType is read from action.payload.actionType when the card provides it (Phase 3 cards) or from the global pendingAction state (legacy bet_slip flow).

For destructive actions the executor runs server-side, applies the mutation, writes an audit log line, and returns a follow-up AgentResponse for the conversation.
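
For orientation, a sketch of the response shape from step 4 and the requiresAction rule from step 3. The field types are assumptions (only the field names and the four confirm-card types come from the text above), and the helper name is hypothetical.

```typescript
type UIComponent = { type: string; data: unknown };

// Field names follow step 4; the concrete types are guesses for illustration.
type AgentResponse = {
  text: string;
  ui: UIComponent[];
  conversationId: string;
  toolsUsed: string[];
  requiresAction: boolean;
  actionData?: unknown;
};

// Confirm-card types listed in step 3.
const CONFIRM_CARD_TYPES = new Set([
  "bet_slip",
  "confirmation_dialog",
  "diff_card",
  "settlement_summary",
]);

// True only when the LAST aggregated component is a confirm card.
function requiresAction(ui: UIComponent[]): boolean {
  const last = ui[ui.length - 1];
  return last !== undefined && CONFIRM_CARD_TYPES.has(last.type);
}
```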


4. Tools — what each one does and why

Read-only (no confirmation; isDestructive=false)

| Tool | What it does | Notes |
| --- | --- | --- |
| agent_searchPlayers | Fuzzy name-match for players inside the caller's downline. Returns a player_picker UI. | The system prompt tells the LLM to call this first whenever the user names a player ambiguously. The LLM is forbidden from inventing IDs. |
| agent_getDownline | Walks the agent tree breadth-first, including child agents and their player users. Returns a downline_tree UI. | Depth capped at 6. Uses BFS-expand on parent_agent_id. |
| agent_getTake | Reads TakeBalance.currentTake per downline player. Joins with User.creditLimit. Returns a take_table. | Sign convention: positive = player owes agent; negative = agent owes player. |
| agent_getLiveExposure | Aggregates V3ExposureLedger rows for the downline. Optional fixture or market filter. Returns an exposure_card. | Paisa → FP conversion happens in the tool. |
| agent_getBetHistory | Order.findMany over downline userIds with optional player / fixture / status / date filters. Returns a bet_table. | Capped at 50 rows; default sort = most-recent first. |
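
The downline scoping that every read tool relies on can be sketched as a breadth-first expansion over parent links. This is an in-memory illustration, not the shipped query: the AgentRow shape and function name are assumptions, and it reads "depth capped at 6" as six levels below the caller.

```typescript
type AgentRow = { id: string; parentAgentId: string | null };

const MAX_DEPTH = 6; // depth cap quoted in the table above

// Returns the caller's agent ID plus every descendant agent ID, level by level.
function expandDownline(rootId: string, agents: AgentRow[], maxDepth: number = MAX_DEPTH): string[] {
  const childrenOf = new Map<string, string[]>();
  for (const a of agents) {
    if (a.parentAgentId === null) continue;
    const siblings = childrenOf.get(a.parentAgentId) ?? [];
    siblings.push(a.id);
    childrenOf.set(a.parentAgentId, siblings);
  }
  const seen = new Set<string>([rootId]);
  let frontier = [rootId];
  for (let depth = 0; depth < maxDepth && frontier.length > 0; depth++) {
    const next: string[] = [];
    for (const id of frontier) {
      for (const child of childrenOf.get(id) ?? []) {
        if (!seen.has(child)) {
          seen.add(child);
          next.push(child);
        }
      }
    }
    frontier = next;
  }
  return Array.from(seen);
}
```

Because expansion only ever follows children of the caller, sibling and parent trees never enter the result set.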

Destructive (confirm via /command/confirm; isDestructive=true)

All destructive tools follow the same shape: on first call, build a confirmation card with the action payload embedded in the confirm button. On confirm, the route looks up the actionType and routes to the matching executor. The executor re-verifies everything from scratch (defense in depth — never trusts the actionData).

| Tool | Card | Executor | Notes |
| --- | --- | --- | --- |
| agent_editLimit | diff_card showing field, before, after | executeEditLimit | Allowed fields: creditLimit (FP) / perClickWin / aggregateDailyWin / minStake (paisa). Stale-read guard: rejects if the DB value drifted from the snapshot. |
| agent_settlePlayer | settlement_summary (direction + components) | executeSettlePlayer | Delegates to accounting/transferSettleService.createSettlement so financial-integrity invariants (SELECT FOR UPDATE, sign branching, CL/BP floors) stay centralised. Sign-flip guard: rejects if the take direction changed. |
| agent_createPlayer | diff_card showing "(does not exist) → p1 (balance N)" | executeCreatePlayer | Wraps prisma.$transaction with balance check + user.create + agent debit + PointAllocation insert — same pattern as POST /agent/invite-player. Agent-context drift guard. |
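
The stale-read guard is the part most worth seeing in code. A minimal sketch, assuming a hypothetical action payload and injected read/write callbacks in place of Prisma; the shipped executeEditLimit also re-checks the caller and downline membership, which is omitted here.

```typescript
type LimitField = "creditLimit" | "perClickWin" | "aggregateDailyWin" | "minStake";
type Limits = Record<LimitField, number>;
// Hypothetical payload: the snapshot taken when the diff card was built.
type EditLimitAction = { playerId: string; field: LimitField; before: number; after: number };

function executeEditLimitSketch(
  action: EditLimitAction,
  readLimits: (playerId: string) => Limits | undefined, // stand-in for a fresh DB read
  writeLimit: (playerId: string, field: LimitField, value: number) => void,
): { ok: true } | { ok: false; reason: string } {
  const current = readLimits(action.playerId);
  if (current === undefined) return { ok: false, reason: "player not found" };
  // Stale-read guard: the value shown on the card must still be the live value.
  if (current[action.field] !== action.before) {
    return { ok: false, reason: "value changed since the diff card was built" };
  }
  writeLimit(action.playerId, action.field, action.after);
  return { ok: true };
}
```

In a real database the compare and the write should be one atomic conditional update, otherwise a concurrent edit can slip in between the two calls.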

Reports (isDestructive=false)

| Tool | What it does |
| --- | --- |
| agent_generateReport | Picks one of 4 canonical templates and a format. Each template has a curated column set; commission is included where relevant. |
| agent_customReport | LLM writes a structured spec; the validator enforces the per-source allowlist; the builder turns it into a downline-scoped Prisma query + in-memory groupBy + render. |
| agent_shareReport | Ownership-check + downline-check, then writes a recipient into the artifact's sharedWith list and emits a notification with the download URL. |

Hedge bet — no new tool

The existing placeBetTool is gated to agent + admin and orderService.placeOrder auto-resolves agent → self-player at line ~785. The system prompt has a clause telling the LLM: "use the standard placeBet flow for hedges; do not look for a separate hedge tool."


5. Reports engine

Templates (reports/templates.ts)

Four canonical shapes, each built from Prisma queries against the downline-scoped data:

  • per_player_pnl — Order aggregation per userId (count, sum stake, sum P&L, open count) + commission column from CommissionRecord. Formats: xlsx | csv | pdf.
  • bet_log — Raw Order rows with player / fixture / market filters. Formats: csv | pdf (xlsx omitted — bet logs are too large in xlsx).
  • settlement_log — TransferSettlement rows. Formats: xlsx | csv | pdf.
  • pnl_by_match_market — Aggregation of CommissionRecord rows per (fixture × market). Formats: xlsx | csv | pdf.

PDF rendering is a single helper (renderTableToPDF) shared by all templates; it takes a title + column defs + rows + optional totals row and paginates automatically.
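
The template/format matrix above is small enough to encode as data. A sketch of one way to do that; the constant and function names are hypothetical, and templates.ts may express this differently.

```typescript
type ReportTemplate = "per_player_pnl" | "bet_log" | "settlement_log" | "pnl_by_match_market";
type ReportFormat = "xlsx" | "csv" | "pdf";

// Formats per template, copied from the list above.
const TEMPLATE_FORMATS: Record<ReportTemplate, readonly ReportFormat[]> = {
  per_player_pnl: ["xlsx", "csv", "pdf"],
  bet_log: ["csv", "pdf"], // xlsx deliberately omitted
  settlement_log: ["xlsx", "csv", "pdf"],
  pnl_by_match_market: ["xlsx", "csv", "pdf"],
};

function isSupported(template: ReportTemplate, format: ReportFormat): boolean {
  return TEMPLATE_FORMATS[template].indexOf(format) !== -1;
}
```

Typing the map as Record<ReportTemplate, ...> means adding a fifth template without declaring its formats fails to compile.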

Custom (reports/customReportSpec.ts + customReportBuilder.ts)

This is the single most important security boundary in the system.

The LLM produces a structured spec from natural language. Before anything reaches Prisma, the spec is validated against an explicit allowlist:

  • Sources — only orders | commissions | settlements | take. Each source has a curated field map in SOURCES (see customReportSpec.ts). Cross-agent data (e.g. peer agents' creditLimit) is deliberately not exposed.
  • Per-field flags — filterable, groupable, selectable, aggregatable. selectionName, for example, is selectable but not groupable (high cardinality).
  • Op compatibility — contains only on strings; gt/gte/lt/lte only on numbers / dates; between requires a 2-element array; in / nin require an array.
  • Aggregations — sum/avg/min/max require an aggregatable numeric field. count is always allowed.
  • Mixing rules — when groupBy is set, every raw-field column must appear in groupBy or be wrapped in an aggregation. having requires groupBy and references column aliases (post-aggregation).
  • Caps — max 20 columns, max limit 50000, filename charset [a-zA-Z0-9_\-.]+ only (no path traversal).
  • .strict() zod variants — the column-shape union uses .strict() so zod can't silently drop a misplaced agg key and let it bypass the aggregation check.
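
A hand-rolled sketch of a few of these rules (op compatibility, aggregation eligibility, the groupBy mixing rule, the column cap). It is not the shipped validator, which is zod-based; the type names and the ORDER_FIELDS map are hypothetical, and the selectable flag, having and limit checks are left out for brevity.

```typescript
type FieldKind = "string" | "number" | "date";
type FieldDef = { kind: FieldKind; filterable: boolean; groupable: boolean; aggregatable: boolean };
type Filter = { field: string; op: string; value: unknown };
type Column =
  | { field: string; alias?: string }
  | { agg: "sum" | "avg" | "min" | "max" | "count"; field: string; alias: string };
type Spec = { filters: Filter[]; columns: Column[]; groupBy?: string[] };

// Hypothetical field map for one source; the real allowlist is SOURCES in customReportSpec.ts.
const ORDER_FIELDS: Record<string, FieldDef> = {
  status: { kind: "string", filterable: true, groupable: true, aggregatable: false },
  stake: { kind: "number", filterable: true, groupable: false, aggregatable: true },
  selectionName: { kind: "string", filterable: true, groupable: false, aggregatable: false },
};

function validate(spec: Spec, fields: Record<string, FieldDef>): string[] {
  const errors: string[] = [];
  if (spec.columns.length > 20) errors.push("too many columns");
  for (const f of spec.filters) {
    const def = fields[f.field];
    if (!def || !def.filterable) {
      errors.push(`cannot filter on ${f.field}`);
      continue;
    }
    if (f.op === "contains" && def.kind !== "string") errors.push("contains needs a string field");
    if (["gt", "gte", "lt", "lte"].indexOf(f.op) !== -1 && def.kind === "string") {
      errors.push(`${f.op} needs a number or date field`);
    }
    if (f.op === "between" && !(Array.isArray(f.value) && f.value.length === 2)) {
      errors.push("between needs a 2-element array");
    }
    if ((f.op === "in" || f.op === "nin") && !Array.isArray(f.value)) {
      errors.push(`${f.op} needs an array`);
    }
  }
  const groupBy = spec.groupBy ?? [];
  for (const g of groupBy) {
    if (!fields[g]?.groupable) errors.push(`${g} is not groupable`);
  }
  for (const c of spec.columns) {
    const def = fields[c.field];
    if (!def) {
      errors.push(`unknown field ${c.field}`);
      continue;
    }
    if ("agg" in c) {
      // count is always allowed; the other aggregations need an aggregatable numeric field.
      if (c.agg !== "count" && !(def.aggregatable && def.kind === "number")) {
        errors.push(`${c.agg} needs an aggregatable numeric field`);
      }
    } else if (groupBy.length > 0 && groupBy.indexOf(c.field) === -1) {
      errors.push(`${c.field} must be grouped or aggregated`);
    }
  }
  return errors;
}
```

An empty array means the spec passed; anything else is rejected before a query is built.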

The builder:

  1. Translates the validated filters to a Prisma where (typed values only — nothing the LLM produced is ever interpolated as a string).
  2. Adds a downline scope to the where (BFS-expanded agent + player IDs). The caller cannot exfiltrate sibling data even if the spec says otherwise.
  3. Fetches up to limit + 1 rows; sets truncated: true on overflow.
  4. Aggregates in-memory via Map<groupKey, rows>.
  5. Applies having + sort in memory.
  6. Renders via the same xlsx / csv / pdf paths as the templates.
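
Steps 3 and 4 can be sketched as two small pure functions. The Row type and function names are assumptions, and only a sum aggregation is shown; the shipped builder supports the full aggregation set.

```typescript
type Row = Record<string, string | number>;

// The caller fetches limit + 1 rows; the extra row only signals overflow.
function capRows(fetched: Row[], limit: number): { rows: Row[]; truncated: boolean } {
  return { rows: fetched.slice(0, limit), truncated: fetched.length > limit };
}

// Group rows by the groupBy fields and sum one numeric field per group.
function groupAndSum(rows: Row[], groupBy: string[], sumField: string, alias: string): Row[] {
  const groups = new Map<string, Row[]>();
  for (const row of rows) {
    const key = JSON.stringify(groupBy.map((g) => row[g]));
    const bucket = groups.get(key) ?? [];
    bucket.push(row);
    groups.set(key, bucket);
  }
  const out: Row[] = [];
  groups.forEach((bucket) => {
    const first = bucket[0];
    if (first === undefined) return;
    const result: Row = {};
    for (const g of groupBy) result[g] = first[g] ?? "";
    result[alias] = bucket.reduce((acc, r) => acc + Number(r[sumField]), 0);
    out.push(result);
  });
  return out;
}
```

Aggregating in memory keeps the database query shape simple and fully typed, at the cost of holding up to limit rows in the process.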

The exact spec is logged on success with userId, requestId, traceId — so we can audit "what did the assistant just query."

Artifact lifecycle (reports/artifactStore.ts)

  • Bytes — on disk at /tmp/ai-reports/{artifactId}.{ext}. The artifactId is regex-stripped to [a-z0-9-] before the file-system path is constructed (path-traversal guard).
  • Metadata — Redis at ai-report:meta:{artifactId} with 24h TTL. Shape: { artifactId, ownerUserId, template, format, filename, mimeType, sizeBytes, createdAt, expiresAt, sharedWith? }.
  • Share — shareArtifact(id, recipientId) adds to sharedWith[] and preserves the existing TTL (sharing does not extend expiry).
  • Download — GET /api/ai/reports/:artifactId requires auth, validates canDownload(meta, userId) (owner OR explicit share), streams the file. 404 if metadata missing or expired; 403 if unauthorised.
  • Volume mount — docker-compose.dev1.yml maps a named volume ai-reports to /tmp/ai-reports/ so artifacts survive container recreates. The backend Dockerfile pre-creates this directory with hannibal ownership before the USER switch, so the volume's first mount inherits the right perms (otherwise the container hits EACCES on first write — see pitfalls).
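
A sketch of the ID stripping and the download rule. canDownload and shareArtifact are named in the text, but the signatures here are assumptions (the real shareArtifact takes an artifact ID and works against Redis), as is expiresAt being a number.

```typescript
type ArtifactMeta = {
  artifactId: string;
  ownerUserId: string;
  expiresAt: number; // assumed epoch milliseconds
  sharedWith?: string[];
};

// Keep only [a-z0-9-] so the ID cannot carry "/" or ".." into the path.
function artifactPath(artifactId: string, ext: "xlsx" | "csv" | "pdf"): string {
  const safeId = artifactId.replace(/[^a-z0-9-]/g, "");
  return `/tmp/ai-reports/${safeId}.${ext}`;
}

// Owner OR explicit share.
function canDownload(meta: ArtifactMeta, userId: string): boolean {
  return meta.ownerUserId === userId || (meta.sharedWith ?? []).indexOf(userId) !== -1;
}

// Sharing adds a recipient and leaves expiresAt untouched.
function shareArtifactSketch(meta: ArtifactMeta, recipientId: string): ArtifactMeta {
  if (canDownload(meta, recipientId)) return meta;
  return { ...meta, sharedWith: (meta.sharedWith ?? []).concat(recipientId) };
}
```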

6. Confirmation flow

The platform already had a /api/ai/command/confirm route for player bets. We extended it for the agent destructive tools rather than introducing a new confirm endpoint. The nonce-based design in the Phase 0 spec is documented for future work but not shipped; the stale-read / drift guards in the executors cover the same class of mutation-replay risks.

How it works:

  1. Destructive tool returns ui: { type: 'diff_card', actions: [confirm, cancel] }. The confirm button's payload carries { actionType, actionData }.
  2. The aggregator marks requiresAction = true for confirm-card types, which include diff_card and settlement_summary (alongside bet_slip and confirmation_dialog, per §3).
  3. Frontend CommandChat.handleAction has two-stage routing:
    • If the action.payload carries actionType + actionData (Phase 3+ cards), call /command/confirm with those values directly.
    • Otherwise (legacy bet_slip flow), use the global pendingAction state.
  4. The route maps actionType → the matching executor:
    • bet → executeBet
    • balance_adjustment → executeGivePoints
    • limit_edit → executeEditLimit
    • settle_player → executeSettlePlayer
    • create_player → executeCreatePlayer
  5. Executors re-validate (caller's agent profile, downline membership, value/take hasn't drifted), apply the mutation, write an audit log line, and return the follow-up AgentResponse.

The actionType union is duplicated across:

  • backend zod schema (routes/ai.ts confirmActionSchema)
  • backend agentService dispatch (confirmAction function)
  • frontend aiCommandApi.ts ConfirmRequest
  • frontend aiCommandStore.ts pendingAction.actionType
  • frontend CommandChat.tsx payload narrowing

When adding a new destructive tool, all five places need the new union member (vitest will catch most of it via the agentTools.test.ts file).
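
A sketch of the actionType → executor routing, written so the compiler flags a union member with no executor. The five action types come from the list above; the stub executors just return the name of the function they stand in for, and deriving the union from a single const array is a suggestion, not how the five copies are maintained today.

```typescript
const ACTION_TYPES = ["bet", "balance_adjustment", "limit_edit", "settle_player", "create_player"] as const;
type ActionType = (typeof ACTION_TYPES)[number];
type Executor = (userId: string, actionData: unknown) => string;

// Record<ActionType, Executor> fails to compile if a union member has no entry.
const EXECUTORS: Record<ActionType, Executor> = {
  bet: () => "executeBet",
  balance_adjustment: () => "executeGivePoints",
  limit_edit: () => "executeEditLimit",
  settle_player: () => "executeSettlePlayer",
  create_player: () => "executeCreatePlayer",
};

function isActionType(value: string): value is ActionType {
  return (ACTION_TYPES as readonly string[]).indexOf(value) !== -1;
}

function confirmActionSketch(userId: string, actionType: string, actionData: unknown): string {
  if (!isActionType(actionType)) throw new Error(`unknown actionType: ${actionType}`);
  return EXECUTORS[actionType](userId, actionData);
}
```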


7. UI cards (frontend)

Each tool result that the LLM should not paraphrase comes back with a ui: { type, data, actions? } block. The frontend has a single switch in components/ai-chat/ui/index.tsx mapping type to a renderer. Agent cards live in AgentCards.tsx:

| UIComponentType | Renderer | Notes |
| --- | --- | --- |
| player_picker | PlayerPickerCard | Used by agent_searchPlayers when there are multiple matches. Each candidate fires a custom action with the playerId. |
| take_table | TakeTableCard | Sign-coloured take column with explanatory footnote. |
| exposure_card | ExposureCard | Top-line totals + horizontal bars per scope. |
| pnl_table | PnLTableCard | Reserved for future usage; not produced by current tools but renderer is ready. |
| commission_table | CommissionTableCard | Reserved for future. |
| settlement_table | SettlementTableCard | Reserved for future. |
| bet_table | BetTableCard | Compact bet list with summary totals. |
| downline_tree | DownlineTreeCard | Collapsible tree; auto-expands first two levels. |
| diff_card | DiffCardComponent | Before / after side-by-side with confirm + cancel buttons. |
| settlement_summary | SettlementSummaryCard | Amount + direction + components + confirm. |
| report_artifact | ReportArtifactCard | Filename + size + expiry + download button (href to /api/ai/reports/:id). |

Shared helpers in AgentCards.tsx: fmtPts, signColor, Stat, Card. All money values arrive in FP (points), never paisa — the backend tools convert.

Agent FAB mode

AIPanel and AIMobileSheet accept a mode: 'tabs' | 'command' prop.

  • tabs (default, player side) → renders Chat / Commands / Analysis tab strip and content per active tab.
  • command (agent side) → no tab strip; renders CommandChat directly. Header label changes to "Agent Assistant" for clarity.

The agent layout (app/agent/layout.tsx) passes mode="command" to both. The previous "snap to commands on open" useEffect is gone — the new mode handles it structurally.

Page-context injection

CommandChat reads usePathname() + useSearchParams() and passes them via metadata.currentPage + metadata.pageQuery on every sendCommand. The system prompt teaches the LLM to read these for context (e.g. when on /agent/downline?agentId=X, prefer X as the playerId unless overridden).


8. Notifications

ai/notifications/agentNotifier.ts wraps the existing NotificationService so all the websocket / push / DB plumbing stays centralised. Three triggers:

| Function | Fires at | What it does |
| --- | --- | --- |
| notifyLargeBet | routes/orders.ts post-success hook | If stake ≥ 500 FP, resolve player → upline agent, emit agent_alert to the agent. |
| notifyPlayerLogin | /auth/credential-login + /auth/dev-login success | Per-(agent, player) 24h Redis dedupe; emit Player active if not muted + not deduped. |
| notifyMarketSettled | services/settlement.ts processSettlement per-market call | Aggregate CommissionRecord rows for the market by upline agent; emit one notification per agent with Net P&L: ±X.XX FP. |

All three are fire-and-forget — failures log but never block the calling path (order placement, login response, settlement). Per-event mute keys live at ai-agent-notif:mute:{agentUserId}:{event} and control whether the next event fires for that channel.
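
The gating logic (threshold, dedupe window, mute check) can be sketched without Redis. In-memory maps stand in for the Redis keys, the clock is passed in, and the event name player_login is a hypothetical value for the {event} part of the mute key.

```typescript
const LARGE_BET_THRESHOLD_FP = 500; // threshold from the table above
const DEDUPE_TTL_MS = 24 * 60 * 60 * 1000; // 24h per-(agent, player) window

function isLargeBet(stakeFp: number): boolean {
  return stakeFp >= LARGE_BET_THRESHOLD_FP;
}

class LoginNotifierSketch {
  private lastSent = new Map<string, number>(); // stand-in for TTL'd Redis dedupe keys
  private muted = new Set<string>(); // stand-in for ai-agent-notif:mute:{agentUserId}:{event}

  mute(agentUserId: string, event: string): void {
    this.muted.add(`${agentUserId}:${event}`);
  }

  // Returns true when a "Player active" notification should be emitted.
  shouldNotifyLogin(agentUserId: string, playerId: string, nowMs: number): boolean {
    if (this.muted.has(`${agentUserId}:player_login`)) return false;
    const key = `${agentUserId}:${playerId}`;
    const last = this.lastSent.get(key);
    if (last !== undefined && nowMs - last < DEDUPE_TTL_MS) return false;
    this.lastSent.set(key, nowMs);
    return true;
  }
}
```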


9. Security model

Layered defenses; each must be assumed broken when reasoning about the next.

  1. Role gate at the tool registry — every tool declares allowedRoles and the registry filters per-request before sending the tool list to Claude. A player can never see, never mind execute, the agent tools.
  2. Auth gate — every tool sets requiresAuth: true; the registry rejects calls with no user context.
  3. Per-tool re-checks — every agent tool calls getCallerAgent, which reads the live caller's agentId from the DB. The tool then BFS-expands the downline (agent IDs only; the caller cannot see sibling trees).
  4. Stale-read / drift guards on every destructive executor — re-fetch the target's current state and compare with the snapshot taken at diff time. If the value moved, refuse.
  5. Atomicity — destructive mutations delegate to existing services that use prisma.$transaction with SELECT FOR UPDATE (settlements) or atomic conditional updates (limit edits, balance debits).
  6. Custom report allowlist — see §5. Nothing the LLM produces is ever interpolated into raw SQL.
  7. Path traversal — artifact IDs are regex-stripped before file-path construction. Filenames in custom reports are validated by a regex that excludes "/" and "..".
  8. Audit logging — every destructive action and every custom-report spec is logged with userId, requestId, traceId, and the action payload. Logs are persisted via the standard backend logger (Timescale + stdout).
  9. PII — the assistant has access to player display names, IDs, balances, take. We assume these end up in conversation history (Redis, TTL'd) and Anthropic API logs. Don't log raw card / bank numbers anywhere; the platform doesn't currently route those through the AI surface.
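
Layers 1 and 2 can be sketched as a small registry that filters on the way out and re-checks on the way in. The class name, Role values and Tool shape are assumptions; allowedRoles, requiresAuth, getToolsForRole and executeTool are the names used in this document.

```typescript
type Role = "player" | "agent" | "platform_admin"; // hypothetical role values
type UserContext = { userId: string; role: Role };
type Tool = {
  name: string;
  allowedRoles: Role[];
  requiresAuth: boolean;
  run: (user: UserContext, input: unknown) => unknown;
};

class ToolRegistrySketch {
  private tools = new Map<string, Tool>();

  register(tool: Tool): void {
    this.tools.set(tool.name, tool);
  }

  // Layer 1: only tools the role may use are ever shown to the model.
  getToolsForRole(role: Role): Tool[] {
    const out: Tool[] = [];
    this.tools.forEach((t) => {
      if (t.allowedRoles.indexOf(role) !== -1) out.push(t);
    });
    return out;
  }

  // Both gates run again at dispatch: the filtered list is never trusted.
  executeTool(name: string, user: UserContext | undefined, input: unknown): unknown {
    const tool = this.tools.get(name);
    if (!tool) throw new Error(`unknown tool: ${name}`);
    if (tool.requiresAuth && !user) throw new Error("authentication required");
    if (!user || tool.allowedRoles.indexOf(user.role) === -1) throw new Error("role not allowed");
    return tool.run(user, input);
  }
}
```

Re-checking at dispatch is what makes the "assume the previous layer is broken" rule hold even if the model names a tool it was never shown.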

10. Testing strategy

  • Unit tests (vitest) at backend/tests/ai/.
  • agentTools.test.ts mocks the Prisma client (via vi.mock) and the notification service. Covers role gating, downline scoping (BFS call shape), paisa→FP conversion, empty-data shapes, input validation, destructive-tool stale-read guards, executor re-validation, delegation to createSettlement, share-flow ownership / downline / notification emission.
  • customReportSpec.test.ts has zero mocks — it tests the validator directly with hand-written specs. Covers schema-level caps (limit, column count, path-traversal filename), allowlist rejections (unknown source, unknown field, op-kind mismatches, non-groupable groupBy, having without groupBy, having on non-existent alias, duplicate aliases), and the happy path (groupBy + agg + having + sort).
  • 69/69 green as of branch HEAD.
  • Integration tests were left untouched — the existing tests/ai/ integration files require env vars and live services to run and are pre-existing failures unrelated to this work.
  • Manual end-to-end on dev1 — every tool exercised via curl against /api/ai/command with a dev-login token, results verified by DB queries or log inspection. Findings recorded in commit messages.

11. Deployment

  • Branch — devagentai tracks origin/devagentai. All work in this effort lives there.
  • Host — 159.65.94.143 (ssh alias strykrdev), path /root/strykr-dev1. Compose file docker-compose.dev1.yml.
  • Domain — https://dev1.strykr.io.
  • Standard rebuild
    • ssh strykrdev 'cd /root/strykr-dev1 && git pull origin devagentai && docker compose -f docker-compose.dev1.yml up -d --build --force-recreate backend frontend'
    • Backend rebuild typical ~3 min, frontend ~3-4 min.
    • Containers strykr-dev1-backend and strykr-dev1-frontend.
  • Recreate-race — sometimes the backend container ends up with a hash-prefixed name (e.g. 6759...strykr-dev1-backend) in the Created state alongside the proper one. Cleanup: docker rm {hash}_strykr-dev1-backend && docker compose ... up -d --force-recreate backend.
  • Volume — strykr-dev1_ai-reports named volume holds artifact bytes. Survives up --build. To wipe artifacts manually: docker volume rm strykr-dev1_ai-reports.

12. Pitfalls (things that bit us, worth knowing)

| Issue | Where | Fix |
| --- | --- | --- |
| Volume mount inherited root ownership; backend (hannibal UID 1001) couldn't write | first artifact write after the volume was added | Dockerfile pre-creates /tmp/ai-reports with hannibal owner before USER switch; live workaround was docker exec --user root ... chown -R hannibal /tmp/ai-reports. |
| req.params.artifactId typed string \| string[] under strict tsconfig | TS build inside Docker (local tsc was lenient) | Narrow with Array.isArray() before use. |
| req.params types in general — Express + strict TS surfaces them as a union even for single-segment routes | similar pattern across new routes | Always narrow at the top of the handler. |
| pdfkit not installed in node_modules initially | first custom-report deploy after PDF was added | npm install pdfkit @types/pdfkit then rebuild Docker image. |
| zod default strip silently drops extra keys → agg: 'sum', field: 'status' matched the raw-field branch in the column-shape union, bypassing aggregation validation | customReportSpec.ts | Switched to .strict() per branch and put the aggregation branch first in the union. |
| FE container rebuild also recreates the backend container | docker compose ... up -d --force-recreate frontend | The backend volume is fine; in-flight HTTP requests can be killed mid-flight. Re-issue after rebuild. |
| Backend container restart wiped /tmp/ai-reports/ files before the volume mount was added | every deploy | Volume mount; documented in commit 7beb9bf9. |
| LLM was using the wrong tab (Chat instead of Commands) on the agent FAB | first FAB land | Two fixes shipped: (a) initial nudge to snap to Commands on open; (b) eventually a structural mode='command' prop that hides the tabs entirely. |
| notifications.url column doesn't exist on the model; NotificationService.create({url}) silently drops the field | every notification we emit | Workaround: keep the URL in metadata.url too. Fix would be a schema migration. |
| Routing notifyPlayerLogin to the direct upline only (not grand-uplines) | by design, but worth flagging | If a future product call wants grand-upline notifications, walk Agent.parentAgentId chain in resolveAgentUserForPlayer. |
| Schema-drift TS errors in services/orderService.ts and routes/orders.ts | unrelated to this work; pre-existing on the dev branch | Build passes (errors are non-fatal under the current tsconfig's noEmit setup, and the docker build only fails on errors in the touched files in this branch). Ignore until you actually need to fix Prisma model drift. |
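
The req.params narrowing from the table is a one-liner worth keeping as a helper. A sketch with a hypothetical name; taking the first element of an array is one reasonable choice here, and rejecting arrays outright would be another.

```typescript
// Collapse an Express-style param (string | string[] | undefined) to a single string.
function singleParam(value: string | string[] | undefined): string | undefined {
  if (Array.isArray(value)) return value[0];
  return value;
}
```

A handler would call it once at the top, for example on req.params.artifactId, and return a 400 or 404 when the result is undefined.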

13. Where to extend

A few things came up in scoping but were deliberately not built. They're ready to pick up:

  1. Bulk per-player JPEG reports — the user asked for it. Needs a chart-render lib choice. Three viable paths:
    • chartjs-node-canvas (depends on canvas native module)
    • sharp + manual SVG generation
    • HTML template + puppeteer screenshot (heaviest but most flexible)
    Recommend chartjs-node-canvas if charts are simple; puppeteer if the user wants per-player styled summary cards.
  2. Customise report columns — needs a column_picker UI component on the FE and a two-step generate flow (pick template → pick columns → generate). The custom-report tool already covers most of this case if the LLM constructs the spec; the explicit picker would help when the user wants to iterate.
  3. Nonce-based confirmation — the Phase 0 spec describes a strict nonce-bound confirm flow. We shipped the lighter stale-read / drift-guard approach. If product wants the formal nonce protocol (auditable token issuance + single-use), the work is contained in: add a Redis nonce on diff card creation, return it in actionData, require it back on confirm, consume it atomically.
  4. Hedge bet via bet_slip UI — the existing placeBet already works for agents. If you want a dedicated agent_placeHedge tool that always uses the self-player + has a hedge-specific confirm card, it's a thin wrapper.
  5. Settle player with explicit amount picker — currently the LLM has to either accept the full outstanding take or pass an explicit amount in the natural-language request. A two-step "show outstanding → user picks amount" flow would be nicer; needs a new UI card.
  6. Daily downline summary — an end-of-day notification with the agent's net P&L, top players, biggest exposures. Could be a scheduled cron that calls into the existing tools and emits a notification.
  7. platform_admin polish — admins currently get the agent capabilities. Adding "see across all downlines" makes most agent tools work platform-wide. The system prompt has a short ADMIN block but no admin-specific tools yet.
  8. Conversation history persistence — conversationStore is in Redis with the default TTL. For audit / replay, consider piping completed turns into a Postgres table.
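
For item 3, the issue / consume cycle of a nonce-bound confirm could look like the sketch below. None of this is shipped: the class, its method names and the TTL are hypothetical, an in-memory Map stands in for Redis, and the ID generator is injected because a real nonce must come from a cryptographically secure source.

```typescript
type PendingAction = {
  userId: string;
  actionType: string;
  actionData: unknown;
  expiresAtMs: number;
};

class NonceStoreSketch {
  private pending = new Map<string, PendingAction>(); // stand-in for Redis
  private makeId: () => string;
  private ttlMs: number;

  constructor(makeId: () => string, ttlMs: number) {
    this.makeId = makeId;
    this.ttlMs = ttlMs;
  }

  // Called when the diff card is built; the nonce travels back in actionData.
  issue(userId: string, actionType: string, actionData: unknown, nowMs: number): string {
    const nonce = this.makeId();
    this.pending.set(nonce, { userId, actionType, actionData, expiresAtMs: nowMs + this.ttlMs });
    return nonce;
  }

  // Single use: the entry is removed whether or not the checks pass.
  consume(nonce: string, userId: string, nowMs: number): PendingAction | undefined {
    const entry = this.pending.get(nonce);
    this.pending.delete(nonce);
    if (!entry || entry.userId !== userId || nowMs > entry.expiresAtMs) return undefined;
    return entry;
  }
}
```

In Redis the get and delete need to be a single atomic operation, for example GETDEL, so two concurrent confirms cannot both succeed. Executing from the stored payload rather than the client's copy would also remove the need to trust actionData at all.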

14. Quick reference for a future LLM picking this up

  • Branch you're on: devagentai (don't merge into dev or feat/dev1 without consulting the human).
  • All agent-FAB tools live in: backend/src/ai/tools/agentTools.ts.
  • Tool registry is: backend/src/ai/core/toolRegistry.ts (register via backend/src/ai/tools/index.ts).
  • System prompt is in: backend/src/ai/core/agentService.ts inside generateSystemPrompt.
  • The security boundary for open-ended queries is: backend/src/ai/reports/customReportSpec.ts. Don't add fields without thinking about cross-agent leakage.
  • Reports + artifacts: backend/src/ai/reports/.
  • Test pattern: backend/tests/ai/tools/agentTools.test.ts for tools (mock Prisma + notificationService). backend/tests/ai/reports/customReportSpec.test.ts for the spec (no mocks, pure validator tests).
  • FE renderers: strykr-fe/src/components/ai-chat/ui/AgentCards.tsx plus the switch in index.tsx. Add a new card by extending UIComponentType in both types.ts (backend) and aiCommandApi.ts (frontend) and adding a switch arm in index.tsx.
  • To add a destructive tool:
    1. Add the tool + executor in agentTools.ts.
    2. Add the actionType to the union in five places (backend route schema, agentService dispatch, FE api type, FE store type, FE CommandChat narrowing). vitest catches most of it.
    3. Implement the executor with re-validation + stale-read guards. Don't trust actionData; re-read from the DB.
    4. Add tests in agentTools.test.ts.
  • To deploy: ssh strykrdev 'cd /root/strykr-dev1 && git pull origin devagentai && docker compose -f docker-compose.dev1.yml up -d --build --force-recreate backend frontend'.
  • To debug a tool call live: ssh strykrdev 'docker logs strykr-dev1-backend 2>&1 | grep -E "Executing tool|Tool .* completed|agent_" | tail -30'.