Technical documentation

Mock GenAI Implementation

How the five AI features work — the wire contracts they speak, the boundary between mock and real inference, the graceful degradation when the service is down, and the single "AI suggests / you confirm" pattern they all share.

Service: tm-worksmart-ai-service · FastAPI
Caller: the api only, never the browser
Contract: Zod-mirrored in @worksmart/types

The mock / real boundary
The five features
Wire contract
Graceful degradation
AI suggests / you confirm

01 The mock / real boundary

One inference service, reached through one seam — the api is its only caller.

All GenAI lives in a separate repository, tm-worksmart-ai-service (FastAPI). The browser never calls it and never holds its token. Instead the api owns a single client layer (apps/api/src/lib/aiClient.ts) that holds the base URL, the SERVICE_TOKEN, and the timeout. Every AI feature is wired the same way: shared Zod types → aiClient → an api route → the web UI.

"Mock" here means the inference is deterministic, not that the wiring is fake. The features are fully end-to-end — they cross the network, validate responses, persist results, and degrade under failure exactly as a real-LLM version would. Swapping deterministic logic for a real model later is a change inside the service, behind an unchanged contract.
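The idea of deterministic inference behind an unchanged contract can be sketched as follows. This is a minimal illustration, not the service's actual logic: the `Categorizer` interface, the `KeywordCategorizer` class, the keyword table, and the fixed 0.9 confidence are all hypothetical names and values chosen for the example.

```typescript
// Hypothetical result shape; the real contract lives in the shared types package.
interface CategorizeResult {
  tag: string; // suggested tag, without the leading '#'
  confidence: number; // between 0 and 1
}

// The seam: callers depend on this interface, never on a concrete implementation.
interface Categorizer {
  categorize(text: string): CategorizeResult | null;
}

// Assumed keyword table, for illustration only.
const KEYWORD_TAGS: ReadonlyArray<[string, string]> = [
  ["login", "authentication"],
  ["oauth", "authentication"],
  ["invoice", "billing"],
];

// Deterministic "mock" inference: the same text always yields the same suggestion.
class KeywordCategorizer implements Categorizer {
  categorize(text: string): CategorizeResult | null {
    const lower = text.toLowerCase();
    for (const [keyword, tag] of KEYWORD_TAGS) {
      if (lower.indexOf(keyword) !== -1) {
        return { tag, confidence: 0.9 };
      }
    }
    return null; // no keyword matched, so no suggestion is made
  }
}

// A model-backed class could replace KeywordCategorizer here without touching callers.
const categorizer: Categorizer = new KeywordCategorizer();
const suggestion = categorizer.categorize("Fixed the login redirect bug");
```

Because callers hold only the `Categorizer` type, replacing the keyword lookup with a real model is a one-line change at the construction site.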

Security posture

Because the api is the only caller, the service token stays server-side and the inference layer can be deployed, scaled, and evolved on its own cadence — without ever exposing it to the client.

02 The five features

Four mocked deterministically, one real algorithm — all behind the same seam.

| Feature | Implementation | Wired via |
| --- | --- | --- |
| Smart Categorization | Mocked (deterministic): suggests a `#tag` for a check-in | `POST /categorize` |
| Document Analysis | Mocked (deterministic): extracts fields, persists them to `extracted_data`, suggests a status | `POST /documents/:id/analyze` |
| Natural-Language Search | Mocked (deterministic): parses a query into structured filters that the api executes | `POST /search` |
| Anomaly Detection | Real: MAD-based modified z-score on per-user hours | `GET /anomalies` |
| AI Project Status Narrative | Mocked (deterministic): the api proxies a project's summary signals to the service, which drafts the narrative | `POST /projects/:id/status-narrative` → `/narratives` |

Two are worth detail. Anomaly Detection is the one true algorithm: a modified z-score built on the median absolute deviation (MAD) of each user's hours, so it flags a check-in against that person's own baseline rather than a global average. Natural-Language Search is deliberately DB-free in the service: it returns structured SearchFilters plus a plain-language explanation. The api then executes those filters (tags, user, date, hours, text) over check-ins or documents and resolves relative phrases like "this week".
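A MAD-based modified z-score can be sketched as below. This is an illustration of the general technique, not the service's code (which lives in the Python service). The 0.6745 scaling constant and the 3.5 cutoff are the values commonly used with this statistic and are assumptions here, as are the function names and the handling of a zero MAD.

```typescript
// Median of a list of numbers (the input array is not mutated).
function median(values: number[]): number {
  const sorted = values.slice().sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 0 ? (sorted[mid - 1] + sorted[mid]) / 2 : sorted[mid];
}

// Modified z-score per value: 0.6745 * (x - median) / MAD.
function modifiedZScores(hours: number[]): number[] {
  if (hours.length === 0) return [];
  const med = median(hours);
  const mad = median(hours.map((h) => Math.abs(h - med)));
  // Assumption: with zero MAD there is no usable spread, so nothing is scored.
  // A production version would likely fall back to another spread estimate.
  if (mad === 0) return hours.map(() => 0);
  return hours.map((h) => (0.6745 * (h - med)) / mad);
}

// Indices of values whose absolute score exceeds the (assumed) threshold.
function flagAnomalies(hours: number[], threshold: number = 3.5): number[] {
  const flagged: number[] = [];
  modifiedZScores(hours).forEach((z, i) => {
    if (Math.abs(z) > threshold) flagged.push(i);
  });
  return flagged;
}

// One user's logged hours; the 16-hour entry is far from this user's baseline.
const hoursLogged = [8, 7.5, 8, 8.5, 8, 16];
const flaggedIndices = flagAnomalies(hoursLogged); // only index 5 is flagged
```

Because both the center and the spread are medians, a single extreme day does not inflate the baseline it is judged against, which is the reason to prefer this over a mean and standard deviation.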

03 Wire contract

One snake_case contract, mirrored as Zod, validated before it's trusted.

The service speaks snake_case on the wire. That contract is mirrored 1:1 as Zod schemas in packages/types/src/ai.ts — a direct reflection of the service's app/schemas.py — so the api validates every AI response before acting on it, and both apps share one source of truth. An inbound X-Request-ID is forwarded (or minted) so a single request traces across both services in the structured logs.
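The "validate before trusting" step might look like the sketch below. To stay dependency-free it uses a hand-written parser as a stand-in for the Zod schema; the real schemas in packages/types/src/ai.ts are not shown in this document, so the `CategorizeResponse` shape and its field names (`suggested_tag`, `confidence`) are hypothetical.

```typescript
// Hypothetical snake_case wire shape for a categorization response.
interface CategorizeResponse {
  suggested_tag: string | null;
  confidence: number;
}

// Result type in the spirit of a safe parse: success with data, or failure with a reason.
type ParseResult<T> = { ok: true; data: T } | { ok: false; error: string };

// Accepts untrusted JSON and returns typed data only if every field checks out.
function parseCategorizeResponse(body: unknown): ParseResult<CategorizeResponse> {
  if (typeof body !== "object" || body === null) {
    return { ok: false, error: "body is not an object" };
  }
  const record = body as Record<string, unknown>;

  const rawTag = record["suggested_tag"];
  let suggestedTag: string | null;
  if (rawTag === null) {
    suggestedTag = null;
  } else if (typeof rawTag === "string") {
    suggestedTag = rawTag;
  } else {
    return { ok: false, error: "suggested_tag must be a string or null" };
  }

  const confidence = record["confidence"];
  // The negated range check also rejects NaN.
  if (typeof confidence !== "number" || !(confidence >= 0 && confidence <= 1)) {
    return { ok: false, error: "confidence must be a number in [0, 1]" };
  }

  return { ok: true, data: { suggested_tag: suggestedTag, confidence } };
}
```

A body that uses camelCase keys, or omits a field, fails the parse instead of flowing into the route handler; under the error mapping described in this document, that malformed body would surface as a 502.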

Categorization is non-blocking by design

The flagship "AI suggests" touchpoint must never get in the way of typing. The request is debounced (~400 ms), fires only when no explicit #tag has been typed yet, uses a short timeout, and is set to retry: false. A slow or down service simply means the suggestion doesn't appear; check-in creation is unaffected.
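The debounce and the "no explicit tag yet" gate could be combined roughly as follows. This is a sketch under assumptions: the helper names are hypothetical, `requestSuggestion` stands in for whatever the web UI actually calls, and the rule for what counts as an explicit tag (a `#` at the start of a word followed by word characters) is a guess at the intent.

```typescript
// Assumption: an explicit tag is '#' at the start of a word, followed by word characters.
function hasExplicitTag(text: string): boolean {
  return /(^|\s)#\w+/.test(text);
}

// Generic trailing-edge debounce: only the last call within delayMs runs.
function debounce<A extends unknown[]>(
  fn: (...args: A) => void,
  delayMs: number
): (...args: A) => void {
  let timer: ReturnType<typeof setTimeout> | undefined;
  return (...args: A) => {
    if (timer !== undefined) clearTimeout(timer);
    timer = setTimeout(() => fn(...args), delayMs);
  };
}

// Wraps a hypothetical requestSuggestion callback with both the gate and the debounce.
function makeSuggestionTrigger(
  requestSuggestion: (text: string) => void,
  delayMs: number = 400
): (text: string) => void {
  return debounce((text: string) => {
    // Skip empty input and input where the user has already chosen a tag.
    if (text.trim().length === 0 || hasExplicitTag(text)) return;
    requestSuggestion(text);
  }, delayMs);
}
```

The input's change handler would call the returned function on every keystroke; the short timeout and disabled retry belong to the request itself and are outside this sketch.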

04 Graceful degradation

When the AI is down, the app stays up. Failures map to gateway codes, never a bare 500.

The shared client raises an AiServiceError that carries a status code: an upstream error becomes 502, a timeout or unreachable service becomes 504, and a malformed body becomes 502. Those map straight to { error } responses. The crucial property is that AI is always additive — with the service stopped, Document Analysis and Search return 504 while the rest of the app keeps returning 200, the project board stays fully usable, and the narrative shows an "unavailable" state.
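The failure-to-status mapping described above can be sketched like this. Only the `AiServiceError` name and the 502/504 assignments come from this document; the `AiFailure` union, the helper names, and the message strings are assumptions for illustration.

```typescript
// Error raised by the shared client; carries the gateway status to return.
class AiServiceError extends Error {
  readonly status: number;
  constructor(status: number, message: string) {
    super(message);
    this.name = "AiServiceError";
    this.status = status;
  }
}

// Hypothetical classification of what went wrong when calling the AI service.
type AiFailure =
  | { kind: "upstream_error"; upstreamStatus: number }
  | { kind: "malformed_body" }
  | { kind: "timeout" }
  | { kind: "unreachable" };

// Timeouts and unreachable hosts map to 504; upstream errors and bad bodies to 502.
function statusFor(failure: AiFailure): number {
  if (failure.kind === "timeout" || failure.kind === "unreachable") return 504;
  return 502;
}

function toAiServiceError(failure: AiFailure): AiServiceError {
  const detail =
    failure.kind === "upstream_error"
      ? "AI service responded with " + failure.upstreamStatus
      : "AI service failure: " + failure.kind;
  return new AiServiceError(statusFor(failure), detail);
}

// Shape of what a route handler would send back: a status plus an { error } body.
function toErrorResponse(err: AiServiceError): { status: number; body: { error: string } } {
  return { status: err.status, body: { error: err.message } };
}
```

Keeping the mapping in one place means every route that calls the service degrades identically, and none of them can leak a bare 500.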

Verified end-to-end

Against the real stack (Postgres + MinIO + the Dockerized AI service): analyze returned and persisted the extracted PO fields; search returned matches for both scopes (check-ins and documents) with the parsed read-back; and with the service stopped, both degraded to 504 while everything else stayed 200.

05 AI suggests / you confirm

The same posture at every touchpoint — the AI drafts, a human commits.

**Smart Categorization** (AI tag suggestion on the check-in form): "AI suggests `#authentication` 90%" appears as a chip beneath the input. Click to inject the tag; it stays fully editable.

**Document Analysis** (extracted fields with an Apply action): extracted vendor, total, and line items are shown verbatim, with a suggested status behind an **Apply** button. The reviewer confirms the move.

**AI Project Status Narrative** (generated narrative on the project): an on-demand draft renders a health pill (`on_track | at_risk | off_track`) plus a PM-editable summary. The AI drafts; the PM owns the words.

**Anomaly Detection** (panel on Insights): flagged check-ins are shown with the human-readable reason and robust z-score verbatim ("review, don't auto-correct"). Transparency over silent action.

The throughline

No GenAI touchpoint acts on its own conclusion. A suggested tag is a click, an extracted status is an Apply, a narrative is editable text, and an anomaly is surfaced for review. That single posture is what makes the AI safe to rely on.
