Flagship Project

WCP Compliance Agent

Federal payroll validation with three layers of proof. Every decision cites the law, every outcome has a paper trail, and the math is always verifiable.

V5.0.0 — Current Flagship

The Problem

Federal construction contractors must submit weekly certified payrolls proving they pay workers the legally mandated prevailing wage under the Davis-Bacon Act. Manual review is slow, expensive, and error-prone. Most automated tools treat compliance like a chatbot problem — ask a question, get an answer, hope it's right.

That doesn't work when the Department of Labor audits you three years later.

V5 — Current Flagship

V5 is a clean rebuild from lessons learned across V2, V3, and V4. Five services, each with a single responsibility, a distinct failure mode, and a clear reason to change. The LLM never writes to the database. Deterministic validation is the source of compliance truth.

Architecture

Web (React 19)                  :5173 — Upload UI, decision display
Gateway (Hono)                  :3000 — Auth, routing, SSE, security boundary
Agent (Vercel AI SDK + Mastra)  :3001 — LLM orchestration, verdict synthesis
  ├→ Compliance Core (FastAPI)  :8000 — Deterministic extraction + validation
  └→ Data Platform (FastAPI)    :8001 — Persistence, audit events (only service that writes to DB)

Pipeline

  • EXTRACT — Compliance Core parses WH-347 PDF/text into structured ExtractedWCP
  • VALIDATE — Rule engine runs 5+ checks per employee against DBWD federal rates
  • VERDICT — LLM agent synthesizes verdict with RAG context + statute citations
  • TRUST — 4-component weighted score (35/25/20/20) on decision quality
  • PERSIST — Data Platform creates DecisionRecord + AuditEvent atomically
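The five stages can be sketched as in-process function calls. This is a minimal, hedged illustration: every function and field name below is an assumption for clarity, not the project's real API, and in the actual system these stages run in separate services (PERSIST is omitted entirely, since only the Data Platform writes records).

```python
# Illustrative sketch of the EXTRACT → VALIDATE → VERDICT → TRUST stages.
# All names are hypothetical; the real stages are separate HTTP services.
from dataclasses import dataclass, field


@dataclass
class ExtractedWCP:
    """Structured WH-347 payroll data (heavily simplified)."""
    employees: list[dict] = field(default_factory=list)


@dataclass
class TrustScoredDecision:
    verdict: str        # "compliant" / "non_compliant"
    trust_score: float  # 0.0–1.0


def extract(raw: str) -> ExtractedWCP:
    # EXTRACT — stand-in for Compliance Core's WH-347 parser
    name, rate = raw.split(",")
    return ExtractedWCP(employees=[{"name": name, "rate": float(rate)}])


def validate(wcp: ExtractedWCP, prevailing_rate: float) -> list[bool]:
    # VALIDATE — deterministic rule check against a DBWD prevailing rate
    return [e["rate"] >= prevailing_rate for e in wcp.employees]


def synthesize(findings: list[bool]) -> TrustScoredDecision:
    # VERDICT + TRUST — the real agent adds RAG context and statute
    # citations on top; the compliance math is already decided here
    ok = all(findings)
    return TrustScoredDecision(
        verdict="compliant" if ok else "non_compliant",
        trust_score=0.9 if ok else 0.6,
    )


def run_pipeline(raw: str, prevailing_rate: float) -> TrustScoredDecision:
    return synthesize(validate(extract(raw), prevailing_rate))
```

Note the ordering this sketch preserves: the LLM-facing `synthesize` step receives finished findings, so it can explain and cite but cannot alter the outcome of the deterministic checks.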

Key Design Rules

  • Agent never writes to the database — Returns TrustScoredDecision; Data Platform creates official records
  • Deterministic validation is the source of compliance truth — LLM adds explanation and citations, not correctness
  • Every decision is traceable — x-request-id + x-trace-id propagate through all 16 pipeline steps
  • Mock mode with zero dependencies — VITE_MOCK_API=true LLM_MODE=mock runs the full stack locally
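The traceability rule above can be sketched as a small helper that either reuses incoming correlation headers or mints new ones. This is a framework-free illustration (the real propagation happens in Hono and FastAPI middleware), and the lowercase header keys are an assumption:

```python
# Sketch of x-request-id / x-trace-id propagation, plain Python.
import uuid


def propagate_ids(incoming: dict[str, str]) -> dict[str, str]:
    """Reuse the caller's correlation IDs if present; mint new ones otherwise.

    Assumes header names are already lowercased, as this example's caller does.
    """
    return {
        "x-request-id": incoming.get("x-request-id") or str(uuid.uuid4()),
        "x-trace-id": incoming.get("x-trace-id") or str(uuid.uuid4()),
    }
```

Because every hop applies the same rule, a decision can be joined back to its originating upload across all services by a single pair of IDs.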

Past Versions

V2 proved the concept — a three-layer compliance pipeline where the LLM cites the law, doesn't change the math.

V3 took it to production — three-service split, Mastra orchestration, 413 tests, full golden-set evaluation.

V5 is the clean rebuild — five services by responsibility, mock mode with zero deps, 271 tests.

What This Demonstrates

This is not a chatbot. This is regulatory AI infrastructure — the kind of system where failure has real consequences (back wages, debarment, class-action lawsuits). It demonstrates:

  • Service isolation by responsibility — V5's five-service monorepo means each service has a distinct failure mode, test strategy, and scaling pattern. No service does double duty.
  • Deterministic validation is the source of compliance truth — The LLM doesn't do the math. It validates the math and cites the law.
  • Trust scoring — 4-component weighted score (35/25/20/20) routes decisions to humans when certainty is low.
  • Audit trails — Every decision is traceable from input artifact through all 5 pipeline steps to a persisted record with audit events.
  • Mock mode with zero dependencies — VITE_MOCK_API=true LLM_MODE=mock runs the entire stack locally. No API keys, no database.
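The trust-score routing described above reduces to a weighted sum plus a threshold. A minimal sketch: only the 35/25/20/20 weights come from this page; the component names and the 0.80 threshold are illustrative assumptions.

```python
# Hedged sketch of a 4-component weighted trust score (35/25/20/20).
# Component names and the threshold are hypothetical.
WEIGHTS = {
    "validation_agreement": 0.35,
    "citation_quality": 0.25,
    "extraction_confidence": 0.20,
    "rule_coverage": 0.20,
}


def trust_score(components: dict[str, float]) -> float:
    """Weighted sum of per-component scores, each in [0.0, 1.0]."""
    return sum(WEIGHTS[name] * score for name, score in components.items())


def route(components: dict[str, float], threshold: float = 0.80) -> str:
    """Send low-certainty decisions to a human reviewer."""
    return "auto_approve" if trust_score(components) >= threshold else "human_review"
```

The point of the weighting is that no single strong component can mask a weak one: even perfect citations (0.25 of the total) cannot push a decision past the threshold if the deterministic validation disagrees.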