catallaxy
The Calculation Lab
Forward showcase concept. This is intentionally marked WIP until the public repo exists and contains code.
- Proves: the knowledge problem is computable
- Status: Designed
Big 6 WIP direction, shipped proof, and every public repository.
The Big 6 stays visible as the build-in-public direction. Existing public repos are organized by relevance: shipped proof first, then production-AI systems, supporting infrastructure, meta repos, archives, and study projects.
These six forward showcase concepts are intentionally marked WIP until their public repos exist and contain code:
- The Calculation Lab
- The Glass Box
- Corpus Autopsy
- Real-Yield Calculator → Verdict Gallery
- The Time Machine
- The Operating Console
The strongest existing repos to inspect first.
Production-ready WH-347 compliance platform: deterministic validation decides, the LLM explains, and every decision is traceable.
Five-service WH-347 compliance platform with deterministic validation, LLM explanation, trace propagation, internal service auth, 260 source-collected unit tests, 24 integration tests, and 92 golden-set examples.
Controlled AI agent framework and conceptual ancestor of understudy.
ARIA (Agentic Reasoning & Integration Architecture): a controlled AI agent framework with a tool registry, approval gates, memory, and execution tracing.
Grounded document QA / RAG proof with citations and approval workflows.
RAG-powered document QA system with OCR, template detection, and approval workflows
LLM proxy with routing, guardrails, cost control, and fallback.
Enterprise LLM proxy with routing, guardrails, cost control, and fallback
Agent observability and replay SDK with cost attribution.
Observability and replay SDK for agentic AI workflows with cost attribution
Regression testing framework for RAG and agentic AI.
Regression testing framework for RAG and agentic AI with compliance rule engine
Autonomous issue-to-PR workflow with a safety boundary.
AI agent that reads GitHub issues, plans code fixes, applies edits, runs tests, and opens pull requests autonomously
Unified AI operations platform with microservice architecture.
Unified AI operations platform with microservices architecture
Document ingestion, parsing, chunking, and embedding pipeline.
Multi-format document ingestion pipeline: PDF, DOCX, HTML parsing, semantic chunking, and pgvector embedding storage
RAG evaluation harness: hit-rate, MRR, faithfulness, batch evals.
RAG evaluation framework: hit-rate, MRR, faithfulness scoring, and async batch evaluation with golden question datasets
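For readers unfamiliar with the two retrieval metrics named above, here is a minimal sketch of how hit-rate@k and MRR are commonly computed. It is not taken from the repo: the function names, argument shapes, and the sample IDs are assumptions made for illustration.

```python
from typing import Sequence, Set


def hit_rate_at_k(
    retrieved: Sequence[Sequence[str]],
    relevant: Sequence[Set[str]],
    k: int,
) -> float:
    """Fraction of queries with at least one relevant doc in the top k results."""
    if not retrieved:
        return 0.0
    hits = sum(
        1
        for docs, rel in zip(retrieved, relevant)
        if any(doc in rel for doc in docs[:k])
    )
    return hits / len(retrieved)


def mean_reciprocal_rank(
    retrieved: Sequence[Sequence[str]],
    relevant: Sequence[Set[str]],
) -> float:
    """Average of 1/rank of the first relevant doc per query (0 if none found)."""
    if not retrieved:
        return 0.0
    total = 0.0
    for docs, rel in zip(retrieved, relevant):
        for rank, doc in enumerate(docs, start=1):
            if doc in rel:
                total += 1.0 / rank
                break
    return total / len(retrieved)


# Hypothetical golden set: ranked results per question and the relevant IDs.
ranked = [["a", "b", "c"], ["x", "y", "z"], ["p", "q", "r"]]
golden = [{"b"}, {"x"}, {"missing"}]
print(hit_rate_at_k(ranked, golden, k=2))
print(mean_reciprocal_rank(ranked, golden))
```

In this sample, two of three questions have a relevant document in the top two results, and the first relevant documents sit at ranks 2 and 1, with the third question finding none.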
Self-hosted LLM cost and latency observability.
LLM observability SDK: track token costs, latency, and model usage per request with a FastAPI dashboard
Local knowledge-base API with backlinks and citation-grounded chat.
Markdown vault API: wikilink parsing, bidirectional backlinks graph, keyword search, and citation-grounded chat over local notes
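As a rough illustration of wikilink parsing and a bidirectional backlinks graph, the sketch below extracts `[[Target]]` links from note bodies and inverts them into a backlinks map. The regex, the function names, and the support for `[[Target|alias]]` and `[[Target#heading]]` forms are assumptions based on common wikilink conventions, not the repo's actual implementation.

```python
import re
from collections import defaultdict
from typing import Dict, List, Set

# Matches [[Target]], [[Target|alias]] and [[Target#heading]] (assumed syntax).
WIKILINK = re.compile(r"\[\[([^\]|#]+)(?:#[^\]|]*)?(?:\|[^\]]*)?\]\]")


def extract_links(text: str) -> List[str]:
    """Return link targets in the order they appear in the note body."""
    return [match.group(1).strip() for match in WIKILINK.finditer(text)]


def build_backlinks(notes: Dict[str, str]) -> Dict[str, Set[str]]:
    """Map each link target to the set of notes that link to it."""
    backlinks: Dict[str, Set[str]] = defaultdict(set)
    for source, text in notes.items():
        for target in extract_links(text):
            if target != source:  # ignore self-links
                backlinks[target].add(source)
    return dict(backlinks)


# Hypothetical three-note vault.
vault = {
    "A": "See [[B]] and [[C|the C note]].",
    "B": "Back to [[A#Intro]].",
    "C": "No links here.",
}
print(build_backlinks(vault))
```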
LLM-powered support simulator with policy grounding and prompt-injection tests.
LLM-powered customer support simulator: configurable personas, scenario scripting, policy grounding, and prompt injection detection
DAG-based async workflow orchestrator and YAML workflow DSL.
DAG-based async workflow orchestrator with Celery, dependency resolution, retry logic, and a YAML workflow DSL
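Dependency resolution in a DAG workflow can be sketched with the standard library alone. The example below assumes a workflow already parsed into a plain dict mapping each task to the tasks it depends on, and groups tasks into batches that could run concurrently. The task names and the `ready_batches` helper are hypothetical. Celery dispatch, retries, and YAML loading are deliberately left out.

```python
from graphlib import CycleError, TopologicalSorter
from typing import Dict, List


def ready_batches(workflow: Dict[str, List[str]]) -> List[List[str]]:
    """Group tasks into batches whose dependencies are all satisfied.

    `workflow` maps a task name to the names of the tasks it depends on.
    Raises graphlib.CycleError if the dependencies contain a cycle.
    """
    sorter = TopologicalSorter(workflow)
    sorter.prepare()  # validates the graph; raises CycleError on cycles
    batches: List[List[str]] = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted for stable output
        batches.append(ready)
        sorter.done(*ready)  # mark the batch finished to unlock dependents
    return batches


# Hypothetical workflow, as it might look after parsing a YAML definition.
pipeline = {
    "extract": [],
    "transform": ["extract"],
    "validate": ["extract"],
    "load": ["transform", "validate"],
}
print(ready_batches(pipeline))
```

A real orchestrator would submit each ready task to a worker and call `done` as individual tasks complete, rather than waiting for a whole batch.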
GitHub profile README repo.
GitHub profile README: public face for the current positioning, Big 6 WIP direction, and public repo catalog.
RPG systems simulator for combat, loot, XP, and Monte Carlo balancing.
TypeScript + Python hybrid RPG simulator: combat rounds, weighted loot drops, XP curves, and Monte Carlo battle analysis
Hermes Agent fork / agent system reference.
The agent that grows with you
Shared Python foundation for operator-systems projects.
Shared Python library providing logging, database, Redis, error-handling, and LLM-client utilities; the foundation for all operator-systems projects.
Reusable FastAPI/Celery/Alembic/pytest project template.
Standard project template for operator-systems portfolio: FastAPI, Celery, Alembic, pytest, ruff, Docker Compose
Small JavaScript testing / CI study project.
Study project: a JavaScript palindrome checker with a test suite, used for learning CI/CD and unit-testing patterns.
Event ingestion and WebSocket live dashboard stack.
High-throughput event ingestion API with in-memory ring buffer, JSONL persistence, and WebSocket live-streaming for real-time dashboards
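The combination of an in-memory ring buffer with JSONL persistence can be sketched as follows. This is an illustrative guess at the pattern, not the repo's code: the `EventRingBuffer` class, its method names, and the write-before-buffer ordering are assumptions, and the WebSocket streaming layer is omitted.

```python
import io
import json
from collections import deque
from typing import Any, Deque, Dict, List, Optional, TextIO

Event = Dict[str, Any]


class EventRingBuffer:
    """Keep the newest `capacity` events in memory; append all events as JSONL."""

    def __init__(self, capacity: int, sink: TextIO) -> None:
        # deque with maxlen drops the oldest item once capacity is reached
        self._events: Deque[Event] = deque(maxlen=capacity)
        self._sink = sink

    def append(self, event: Event) -> None:
        # Persist first (one JSON object per line), then buffer in memory.
        self._sink.write(json.dumps(event) + "\n")
        self._events.append(event)

    def recent(self, n: Optional[int] = None) -> List[Event]:
        """Return buffered events oldest-first, optionally only the last n."""
        events = list(self._events)
        if n is None:
            return events
        return events[max(len(events) - n, 0):]


# An in-memory sink stands in for a log file opened in append mode.
sink = io.StringIO()
buffer = EventRingBuffer(capacity=3, sink=sink)
for i in range(5):
    buffer.append({"id": i})
print(buffer.recent())
print(sink.getvalue().count("\n"))
```

The buffer holds only the three newest events for live dashboards, while the JSONL sink retains all five for replay.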
AI learning and experimentation repo.
AI learning and experimentation repo — structured study notes, code experiments, and research on RAG systems, agent architectures, and LLM evaluation methods.
Archived TypeScript predecessor to WCP V5.
Archived — TypeScript predecessor. See WCP-Compliance-Agent-V5 for the current Python monorepo.
Archived Python predecessor to WCP V5.
Archived predecessor to WCP V5: a three-service payroll compliance system. See WCP-Compliance-Agent-V5 for the current version.