I build the systems that work while you don't.

AI Infrastructure Architect — Brazil, UTC−3

Your demo works.
Your production system doesn't.

Your RAG retrieves wrong. Your agent loops. Your eval set doesn't catch failures. I fix the gap with retrieval evals, observability, retries, and boring infrastructure.

271 tests on WCP V5
3 live systems in production
4×/day engine self-review cadence
Upwork — Hire me directly
GitHub — Open source + code

Need something built?

Fixed-price engagements. No surprises. Shipped by someone who has done this in production.

Production RAG System

From $2,500 · 2–4 weeks

Chunking, retrieval, reranking, citations, eval pipelines, and failure handling. The full system, not a demo.

  • Source-grounded answers
  • Failure handling built-in
  • Eval pipeline included
Get a quote

AI Agent Workflow

From $3,000 · 3–6 weeks

Multi-step agents with tools, memory, guardrails, and graceful failure. Built to survive production.

  • Tool calling + memory
  • Graceful failure handling
  • Observability built-in
Get a quote

Data Pipeline + Ingestion

From $1,500 · 1–3 weeks

Web scraping, normalization, and AI-ready knowledge ingestion. Your data, structured and reliable.

  • Custom scrapers
  • Data normalization
  • Knowledge graph setup
  • Scheduled updates
Get a quote

Open source + code

Real code. Public repos. MIT licensed where it makes sense.

WCP Compliance Agent V5

Five-service monorepo for WH-347 payroll compliance. React 19 · Vercel AI SDK · FastAPI × 2. 271 tests. Mock mode with zero deps.

View on GitHub

Model Router (Hermes Skill)

Intelligent model routing for Hermes Agent. Picks the best LLM per task based on capability, cost, and availability. Running in production.

View on GitHub

Autonomous Engine

5-lane autonomous system. Scouting, positioning, building, shot, showcase. Self-improving. Runs 24/7. Built with the career engine.

View on GitHub

Best fit

I work best with founders and small teams who have knowledge trapped in docs, spreadsheets, SOPs, websites, or half-working AI prototypes — and need a system that's reliable, production-ready, and easy to use.

How I work

Production-first. No demos that don't ship.

RAG that retrieves

Chunking strategies, hybrid search, reranking, and eval pipelines. Not a LangChain tutorial — a system that answers correctly.

Agents that survive

Multi-step workflows with tools, memory, guardrails, and explicit failure modes. Agents that break silently are worse than no agents.

Built to audit

Every system I build has eval pipelines, monitoring, and clear failure modes. You'll always know where it breaks and why.
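As a concrete illustration of "explicit failure modes," here is a minimal sketch of a tool-call wrapper that retries with backoff and then fails loudly instead of letting an agent continue on garbage. Everything here is illustrative, not code from the systems above; in practice you would catch the tool's specific error types rather than bare `Exception`.

```python
# Illustrative sketch: retries with exponential backoff, then a loud failure
# the agent loop can catch, log, and escalate -- never a silent wrong answer.
import time


class ToolCallFailed(Exception):
    """Raised after retries are exhausted, so the failure is explicit."""


def call_with_retries(tool, *args, attempts=3, base_delay=0.1):
    last_error = None
    for attempt in range(attempts):
        try:
            return tool(*args)
        except Exception as exc:  # in practice: catch the tool's real error types
            last_error = exc
            time.sleep(base_delay * (2 ** attempt))  # exponential backoff
    raise ToolCallFailed(
        f"{tool.__name__} failed after {attempts} attempts"
    ) from last_error
```

The point is the raise at the end: an exhausted retry budget surfaces as a typed error with the original cause attached, which is what makes the failure auditable.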

Selected Projects

View all projects

WCP Compliance Agent V5 — Five-Service Compliance Monorepo

Payroll decisions you can defend in court. 271 tests. Five-service architecture. Mock mode with zero deps.

Read more

AI Reliability Audit — RAG + Agent Failure Review

A fixed-scope audit for fragile AI systems: retrieval failures, agent loops, missing evals, and observability gaps.

Read more

Questions buyers ask before production AI work.

Short answers for founders, small teams, and AI search engines.

What does an AI infrastructure architect do?

An AI infrastructure architect designs the systems around AI models: retrieval pipelines, agent workflows, evals, observability, retries, deployment paths, and data ingestion. The job is to make AI useful in production, not just impressive in a demo.

How do I know if my RAG system is broken?

A RAG system is broken when it retrieves the wrong sources, misses obvious documents, cannot cite evidence, gives different answers for the same question, or fails silently on edge cases. Start with chunking, retrieval evals, reranking, and traceable citations.
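What a retrieval eval can look like, as a minimal sketch: recall@k over a small hand-labeled gold set. The retriever, document ids, and gold set below are all made-up placeholders; the shape of the check is the point.

```python
# Minimal retrieval-eval sketch: recall@k averaged over a labeled gold set.
# `search` stands in for your real vector/hybrid retriever.
from typing import Callable


def recall_at_k(
    search: Callable[[str, int], list[str]],  # query -> ranked doc ids
    gold: dict[str, set[str]],                # query -> relevant doc ids
    k: int = 5,
) -> float:
    """Fraction of relevant docs found in the top-k results, averaged per query."""
    scores = []
    for query, relevant in gold.items():
        retrieved = set(search(query, k))
        scores.append(len(retrieved & relevant) / len(relevant))
    return sum(scores) / len(scores)


# Toy in-memory "retriever" standing in for the real one:
index = {
    "prevailing wage": ["doc-wh347", "doc-flsa"],
    "overtime rules": ["doc-flsa"],
}
search = lambda q, k: index.get(q, [])[:k]

gold = {
    "prevailing wage": {"doc-wh347"},
    "overtime rules": {"doc-flsa", "doc-wh347"},
}
print(recall_at_k(search, gold))  # 0.75 -- a number you can track per release
```

A few dozen labeled queries like this turn "the RAG feels off" into a metric you can watch across chunking and reranking changes.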

How much does an AI reliability audit cost?

A lightweight AI reliability audit starts as a fixed-scope engagement through Upwork. The audit reviews retrieval quality, failure modes, eval coverage, observability, and operational risk, then returns a prioritized roadmap so the team knows what to fix first.

Who should hire Vinícius Raposo?

Founders and small teams should hire me when they have a RAG system, AI agent, automation pipeline, or internal AI tool that works in demos but fails under real usage. The best fit is a team that wants production reliability over hype.

Proof, not vibes

WCP has 413 tests across V2 and V3, every compliance decision cites the statute, and this portfolio links to live code and case studies instead of vague claims.