95 minutes. Now under 5. 200 people now query a database just by talking to it. Every interview finding, traced to the exact word that supports it.
Hello!

I'm David,

AI Solutions Engineer

David compressed a 95-minute manual finance process to under 5 minutes in production. He's the kind of engineer who ships infrastructure, not demos.

Paysoko Systems — Nairobi
David Mwiti Muthiru

Good — let me give you what actually matters.

I'm an AI engineer with 2+ years building production LLM systems — not prototypes. My work at Paysoko Systems runs live, serves 1,000+ concurrent users, and has processed thousands of credit memos with zero financial errors.

  • 95 min → <5 min: LangGraph credit memo pipeline, production
  • 200+ non-technical staff querying a 600-table ERP in plain English
  • 40% relevance improvement on RAG pipeline via systematic benchmarking
  • Triple-layer SQL security: zero DDL/DML, zero cross-tenant leaks
  • Based in Nairobi, Kenya (UTC+3), open to remote roles worldwide

Tell me what you're building — I'll tell you if and how I can help.

My background is in production AI systems: multi-agent LangGraph pipelines, RAG architectures with real evaluation frameworks, NL2SQL over large schemas, and voice AI with structured data extraction.

I've shipped systems that had to be right, not just impressive — financial data, legal grounding, structured interviews where every claim needs verbatim evidence. If your problem has those kinds of stakes, that's exactly my context.

Anchor is the voice AI interview engine I've been building as a product. It conducts structured interviews in a browser — with AI probing, follow-up questions, and an evidence layer that no other interview tool has.

Every claim extracted from an interview is anchored to the exact verbatim text that supports it — turn ID, character-level offsets, rubric coverage score. No floating insights. No hallucinated summaries. If a finding can't be traced to what the participant actually said, it doesn't appear.

  • ResearchKit: structured qualitative research interviews for UX teams and market researchers
  • HireSignal: STAR-format recruitment screening with evidence scorecards and CV claim verification
  • Local MVP is demoable now. Reach out to book a walkthrough.

Welcome. Here's the short version:

I'm David — I build AI systems that have to work in production, not just in demos. My systems run in a live fintech company in Nairobi, serve over a thousand users, and have zero tolerance for hallucinated financial data. I also built Anchor — a voice AI interview engine with an evidence layer that ties every finding to the exact words that support it.

Pick a direction below — or just ask me anything.

I'm an AI engineer based in Nairobi, Kenya, specialising in production multi-agent systems, RAG pipelines, and voice AI.

I joined Paysoko Systems as the founding AI engineer and built the company's entire LLM infrastructure from scratch. Two systems I'm most proud of: a LangGraph pipeline that took credit memo production from 95 minutes to under 5, and a schema-aware NL2SQL engine that lets 200+ non-technical staff query a 600-table ERP in plain English — both in live production.

Outside my day job, I built Anchor, a voice AI interview platform, and BongaLaw, a RAG-grounded civic AI for Kenyan constitutional law with full retrieval benchmarking via TruLens.

  • 2+ years in production LLM systems
  • Builder of Anchor voice AI interview engine
  • Open to remote roles worldwide
  • Nairobi, Kenya (UTC+3)

My stack is built around shipping production AI systems — not proof-of-concepts.

LLM & Agents

Backend & Infrastructure

Vector & Evaluation

Here's what I've shipped. I'll walk you through them one at a time.

Stateful, rubric-driven voice interview engine. Conducts structured interviews in-browser with AI probing and follow-up questions. Evidence layer: every finding tied to verbatim text — turn_id, char_start, char_end, anchor_text. Verticals: ResearchKit (qualitative UX/market research) and HireSignal (STAR-format recruitment screening with CV claim verification). Stack: FastAPI + WebSockets, PostgreSQL, Redis, multi-provider STT/TTS/LLM abstraction.
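A minimal sketch of what an evidence anchor could look like, using the four field names listed above. Everything else here is an assumption for illustration: the `EvidenceAnchor` class, the `make_anchor` and `is_valid` helpers, the half-open offset convention, and the sample transcript are hypothetical and not taken from Anchor's codebase.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class EvidenceAnchor:
    turn_id: str
    char_start: int  # assumed inclusive offset into the turn's text
    char_end: int    # assumed exclusive offset
    anchor_text: str


def make_anchor(turns: dict[str, str], turn_id: str, quote: str) -> EvidenceAnchor | None:
    """Anchor a quote to its first verbatim occurrence in a turn, else return None."""
    text = turns.get(turn_id)
    if text is None or not quote:
        return None
    start = text.find(quote)
    if start == -1:
        return None  # no verbatim support, so the finding would be dropped
    return EvidenceAnchor(turn_id, start, start + len(quote), quote)


def is_valid(anchor: EvidenceAnchor, turns: dict[str, str]) -> bool:
    """Check that the stored offsets slice out exactly anchor_text."""
    text = turns.get(anchor.turn_id, "")
    return text[anchor.char_start:anchor.char_end] == anchor.anchor_text


# Hypothetical transcript with a single participant turn.
turns = {"t1": "I stopped using the export feature because it kept timing out."}
anchor = make_anchor(turns, "t1", "it kept timing out")
unsupported = make_anchor(turns, "t1", "it was too slow")
```

Here `anchor` resolves to a span inside turn `t1`, while `unsupported` is `None` because the paraphrase never appears verbatim in the transcript.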

RAG-grounded Q&A over Kenyan constitutional law. Every response backed by retrieved source snippets with article citations. Multi-agent debate system with SSE streaming. Benchmarked three retrieval strategies (sentence-level, chunk-merge, hybrid) using TruLens — Answer Relevance, Context Precision, Groundedness — with PCA and t-SNE embedding visualisation. Stack: Next.js, FastAPI, ChromaDB, Anthropic Claude API.

5-phase LangGraph multi-agent pipeline in production: Extraction → PII Masking → Financial Analyst → Document Architect → QA Daemon. Reduced credit memo production from up to 95 minutes to under 5 minutes. 100% deterministic financial ratio accuracy — QA Daemon re-computes every ratio independently and raises a hard error on any discrepancy. PII intercepted before every LLM call. Dual deployment: cloud and air-gapped Ollama. Serves 1,000+ concurrent users.
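The independent re-computation step can be illustrated with a small sketch. It assumes two example ratios, a `Decimal` tolerance, and hypothetical names (`recompute_ratios`, `qa_check`, `RatioMismatchError`); the production pipeline's actual ratio set and error types are not described here.

```python
from decimal import Decimal


class RatioMismatchError(Exception):
    """Raised when a reported ratio disagrees with the recomputed value."""


def recompute_ratios(financials: dict) -> dict:
    # Hypothetical ratio set, chosen only for illustration.
    return {
        "current_ratio": financials["current_assets"] / financials["current_liabilities"],
        "debt_to_equity": financials["total_debt"] / financials["total_equity"],
    }


def qa_check(financials: dict, reported: dict, tolerance: Decimal = Decimal("0.0001")) -> None:
    """Recompute every ratio from source figures and fail hard on any discrepancy."""
    for name, expected in recompute_ratios(financials).items():
        if name not in reported:
            raise RatioMismatchError(f"{name}: missing from memo")
        if abs(reported[name] - expected) > tolerance:
            raise RatioMismatchError(
                f"{name}: memo says {reported[name]}, recomputed {expected}"
            )


# Made-up figures for the example.
financials = {
    "current_assets": Decimal("500"),
    "current_liabilities": Decimal("250"),
    "total_debt": Decimal("300"),
    "total_equity": Decimal("600"),
}
qa_check(financials, {"current_ratio": Decimal("2"), "debt_to_equity": Decimal("0.5")})
```

The check deliberately returns nothing on success and raises on failure, so a mismatched memo cannot pass through silently.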

Natural language to SQL over a 600+ table MySQL ERP. Enabled 200+ non-technical staff across HR, procurement, finance, and lending to query data in plain English. Triple-layer security: AST-level SQL rewriting prevents all DDL/DML and cross-tenant data leakage. Latency: ~4s → under 1.6s via Redis caching and prompt optimisation. 40% improvement in answer relevance scores measured with LangSmith. Live in production.
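To show the shape of a read-only gate, here is a deliberately simpler stand-in: a token-level check using only the standard library. It is not the AST-level rewriting described above, it does not cover tenant scoping, and the `FORBIDDEN` list and `is_read_only` function are hypothetical. The check errs on the side of rejecting queries, so a forbidden word inside a string literal or comment also causes a rejection.

```python
import re

# Hypothetical deny-list of keywords that change data or schema, or write files.
FORBIDDEN = {
    "insert", "update", "delete", "replace", "drop", "alter", "create",
    "truncate", "rename", "grant", "revoke", "call", "set", "lock", "into",
}


def is_read_only(sql: str) -> bool:
    """Conservatively accept only a single SELECT/WITH statement."""
    body = sql.strip().rstrip(";").strip()
    if not body or ";" in body:
        return False  # empty input or stacked statements
    words = re.findall(r"[a-z0-9_]+", body.lower())
    if not words or words[0] not in {"select", "with"}:
        return False
    # Whole-word match, so a column such as updated_at is not mistaken for UPDATE.
    return not FORBIDDEN.intersection(words)


allowed = is_read_only("SELECT name FROM employees WHERE dept = 'HR'")
blocked = is_read_only("SELECT 1; DELETE FROM loans")
```

A parser-based approach can distinguish keywords from identifiers and literals, which this token check cannot; that is the main reason to prefer AST-level validation in production.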

Multi-agent loan workflow for East African SMEs. Pipeline: Orchestrator → Document Extraction → Underwriting → Pricing → Human Approval. Pauses on missing data, requests it, resumes on repair — no silent failures. Built on Google ADK architecture, localized for Kenya. Stack: Google ADK, Gemini, Python, structured output schemas.
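The pause, request, and resume behaviour can be sketched as a tiny state machine in plain Python. This does not use the Google ADK API; the `LoanState` class, the `advance` and `supply` functions, the status names, and the required fields are all assumptions made for illustration.

```python
from dataclasses import dataclass, field

# Hypothetical fields the underwriting step needs before it can proceed.
REQUIRED_FIELDS = ("business_name", "annual_revenue", "loan_amount")


@dataclass
class LoanState:
    data: dict = field(default_factory=dict)
    status: str = "running"  # running | waiting_for_data | awaiting_approval
    missing: list = field(default_factory=list)


def advance(state: LoanState) -> LoanState:
    """Pause with an explicit list of missing fields instead of guessing values."""
    state.missing = [name for name in REQUIRED_FIELDS if state.data.get(name) is None]
    state.status = "waiting_for_data" if state.missing else "awaiting_approval"
    return state


def supply(state: LoanState, **values) -> LoanState:
    """Merge the requested data and resume from where the workflow paused."""
    state.data.update(values)
    return advance(state)


state = advance(LoanState(data={"business_name": "Example Traders"}))
paused_on = list(state.missing)
state = supply(state, annual_revenue=1_200_000, loan_amount=300_000)
```

After the first call the workflow is paused and `paused_on` names exactly what to request; once the data is supplied it moves on to the human approval step.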

  • About Anchor
  • Let's Talk

Open to remote engineering roles worldwide and conversations about Anchor. Best way to reach me is email — I respond within 24 hours.

david.mwiti.muthiru@gmail.com
+254 716 003 852

I'll respond within 24 hours. If it's urgent, WhatsApp is faster: +254 716 003 852

Hey — I'm David's AI pitch agent.

Ask me what you're building or what you need, and I'll tell you whether David is the right person — and specifically why. Or use the nav below.

My main production work is at Paysoko Systems in Nairobi, where I joined as the founding AI engineer. I built the company's entire LLM infrastructure from scratch — a 5-phase LangGraph pipeline and a schema-aware NL2SQL engine, both live in production serving 1,000+ users.

Before that: building Anchor and BongaLaw as my own projects, with full evaluation frameworks rather than vibes-based testing.

Here's my full CV as a PDF.

David Mwiti AI Engineer