Articles

Technical articles on production LLM systems: data integration, agents that stay up, evaluation, and reliability.

What Bank-Grade Key Management Teaches You About Agent Eval Harnesses

Five disciplines from banking security — durable state, deterministic failure, dual control, audit trails, and recovery playbooks — applied to LLM agent evaluation.

Apr 18, 2026 · 5 min read
#Agent Evals #Verifiable Systems #LLM Production #Banking #MCP

Where CrewAI Breaks in Production — and What to Use Instead

The role abstraction in CrewAI works for demos and struggles under production load. Four specific failure modes and the LangGraph patterns that replaced them.

Jan 15, 2025 · 8 min read
#CrewAI #Multi-Agent #LangGraph #Production

State as the API: LangGraph After Three Rewrites

The state schema is the most consequential design decision in LangGraph. Three iterations on how to model it — and why channels with reducers is the right primitive.

Jan 8, 2025 · 12 min read
#LangGraph #LLM #Multi-Agent #Orchestration

RAG in Production: Fix Chunking and Re-Ranking Before Touching Embeddings

Most RAG pipelines fail on chunking or re-ranking before they fail on embedding quality. A diagnostic-first framework for finding and fixing the right bottleneck.

Dec 20, 2024 · 9 min read
#RAG #Retrieval #LLM #Production