Essays, observations, and arguments — on AI architecture, the nature of systems, the road, and whatever else demands more than a paragraph.
Every impressive AI demo conceals the same truth: reliability at scale is a completely different engineering problem from correctness in controlled conditions. Here is what the gap actually looks like from the inside.
The conversation is stuck on chat interfaces. The unit of work has moved from the query to the goal. Here is what that actually means for how we build.
Model Context Protocol is not just a tool integration format. It is the first serious attempt to standardize how AI agents interact with the world. What this means for how we architect systems.
I drove every intact stretch of Route 66 from Chicago to Santa Monica. The road is not a tourist attraction. It is a cross-section of American time, frozen in asphalt.
Every unmaintainable system I have worked on became that way because someone confused complexity with sophistication. The argument for simplicity as a moral position in engineering.
If the universe is vast and old and life is not rare, where is everyone? Every proposed answer implies something extraordinary about our situation. The silence is data.
Teams spend 80% of their time on model selection and 5% on evaluation. This is backwards. The case for building evals first, running them continuously, and treating regression as a critical bug.
Split Rock Lighthouse, Pictured Rocks, the Boundary Waters by canoe. The Great Lake that looks like an ocean and behaves like one. Notes from the northern edge of the continent.
Not because superintelligence is imminent. Because the systems being built today are already consequential. The case for treating alignment not as a research concern but as an engineering discipline.
A demo that works 90% of the time is impressive. A production system that works 90% of the time is broken. This sounds obvious written down. In practice, the distinction collapses in every sales meeting, every board presentation, every proof-of-concept review I have ever attended.
The gap is not a secret. It is just systematically ignored because the incentives on both sides of the table push toward ignoring it. The vendor wants to close the deal. The buyer wants to believe. The demo is optimized for the best case. Production is defined by the worst.
The hard 10% is where real engineering lives. The edge cases, the adversarial inputs, the ambiguous instructions, the cascading errors, the 3am pages. Most teams underinvest in the error handling, observability, and fallback design that separate a compelling prototype from a reliable system — not because they are lazy, but because these things are invisible in demos and visible only in production.
What does the gap actually look like? In healthcare AI — my primary domain — it looks like this: a prior authorization model that performs at 94% accuracy in testing and 76% in production, because the test set was cleaned and the production data is not. It looks like an agent that handles the happy path beautifully and fails silently on the 15% of cases that require a document format the training data never included. It looks like a system that works perfectly until the downstream API it depends on starts returning 429s, and nobody built a retry strategy or a graceful degradation path.
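That missing retry strategy is not exotic engineering; it is a few dozen lines that nobody budgeted for. A minimal sketch, assuming a hypothetical client that raises an exception on HTTP 429 (the function and exception names here are illustrative, not any particular library's API):

```python
import random
import time


class RateLimited(Exception):
    """Raised by the (hypothetical) client when the downstream API returns 429."""


def call_with_backoff(call, max_attempts=5, base_delay=0.5, fallback=None):
    """Retry a rate-limited call with exponential backoff and jitter.

    If every attempt is rejected, return the fallback instead of
    crashing the pipeline: degrade gracefully rather than fail silently.
    """
    for attempt in range(max_attempts):
        try:
            return call()
        except RateLimited:
            if attempt == max_attempts - 1:
                break
            # Exponential backoff with jitter: 0.5s, 1s, 2s, ... plus noise
            # so that retrying clients do not stampede in lockstep.
            time.sleep(base_delay * (2 ** attempt) + random.uniform(0, 0.1))
    return fallback
```

The graceful degradation path is the `fallback` argument: a cached result, a queued-for-human marker, anything other than a silent failure.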
Three things separate teams that close the gap from those that do not. First: they build evals before they build features. You cannot know whether you are improving a system if you have no way to measure it. Second: they instrument everything. If you cannot observe it, you cannot debug it. Logs, traces, and structured outputs are not overhead — they are the system's ability to explain itself. Third: they design for the exit condition in every agent loop. What happens when the goal cannot be reached? What happens when the tool fails? What happens when the model returns something unexpected? The teams that answer these questions before they ship are the ones whose systems stay up.
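Those three exit questions translate directly into the shape of the loop. A sketch of the skeleton, where every way out is explicit and observable (the `plan_step`, `run_tool`, and `validate` callables are placeholders, not a real framework):

```python
from enum import Enum


class Outcome(Enum):
    DONE = "done"                        # goal reached
    BUDGET_EXCEEDED = "budget_exceeded"  # goal not reachable within the step limit
    TOOL_FAILED = "tool_failed"          # a tool failed with no fallback
    BAD_OUTPUT = "bad_output"            # model output failed validation


def run_agent(goal, plan_step, run_tool, validate, max_steps=10):
    """Agent loop where every exit condition is an explicit, named outcome."""
    history = []
    for _ in range(max_steps):
        action = plan_step(goal, history)       # what does the model want to do next?
        if action is None:                      # model signals the goal is met
            return Outcome.DONE, history
        if not validate(action):                # model returned something unexpected
            return Outcome.BAD_OUTPUT, history
        try:
            history.append(run_tool(action))    # execute against the real tool
        except Exception:
            return Outcome.TOOL_FAILED, history # surface it; never fail silently
    return Outcome.BUDGET_EXCEEDED, history     # limit hit: hand off to a human
```

Every non-`DONE` outcome is a routing decision someone made before shipping, not a stack trace at 3am.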
The gap is closable. It requires different skills than demo-building — less creativity, more rigor; less novelty, more robustness. The engineers who do this work are rarely the ones in the conference room presentations. They are the ones keeping the system running at 3am. They deserve more credit than they get, and their concerns deserve more weight in the architecture discussions that happen before the demo is ever built.
The dominant mental model for AI in most organizations is still the chat interface. Question in, answer out. The model as an oracle you consult. This is the wrong frame, and holding it will cause organizations to systematically underinvest in the capability that will matter most in the next five years.
Agentic AI systems do not answer questions. They pursue goals. The difference is architectural, operational, and strategic — not cosmetic. The unit of work has moved from the query to the objective. A query-response system generates text. An agentic system observes state, plans a sequence of actions, executes them using real tools, evaluates the results, and iterates until the goal is reached or a limit is hit.
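The architectural difference is visible even at the level of pseudocode shape. Query-response is a pure function call; an agent is a control loop over external state. A minimal sketch, with hypothetical `observe`, `plan`, `execute`, and `goal_met` callables standing in for real components:

```python
def query_response(model, query):
    # The oracle frame: one call, text in, text out, no side effects.
    return model(query)


def pursue_goal(goal, observe, plan, execute, goal_met, max_iters=20):
    """The agent frame: a feedback loop over real-world state.

    Each iteration reads state, plans actions, acts through tools,
    and re-evaluates. The loop, not the model call, is the system.
    """
    for _ in range(max_iters):
        state = observe()                  # read the world as it is now
        if goal_met(state, goal):
            return state                   # exit: objective satisfied
        for action in plan(state, goal):   # model proposes a sequence of actions
            execute(action)                # act through real tools, with side effects
    return observe()                       # limit hit: return best-known state
```

Everything that distinguishes agentic systems operationally, from observability to trust calibration, attaches to that loop rather than to any single model invocation.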
In healthcare revenue cycle management — where I spend most of my time — this means the difference between a model that answers "what is the prior authorization status for this claim?" and a system that identifies claims at risk of denial, retrieves the relevant clinical guidelines, checks the patient's coverage history, flags the discrepancy between the documented diagnosis and the requested procedure, and drafts the appeal letter with supporting evidence before a human ever touches the case.
These are not the same technology deployed differently. They require different architecture, different observability, different failure modes, different trust calibration, and different organizational structures to support them. Organizations that treat agentic systems as chatbots with more steps will build chatbots with more steps. The capability gap between them and the organizations that understand the distinction will compound over time.
Before Model Context Protocol, every AI tool integration was a bespoke contract between a model and a specific system. You wrote a function, described it in the model's preferred schema, handled the authentication, parsed the response, and repeated for every tool you needed. The result was integration code that was brittle, non-portable, and tightly coupled to a specific model's API conventions.
MCP changes the architecture by introducing a standard server-client protocol for tool integration. An MCP server exposes a set of tools. Any MCP-compatible agent can use them. The integration is written once and works everywhere. This is the same insight that made REST APIs transformative — standardize the interface, and composability follows.
The practical implication for how I build systems: I now design MCP servers as first-class architectural components. One server per integration surface — ServiceNow, EHR systems, document stores, internal APIs. Each server handles its own authentication, rate limiting, and error handling. The agent orchestration layer never needs to know the implementation details of any particular integration. When a new tool needs to be added, it is a new MCP server, not a modification to the agent.
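The pattern is easier to see in code than in prose. What follows is not the actual MCP SDK; it is a deliberately simplified stand-in that illustrates the architectural shape — one server per integration surface, owning its own rate limiting and error handling, exposing named tools behind a uniform call interface:

```python
import time


class ToolServer:
    """Hypothetical stand-in for an MCP server: one per integration surface.

    The orchestration layer sees only (tool name, arguments) -> result.
    Rate limiting and error handling live inside the server, invisible
    to the agent that calls it.
    """

    def __init__(self, name, min_interval=0.0):
        self.name = name
        self._tools = {}
        self._min_interval = min_interval   # crude per-server rate limit
        self._last_call = 0.0

    def tool(self, fn):
        """Register a function as a named tool (used as a decorator)."""
        self._tools[fn.__name__] = fn
        return fn

    def list_tools(self):
        return sorted(self._tools)

    def call(self, tool_name, **kwargs):
        if tool_name not in self._tools:
            return {"ok": False, "error": f"unknown tool: {tool_name}"}
        wait = self._min_interval - (time.monotonic() - self._last_call)
        if wait > 0:
            time.sleep(wait)                # respect the server's own rate limit
        self._last_call = time.monotonic()
        try:
            return {"ok": True, "result": self._tools[tool_name](**kwargs)}
        except Exception as exc:            # errors surface as data, not crashes
            return {"ok": False, "error": str(exc)}


# Adding a new integration is a new server, not a modification to the agent:
claims = ToolServer("claims-api")


@claims.tool
def claim_status(claim_id: str) -> str:
    # Placeholder body: a real server would call the upstream system here.
    return f"status for {claim_id}"
```

The real protocol adds transport, discovery, and schema negotiation, but the composability argument lives entirely in that interface boundary.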
This composability is not just a developer convenience. It is a strategic capability. An organization that has built a library of well-designed MCP servers has built a reusable foundation for every AI agent they will ever deploy. The investment compounds. The alternative — bespoke integrations for every agent — scales linearly with the number of agents and tools, which is the wrong direction.
I started at the Grant Park sign in Chicago on a Tuesday morning in October. The weather was what Chicago weather always is in October — aggressively ambivalent. By the time I reached the Gemini Giant in Wilmington, the sky had decided on grey, and it stayed grey all the way through Illinois.
The Route 66 that tourists imagine — the neon signs, the diners, the open Southwest sky — does not begin until Missouri. The Illinois stretch is suburbs bleeding into farmland bleeding into more suburbs. But it is necessary. The road earns its mythology by making you wait for it.
The thing nobody tells you about driving Route 66 is that it is not one road. It is a palimpsest of roads, overlaid on each other across decades. The original alignment through a town is different from the bypass built in the 1940s and different again from the section that survived into the interstate era. Following the historic route means making decisions at every town about which layer of the palimpsest you want to be on.
I took every intact historic alignment I could find. This added two days and was completely correct. The intact sections are where the road still has personality — the two-lane through Galena, Kansas; the ribbon road in the Ozarks; the nine-mile stretch through the Mojave from Amboy to Ludlow where the asphalt shimmers and there is nothing in any direction except the fact of the desert being very large and very indifferent to your presence.
I arrived at the Santa Monica Pier on a Saturday afternoon. The Pacific was exactly as unimpressed as the Atlantic always is when you finally reach it. The road ends at a sign that says "END OF THE TRAIL." Somebody had taken a photo of themselves in front of it five minutes before me. Somebody took a photo of me. The road does not care. It has been doing this since 1926.
Every system I have ever worked on that became unmaintainable did so because someone — often someone talented — confused complexity with capability. The two are not correlated. The most powerful systems I have worked with are also the most understandable ones. This is not a coincidence.
Complexity has a specific failure mode: it concentrates knowledge in the person who built the system. A system that only its creator can understand has a single point of failure — its creator. When they leave, take a vacation, or simply forget, the system becomes a black box. Black boxes cannot be debugged, extended, or trusted.
Elegance is not an aesthetic preference. It is a functional requirement. The right abstraction does more work with less surface area. It handles more cases with fewer rules. It is easier to test, easier to explain, and easier to extend. When you find yourself adding complexity to solve a problem, the more productive question is usually: why does this problem exist? The answer often points to a design decision made earlier that can be revisited.
In AI systems specifically, complexity accumulates in a particular way. Prompt engineering layers build on each other. Tool descriptions multiply. Agent chains grow. Each addition makes sense in isolation. The aggregate becomes something nobody fully understands, which means nobody can reliably predict what it will do under novel inputs. Simplicity in AI systems is not a nice-to-have. It is a precondition for the kind of reliability that production use cases require.
The Fermi Paradox is simple to state: the universe is approximately 13.8 billion years old, contains somewhere between 200 billion and two trillion galaxies, each with hundreds of billions of stars, many of which have planets in the habitable zone — and yet we have found no evidence of any other intelligent life anywhere. The silence is total and, when you sit with it long enough, bewildering.
Enrico Fermi, having lunch with colleagues at Los Alamos in 1950, asked the question that bears his name: where is everybody? It remains unanswered seventy-five years later.
Every proposed explanation for the silence implies something extraordinary about our situation. The Great Filter hypothesis suggests that somewhere on the path from simple chemistry to spacefaring civilization, there is a step that almost nothing survives. If the filter is behind us — if it was the emergence of eukaryotic cells, or sexual reproduction, or language — then we may be the most complex thing in the observable universe. If it is ahead of us, then virtually every civilization that reaches our technological level is about to end.
I find the Fermi Paradox useful not as an astronomy problem but as a perspective tool. The cosmic silence is a fact that keeps the anxieties of daily life properly scaled. Not because nothing matters, but because the question of what matters enough to spend a finite life on becomes clearer when you hold it against the backdrop of thirteen billion years of universal silence. We are, as far as we can tell, the universe knowing itself. That is not nothing. In fact it might be everything.
The pattern is consistent across every AI project I have reviewed in the last three years: teams spend 80% of their time on model selection and prompt engineering and roughly 5% on evaluation. This is the wrong ratio by a large margin, and it explains why so many AI systems that look good in demos fail in production.
Without a robust evaluation framework, you are optimizing in the dark. Every prompt change, every model upgrade, every new tool integration might be an improvement or might be a regression — and you have no reliable way to know which. The teams I have seen move fastest on AI development are, without exception, the ones with the best evals. Not the ones with the most sophisticated prompts. Not the ones using the newest models. The ones who know, quantitatively, whether their changes are improvements.
Build evals before you build features. Start with the cases you know should work — the happy path, the common cases, the examples from your requirements documents. Add the edge cases as you discover them in testing and production. Run the eval suite on every significant change. Treat a regression as a critical bug, not a known limitation to be documented.
The other thing evaluations do that is rarely discussed: they force you to define what good looks like. This is harder than it sounds for AI systems, where the output is often natural language and the quality is genuinely subjective. The act of writing evaluation criteria — what counts as a correct response, what counts as a failure, what counts as acceptable variation — is itself a design exercise that clarifies requirements in ways that no amount of verbal specification can match. Write the eval first. The implementation gets easier.
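In practice this can start as a few dozen lines, not a platform purchase. A sketch of an eval harness that treats any drop against the last known-good score as a failure — the `grade` function, which encodes what counts as correct, is the hypothetical piece you must write yourself:

```python
def run_evals(system, cases, grade, baseline_score=None):
    """Run every eval case, compute the pass rate, and flag regressions.

    `cases` pairs inputs with expectations; `grade` encodes what "good"
    looks like. Writing `grade` is the design exercise described above.
    """
    failures = []
    for case in cases:
        output = system(case["input"])
        if not grade(output, case["expected"]):
            failures.append(case["id"])
    score = (len(cases) - len(failures)) / len(cases)
    # A score below the last known-good baseline is a critical bug,
    # not a known limitation to be documented.
    regressed = baseline_score is not None and score < baseline_score
    return {"score": score, "failures": failures, "regressed": regressed}
```

Wire it into CI so that `regressed` fails the build, and the eval suite stops being a report and becomes a gate.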
Lake Superior does not look like a lake. It looks like an ocean that has been placed, through some cartographic error, in the middle of the North American continent. Standing on the shore at Two Harbors, Minnesota, at 5:30 in the morning in September, you cannot see the Canadian shore. The water is the color of pewter and the horizon is indistinguishable from the sky. The scale is wrong for a lake.
The Ojibwe called it Gitchigumi — the Great Sea. This is more accurate than Lake Superior, which sounds like something that ranks bodies of water by quality. It is not superior in that sense. It is simply enormous: 31,700 square miles of fresh water, deep enough in places to swallow skyscrapers, cold enough to preserve the wooden hulls of ships that sank a century ago.
Split Rock Lighthouse sits on a cliff 130 feet above the water, built in 1910 after a November storm sank or stranded twenty-nine ships in thirty-six hours. Lake Superior storms are not weather events. They are geological events. The lake generates its own climate. It does not care about forecasts.
I paddled into the Boundary Waters from the Ely entry point on a Thursday and did not see another person for four days. The Boundary Waters Canoe Area Wilderness is one million acres of lakes connected by portage trails — you carry your canoe overland between bodies of water, which is the exact right amount of effort to filter out everyone who is not serious about being there. The loons call at night. The stars, away from any light pollution, are what stars were before electricity: overwhelming.
When most engineers hear "AI alignment," they think of a research problem that other people are working on — something involving hypothetical superintelligent systems and long-horizon risks that are not yet relevant to the software being shipped today. This is the wrong frame, and it has practical consequences for the systems being built right now.
Alignment is not a future problem. It is the question of whether the system does what you actually want, in all the situations it will encounter, in a way that serves the people it is supposed to serve. Stated that way, it is obvious that alignment is already your problem — and has been since you started building.
A prior authorization model that denies care incorrectly is an alignment failure. An NLP classifier that encodes the biases present in its training data is an alignment failure. A recommendation system that optimizes for engagement over user wellbeing is an alignment failure. A hiring algorithm that penalizes resume gaps — which correlate with caregiving — is an alignment failure. None of these require superintelligence. They require only that a system be deployed in a consequential context with imperfect specification.
The question is not whether to care about alignment. It is whether you are paying close enough attention to realize you already need to. The engineers and architects building production AI systems have more moral responsibility than the current industry incentive structures acknowledge. This is not comfortable. It is true. Act accordingly — not because a regulator requires it, but because the people on the other end of the system deserve it.