The demos look like magic, and I understand why people are excited.
But I’ve been building on the web long enough to recognize the components underneath the marketing, and they’re all things I’ve seen before.
- Crons
- Webhooks
- APIs
- Message queues
- A database
- A chat layer on top
The LLM in the middle is new and genuinely impressive, but the infrastructure carrying it is decades old.
Understanding what these systems actually are, mechanically, matters more than some people realize, because when your agent does something unexpected in production, you need to know whether the problem is the brain or the pipes.
The Anatomy of “Autonomous”
Strip away the branding and an AI agent is an LLM that’s been given access to external tools with some orchestration logic that determines when to wake it up and what context to hand it.
Here’s what actually happens when an agent “autonomously” handles a task:
- A trigger fires.
A cron job runs on a schedule, a webhook catches an event, a user sends a message, or a monitoring tool like Sentry flags an exception. This is the same event-driven architecture we’ve been building since the early days of web development.
- Context gets assembled.
The system gathers relevant information from databases, APIs, file systems, and previous conversation history, then packages it into a prompt. This is data fetching and serialization.
- The LLM reasons about what to do.
This is the genuinely new part. The model reads the context, decides which tools to call, in what order, and with what parameters. This decision-making layer used to be a tree of if/else statements or a rules engine. Now it’s a language model that can reason about ambiguous situations in ways that hardcoded logic never could.
- Tools get called.
The LLM’s decisions get executed through API calls, database queries, file operations, or whatever tools it has access to. This is just HTTP requests and function calls.
- Results get evaluated.
The LLM looks at what came back, decides whether the task is complete or whether it needs to make more tool calls, and either loops back to step 3 or returns a response. This is a while loop with a completion condition.
That’s the whole architecture. The intelligence in step 3 is absolutely remarkable, and I don’t want to undersell how different it feels when the decision-making layer can actually understand natural language, interpret stack traces, and reason about ambiguous situations.
But the scaffolding around it is familiar infrastructure that’s been running production systems for years.
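The five steps above can be sketched as a short loop. The `llm()` and `call_tool()` functions below are stubbed-out stand-ins for a real model API and a real tool layer (both are hypothetical); the loop structure is the point.

```python
def llm(context):
    # Stub: a real implementation would send `context` to a model API.
    # Here we pretend the model asks for one tool call, then finishes.
    if any(m["role"] == "tool" for m in context):
        return {"action": "done", "response": "task complete"}
    return {"action": "call", "tool": "lookup", "args": {"q": "error 500"}}

def call_tool(name, args):
    # Stub: a real implementation would be an HTTP request or DB query.
    return {"tool": name, "result": f"found logs for {args['q']}"}

def run_agent(trigger_event, max_steps=10):
    # Step 1 happened outside: a trigger fired and handed us an event.
    # Step 2: assemble context from the trigger and any stored state.
    context = [{"role": "user", "content": f"Event: {trigger_event}"}]

    for _ in range(max_steps):            # Step 5: a loop with a completion condition
        decision = llm(context)           # Step 3: the LLM decides what to do
        if decision["action"] == "done":
            return decision["response"]
        result = call_tool(decision["tool"], decision["args"])  # Step 4
        context.append({"role": "tool", "content": str(result)})

    raise RuntimeError("agent did not finish within max_steps")

print(run_agent("Sentry flagged an exception"))  # → task complete
```

Everything outside `llm()` is ordinary control flow that any backend developer has written before.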
A Deloitte analysis of enterprise agent deployments estimated the autonomous agent market at $8.5 billion in 2026 and projected it to reach $35 billion by 2030, but also noted that over 40% of current agentic AI projects could be cancelled due to unanticipated complexity in scaling.
The complexity they’re referring to isn’t the LLM reasoning.
It’s the plumbing around it, including the orchestration, the state management, the error handling, and the integration work that anyone who’s built distributed systems will immediately recognize.
The Protocols That Connect the Pipes
Two standardization efforts are reshaping how agents connect to the world around them, and both of them reinforce the plumbing metaphor.
MCP (Model Context Protocol) was introduced by Anthropic in late 2024 as an open standard for connecting AI models to external tools and data sources.
Before MCP, every integration required custom code. Connecting an agent to your Slack, your database, and your CRM meant building three separate adapters with their own authentication, error handling, and maintenance burden.
MCP replaces that with a single protocol that any agent can speak, and it’s been adopted by OpenAI, Google, and thousands of tool providers since launch.
The analogy everyone uses is USB-C for AI, and for once the marketing metaphor is actually decent.
MCP standardizes the connection between the brain (the LLM) and the hands (the tools). Before USB-C you needed a different cable for every device. Before MCP you needed custom integration code for every tool.
The pattern is incredibly similar.
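To make the metaphor concrete: "speaking MCP" means exchanging plain JSON-RPC 2.0 messages. The `tools/list` and `tools/call` method names come from the MCP spec; the tool name and its arguments below are made up for illustration.

```python
import json

# Ask an MCP server what tools it offers.
list_request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/list",
}

# Invoke one of those tools. "query_database" and its arguments are
# hypothetical -- the server defines what tools exist and what they take.
call_request = {
    "jsonrpc": "2.0",
    "id": 2,
    "method": "tools/call",
    "params": {
        "name": "query_database",
        "arguments": {"sql": "SELECT count(*) FROM users"},
    },
}

print(json.dumps(call_request, indent=2))
```

That is the whole trick: one message shape for every tool, instead of a bespoke adapter per integration.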
A2A (Agent-to-Agent Protocol) was launched by Google in April 2025 and donated to the Linux Foundation shortly after.
Where MCP handles agent-to-tool communication, A2A handles agent-to-agent communication.
It allows agents built on different frameworks by different vendors to discover each other’s capabilities, delegate tasks, and coordinate work without sharing their internal state or memory.
| Protocol | Introduced By | Purpose | Analogy |
|---|---|---|---|
| MCP | Anthropic (Nov 2024) | Agent-to-tool communication | USB-C for connecting devices |
| A2A | Google (Apr 2025) | Agent-to-agent communication | HTTP for web services |
Together, MCP and A2A form the standardized plumbing layer that makes multi-agent systems possible at scale.
An agent uses MCP to connect to its tools and A2A to collaborate with other agents. The protocols are built on existing web standards like HTTP, JSON-RPC, and Server-Sent Events, which means they plug into infrastructure that every web developer already understands.
This is the part that matters for builders.
The plumbing is getting standardized, which means the integration work that used to take weeks is shrinking to days or hours. Setting up an agent that can read your GitHub issues, query your database, and post to Slack is becoming a configuration problem rather than an engineering problem.
What’s Genuinely New
I want to be clear about what I think is actually remarkable here, because “it’s just plumbing” can sound dismissive and that’s not the point.
The LLM reasoning layer is legitimately new and powerful.
The ability for a model to look at a Sentry error, read the stack trace, understand the surrounding codebase context, and propose a coherent fix is something that didn’t exist three years ago.
The ability to chain multiple tool calls together, evaluate intermediate results, and adjust course based on what it finds represents a genuine leap in what software systems can do.
The reasoning capabilities are also progressing along what Deloitte describes as an “autonomy spectrum” that ranges from human-in-the-loop (the human approves every action) to human-on-the-loop (the human monitors but doesn’t approve each step) to human-out-of-the-loop (fully autonomous).
Most production deployments in 2026 are still firmly in the first two categories, and the most advanced organizations are just beginning to experiment with the third.
What’s new is the brain.
What isn’t new is everything that brain sits inside of. The event triggers, the API calls, the database queries, the state management, the error handling, the retry logic, the logging, the monitoring. All of that is infrastructure that’s been battle-tested for decades, and recognizing that should give you confidence rather than confusion.
The Trust Question (Again)
I wrote about the trust problem with autonomous agents a few weeks ago, and the plumbing perspective reinforces the concern.
When you see the architecture for what it is, the trust question becomes very specific. You’re asking whether you trust the LLM reasoning layer to make the right decision about which pipe to open, when, and with what parameters.
The plumbing itself will faithfully execute whatever the LLM tells it to do.
The webhook will fire, the API will get called, the code will get committed, the database will get modified.
The pipes don’t exercise judgment, they just carry water wherever the brain directs them.
The security researchers who’ve analyzed MCP have already flagged issues including prompt injection, overly broad tool permissions, and lookalike tools that can silently replace trusted ones.
These are plumbing vulnerabilities, weaknesses in the connections rather than in the reasoning, and they’re exactly the kind of thing that web developers have been hardening against for years in other contexts.
For production systems that matter, I still want a human reviewing the LLM’s decisions before the plumbing executes them.
The reasoning layer is impressive and getting better fast, but “impressive” and “trustworthy enough to run unsupervised on production infrastructure” remain two very different things.
What This Means for Builders
If you’re an indie builder or on a small team, the practical implications of all of this are surprisingly encouraging.
The plumbing is getting cheap and standardized.
MCP means connecting an LLM to your tools is becoming trivial. A2A means agents can collaborate across frameworks without custom integration work.
Orchestration frameworks like LangGraph, CrewAI, and AutoGen are maturing fast and abstracting away much of the complexity.
This means the ability to wire up an agent that monitors your error logs, drafts responses to support tickets, or generates weekly reports from your analytics data is within reach for a solo developer with a weekend and some API keys. You don’t need a platform team or an enterprise orchestration layer.
You need a cron job, an LLM API call, a few MCP connections, and a basic understanding of how webhooks work.
But the builder who understands the plumbing has a real advantage over the one who treats agents as magic.
Because when the agent does something unexpected (and it will), you need to know whether the problem is in the reasoning layer (the LLM made a bad decision), the integration layer (an API returned unexpected data), or the orchestration layer (the trigger fired at the wrong time or with the wrong context).
Treating the whole system as a black box makes debugging impossible and makes trust impossible, and those two things are related.
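Knowing which layer failed changes what you do next. A sketch of that triage, assuming you tag failures by layer as they surface (the exception classes are hypothetical):

```python
class ReasoningError(Exception): pass      # the LLM made a bad decision
class IntegrationError(Exception): pass    # an API returned unexpected data
class OrchestrationError(Exception): pass  # wrong trigger, wrong context

def triage(exc):
    # Map the failing layer to a response. Each branch calls for a
    # different fix, which is exactly why the black-box view fails.
    if isinstance(exc, ReasoningError):
        return "adjust the prompt or add a human review step"
    if isinstance(exc, IntegrationError):
        return "add validation and retries around the tool call"
    if isinstance(exc, OrchestrationError):
        return "fix the trigger schedule or context assembly"
    return "unknown layer: log everything and reproduce"

print(triage(IntegrationError("GitHub API returned HTML, not JSON")))
```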
The Plumbing Was Always There
Demystifying the architecture doesn’t diminish what agents can do.
The LLM reasoning layer is genuinely remarkable, and the pace at which it’s improving is hard to overstate.
But wrapping it in language that makes the whole system sound like science fiction does builders a disservice.
The people who will build the best agent systems over the next few years are the ones who understand that they’re building a very clever decision-making layer on top of infrastructure that’s been running the internet for decades.
The plumbing was always there.
We just got a much smarter brain to sit on top of it, and knowing where the brain ends and the pipes begin is the difference between building something you can trust and building something that just looks impressive in a demo.