OpenAI Just Removed MCP From Its Biggest Agent Spec: Here's What That Actually Means

OpenAI's Symphony spec hit 15,000 GitHub stars in three weeks and explicitly removed MCP as a dependency. A week after we argued MCP was the new SaaS launch motion, the most-starred agent spec of the month skipped it on purpose. That's not a contradiction. It's a clarification.

Outer loop, inner loop. Orchestration meets capability. Image created using vanikya.ai.

A week ago we argued that MCP had quietly become the default SaaS launch motion. Anthropic shipped nine creative connectors, four unrelated SaaS vendors shipped MCP servers in the same window, and the protocol question seemed settled.

Then on April 27, OpenAI published Symphony.

It's an open-source spec that turns Linear into a control plane for Codex agents. It hit 15,000 GitHub stars in three weeks. Internal teams report a 500 percent increase in landed pull requests. And buried in the announcement is one quietly explosive sentence: "we removed a lot of incidental complexity, like dependencies on specific repositories or Linear MCP."

A week after MCP became the SaaS launch motion, the most-starred agent spec of the month skipped it on purpose.

That's not a contradiction. It's a clarification, and the cleanest one we've gotten yet on where MCP actually fits in the agent stack. Here's what Symphony is, why OpenAI removed MCP, and what builders should actually do with the answer.

What Symphony Actually Is

Most coverage has framed Symphony as "an open-source coding agent." That's not quite right. Symphony is an open-source specification for orchestrating coding agents, with a reference implementation in Elixir. The spec is the product. The Elixir code is the demo.

The architecture is simple enough to fit in one paragraph. Symphony polls your Linear board on a fixed cadence. For every open ticket, it spawns a dedicated workspace, starts a Codex session inside that workspace, and lets the agent run continuously until it produces a pull request. Tickets act as a state machine. If an agent crashes, Symphony restarts it. If a new ticket appears, Symphony picks it up. If an agent finishes, the PR moves to a Human Review state defined in your team's WORKFLOW.md file.
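As a rough sketch of that ticket state machine (the names `Ticket`, `Orchestrator`, and `poll_once` are hypothetical, not Symphony's actual API), the outer loop reduces to: poll the board, advance each ticket's state, repeat.

```python
from dataclasses import dataclass
from enum import Enum

class TicketState(Enum):
    TODO = "todo"
    IN_PROGRESS = "in_progress"     # workspace spawned, agent session running
    HUMAN_REVIEW = "human_review"   # agent produced a PR

@dataclass
class Ticket:
    id: str
    state: TicketState = TicketState.TODO

class Orchestrator:
    """Toy outer loop: poll the board, advance each ticket's state."""
    def __init__(self, tickets):
        self.tickets = {t.id: t for t in tickets}

    def poll_once(self):
        for t in self.tickets.values():
            if t.state is TicketState.TODO:
                # In the real system: spawn a workspace, start an agent session.
                t.state = TicketState.IN_PROGRESS
            elif t.state is TicketState.IN_PROGRESS:
                # In the real system: detect a finished PR, or restart a crashed agent.
                t.state = TicketState.HUMAN_REVIEW

board = Orchestrator([Ticket("ENG-1"), Ticket("ENG-2")])
board.poll_once()  # every open ticket gets a workspace and a running session
board.poll_once()  # finished sessions hand their PRs to human review
```

The point of the sketch is the shape: state lives in the tickets, not in the sessions, which is why crashes and restarts are cheap.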

The mental shift is the actual product. Most existing agent tools (LangGraph, CrewAI, AutoGen, even Claude Code and Cursor) manage sessions. You open a tab, prompt the agent, watch it work, prompt again. Symphony manages work. Tickets in, PRs out. The sessions are an implementation detail you stop seeing.

The numbers OpenAI reports are loud: a 500 percent increase in landed PRs on some teams in the first three weeks of internal use. One engineer reportedly shipped three significant code changes from his Linear mobile app while sitting in a cabin with weak Wi-Fi. The contrast point that matters commercially: Cognition's Devin charges $500 per seat per month plus usage fees for roughly the same workflow. Symphony is Apache 2.0.

The Decision That Made Everyone Look Twice

Symphony's first internal version did use MCP. Specifically, it used Linear MCP, Linear's own MCP server, which is exactly the kind of integration we spent last week's post celebrating.

OpenAI removed it.

The replacement architecture is worth understanding because it's a real engineering argument, not a religious one. Symphony runs on top of OpenAI's Codex App Server, a headless mode for Codex that exposes a JSON-RPC API. To give agents access to Linear without exposing the Linear access token to subagent containers, OpenAI used dynamic tool calls to expose a raw linear_graphql function that executes arbitrary GraphQL requests against Linear's API.

In plain English: instead of running an MCP server (which would mean every spawned subagent container has access to the access token, or you're proxying every call through an additional layer), Symphony's outer process holds the credential and exposes a single function the agent can call. The agent gets full Linear API surface. The token never leaves the orchestrator's process. The subagents stay sandboxed.
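A minimal sketch of that pattern, with illustrative names (this is not Symphony's actual code): the orchestrator closes over the token and hands the subagent a single callable. The `transport` parameter stands in for the HTTP POST to Linear's GraphQL endpoint so the sketch runs offline.

```python
from typing import Callable, Optional

def make_linear_tool(access_token: str, transport: Callable) -> Callable:
    """The orchestrator closes over the credential; the returned function is
    the only surface handed to the subagent. Nothing serialized into the
    sandbox ever contains the token."""
    def linear_graphql(query: str, variables: Optional[dict] = None) -> dict:
        payload = {"query": query, "variables": variables or {}}
        headers = {
            "Content-Type": "application/json",
            "Authorization": access_token,  # attached on the orchestrator's side
        }
        return transport(payload, headers)  # e.g. POST to Linear's GraphQL API
    return linear_graphql

# Stand-in transport so the sketch runs without network access.
def fake_transport(payload, headers):
    return {"authorized": "Authorization" in headers, "query": payload["query"]}

tool = make_linear_tool("lin_api_SECRET", fake_transport)
result = tool("query { viewer { id } }")
```

The agent sees `linear_graphql` and the full GraphQL surface; it never sees the closure's captured token, which is the whole isolation argument.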

This is a security-architecture decision, not an anti-MCP statement. OpenAI's blog post is explicit about the tradeoff: dynamic tool calls give them tighter credential isolation than running Linear MCP inside the agent container. For a system designed to spawn dozens of parallel agent processes, that isolation matters more than MCP's portability benefits.

The signal worth catching is the reasoning. MCP is great when you want a tool surface portable across LLM clients. It's the wrong choice when your priority is credential isolation in a high-fanout multi-agent system. That's a useful distinction, and it's the first time a major lab has articulated it publicly.

The Outer Loop and the Inner Loop

The mental model that resolves the apparent contradiction with last week's MCP post showed up first as a Hacker News comment, then got picked up by analysts: Symphony is the outer loop. MCP is the inner loop.

The outer loop is the orchestration layer. It answers questions like: which ticket am I working on, what's the workspace, when do I retry, when do I hand off to a human, how do I monitor CI? Symphony lives here. So do alternatives like Composio's Agent Orchestrator, T3 Code, Cmux, and Cognition's Devin. They're all making different bets about what the right outer loop looks like.

The inner loop is what happens inside the workspace once the agent is running. The agent needs to read code, run tests, generate images, query a database, edit a Figma file, post to Slack. These are tool calls. MCP is the protocol that standardizes those tool calls across LLM clients. MCP lives here.
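Schematically (this is a toy dispatch to show the shape MCP standardizes, not MCP's actual wire protocol), an inner-loop tool call is just a named invocation routed to a registered handler:

```python
import json

class ToolServer:
    """Toy inner-loop tool server: register named handlers, dispatch calls.
    Mimics the shape MCP standardizes; not the real MCP wire format."""
    def __init__(self):
        self.tools = {}

    def tool(self, name):
        def register(fn):
            self.tools[name] = fn
            return fn
        return register

    def handle(self, request_json):
        req = json.loads(request_json)
        handler = self.tools[req["tool"]]
        return json.dumps({"result": handler(**req["arguments"])})

server = ToolServer()

@server.tool("generate_image")   # hypothetical creative tool
def generate_image(prompt):
    return f"image://{prompt}"

response = server.handle('{"tool": "generate_image", "arguments": {"prompt": "hero"}}')
```

What MCP adds over this toy is the standardized part: discovery, schemas, and transport that work identically across LLM clients.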

Once you see the split, both stories from the past two weeks make sense. The SaaS MCP wave (Adobe, Blender, Affinity, Trimble, Comply, Demandbase) is vendors racing to be inner-loop tools that any agent can call. Symphony is OpenAI shipping an opinionated outer loop. They're not competing. They're stacking.

The interesting question for any team building in this space is which loop you're targeting and whether you understand the tradeoffs at each layer.

Software as a Spec

The deeper signal in the Symphony launch isn't the orchestrator. It's how OpenAI built it.

The Elixir reference implementation was generated by Codex in one shot from the SPEC.md file. To stress-test the spec, OpenAI asked Codex to implement Symphony in TypeScript, Go, Rust, Java, and Python, and used the differences between the implementations to find ambiguities and tighten the spec. The languages were tools for refining the specification. The specification was the artifact.

OpenAI engineer Zach Brock framed it this way on X: "Instead of code, Symphony is first a Spec.md that you can materialize into any programming language you want by passing it to your coding agent of choice." He called it "software as a spec" and described it as "an early demonstration of a new way I expect open-source software to be developed and shared in the future."

If that pattern holds, it's a bigger long-term shift than Symphony itself. Open-source today means publishing code. Open-source as Brock describes it means publishing the spec, with code as an artifact your tooling can regenerate. The implications for forks, security audits, and language portability are significant, and they line up with the broader move toward agent-friendly repositories that OpenAI laid out in their earlier "harness engineering" post.

The choice of Elixir is itself a signal worth pausing on. OpenAI picked it specifically for BEAM's process supervision and fault tolerance. Elixir is a niche language most teams would never reach for. Their stated reasoning: "when code is effectively free, you can finally pick languages for their strengths." That's a more philosophical statement than the Symphony launch itself. When generation is cheap, language choice stops being constrained by team familiarity and starts being constrained by runtime characteristics. That's a different world for engineering hiring, codebase composition, and long-term maintenance economics.

The 500 Percent Number Deserves Skepticism

OpenAI reports that some internal teams saw a 500 percent increase in landed pull requests in the first three weeks of using Symphony. That number is doing a lot of work in the coverage. It deserves scrutiny.

Sanchit Vir Gogia, chief analyst and CEO at Greyhound Research, put the caution well in InfoWorld: "Generation scales effortlessly, validation does not. As output volume rises, the burden of review, testing, and governance rises with it." A 5x increase in landed PRs is only a productivity win if review quality, defect rates, and downstream rework hold steady. Without baseline data, the 500 percent figure is directional, not definitive.

The list of things builders should track is longer than landed PRs: peer-review friction, downstream rework, escaped defects, post-deployment incidents, recovery time, and the impact on junior engineers learning to code in an environment where most code arrives pre-written. None of these show up in PR counts.

Forrester analyst Biswajeet Mahapatra raised the related concern of governance: "Enterprises struggle with enforcing consistent security policies, auditability, and risk controls across distributed agents, especially when orchestration is decoupled from existing SDLC and identity systems." If Symphony or a Symphony-like system becomes your default work pipeline, your audit story has to evolve to match.

None of this disqualifies Symphony. It does mean the headline number is a starting point for a measurement strategy, not a finished argument.

The Competitive Landscape Got Crowded Fast

Symphony arrived in a category that already had real competitors. The most useful framing comes from a comparison Composio published:

OpenAI Symphony. Official support for Linear and Codex. Elixir reference implementation. Per-state concurrency limits. WORKFLOW.md prompt versioning in your repo. Review rework is destructive (full reset).

Composio Agent Orchestrator (AO). Works with GitHub, GitLab, or Linear issue trackers. Spawns an agent in an isolated worktree, opens a PR, auto-fixes CI failures, routes review comments back. Node-based stack.

T3 Code (Theo Browne). Desktop app. Per-edit approval gates. Wraps Codex, with a Claude Code adapter in progress. Designed for focused one-on-one work.

Cmux (Manaflow). Native macOS terminal for AI agents. Split panes, scriptable browser. Not really an orchestrator, more the place where you run them.

Cognition Devin. The category pioneer. Closed source, $500 per seat per month plus usage. Symphony is the open-source pressure on Devin's pricing.

The community has already moved Symphony beyond OpenAI-only. The v1.1.0 release added Kata CLI support, opening the door to running Claude Code, Gemini, and other models inside the Symphony orchestration framework. The spec itself is model-agnostic. The Elixir reference happens to use Codex App Server, but as the spec hardens and other agent runtimes adopt similar app-server patterns, Symphony is on track to become a model-agnostic outer loop standard rather than an OpenAI walled garden.

This matters because it means you can adopt Symphony's pattern without committing to Codex. Or you can take the SPEC.md, point your favorite coding agent at it, and have a tailored implementation generated for your stack. That's the "software as a spec" thesis in action.

What This Means If You're Building MCP Tools

Here's where the two-week story comes together for builders who care about MCP.

If Symphony-like outer loops become the default pattern, your MCP server isn't competing with the orchestrator. It's running inside every workspace the orchestrator spawns. Every Linear ticket Symphony picks up generates a workspace where Codex (or Claude Code, or Gemini) needs tools. Your MCP server is one of those tools.

That's a more valuable position than competing as a destination. It's also a more demanding one. An MCP tool that gets called by 50 parallel agent workspaces a day is a different operational profile than one that gets called by humans clicking buttons. You need observability, rate-limiting, idempotency, and authorization models that survive being invoked by agents that don't read warning labels.
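A hedged sketch of what that hardening can look like (all names here are hypothetical, not any real MCP SDK): wrap each tool handler with a sliding-window rate limit plus idempotency on a caller-supplied key, so a retried agent call replays the cached result instead of re-running a side effect.

```python
import time
from functools import wraps

def harden(max_calls, window_s):
    """Hypothetical hardening decorator for a tool handler: sliding-window
    rate limit plus idempotency keyed on a caller-supplied request key."""
    def decorator(fn):
        calls = []   # timestamps of calls that actually executed
        cache = {}   # idempotency_key -> cached result
        @wraps(fn)
        def wrapper(idempotency_key, **params):
            if idempotency_key in cache:
                # Replayed agent call: return the cached result, no side effect.
                return cache[idempotency_key]
            now = time.monotonic()
            calls[:] = [t for t in calls if now - t < window_s]
            if len(calls) >= max_calls:
                raise RuntimeError("rate limit exceeded")
            calls.append(now)
            result = fn(**params)
            cache[idempotency_key] = result
            return result
        return wrapper
    return decorator

@harden(max_calls=2, window_s=60.0)
def generate_image(prompt):   # stand-in for a real tool handler
    return f"asset-for:{prompt}"
```

In production you'd key rate limits per tenant and back the idempotency cache with durable storage, but the contract is the same: agents retry blindly, so the tool, not the caller, has to make retries safe.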

For Vanikya, this is the bet we've been making. Our MCP server at vanikya.ai/mcp brings creative generation (image, vector and SVG, Lottie animations, and video) directly into Claude as native tools. In a Symphony world, that capability becomes part of every agent workspace that needs to generate creative assets for a ticket. A marketing automation ticket needs a hero image. A product ticket needs a UI mockup. A documentation ticket needs an animated diagram. The agent calls Vanikya. The agent ships the PR. The human reviews.

Symphony solves the orchestration problem. MCP solves the capability problem. Creative MCP solves the "the agent needs visual assets" problem. The stack is starting to look complete.

What Founders Should Do on Monday

Six concrete moves, in order of urgency.

One. If you ship developer tools, evaluate Symphony or one of its alternatives (AO, T3 Code, Cmux, Devin) this week. Pick the one closest to your team's stack and run a 30-day pilot on a real backlog. The 500 percent number is suspect, but the workflow shift is real and the option value of waiting is shrinking.

Two. Audit your codebase against OpenAI's "harness engineering" prerequisites. Symphony only works in repos that have invested in automated tests, agent-friendly structure, and guardrails. If your test coverage and CI quality aren't there, Symphony's PR throughput will be misleading at best and dangerous at worst.

Three. If you ship an MCP server, decide whether your authorization model survives high-fanout agent traffic. Vanikya, Affinity, Trimble, and Comply all chose hosted servers with explicit credential isolation. The Symphony team chose dynamic tool calls over MCP partly because of credential exposure concerns in container fanout. Make sure your auth story is built for the agent traffic pattern, not the human one.

Four. Define the outer loop and inner loop split for your product explicitly. Are you building an orchestrator, a tool, or both? The teams that articulate this clearly will move faster than the ones treating "agent strategy" as one undifferentiated concept.

Five. Track the right metrics. Landed-PR count is a vanity metric. Defect rate, review friction, and time-to-recovery are the real ones. If you can't measure those today, build the dashboard before you adopt the orchestrator.

Six. Watch the "software as a spec" pattern. If OpenAI is right that this is how open-source evolves, your team's specs (API specs, design specs, workflow specs) become as valuable as your code. Start treating them that way now.

The Frame

Two weeks ago the question was whether MCP would matter. That question closed.

This week the question is sharper: where in the agent stack does each piece go? Symphony gave us the cleanest answer yet by removing MCP from the layer where it didn't fit and articulating why. Outer loop and inner loop. Orchestration and capability. Work and tools.

If you're building agents in 2026, that's the map. The teams that internalize the split will compose the right stack. The teams that flatten it will keep shipping confused architectures.

The protocol war is over. The stack war is just starting.


Vanikya's creative MCP brings image, vector and SVG, Lottie, and video generation into Claude as native tools: the inner-loop capability that orchestrators like Symphony will need every workspace to have. Connect at vanikya.ai/mcp.