
Anthropic Turns MCP Agents Into Code First Systems With ‘Code Execution With MCP’ Approach


Agents utilizing the Model Context Protocol (MCP) face significant scalability challenges. Since every tool definition and intermediate output must be processed through the model’s context window, extensive workflows quickly consume large numbers of tokens, leading to increased latency and escalating costs. Anthropic introduces an innovative solution that restructures this workflow by transforming MCP tools into code-level APIs, enabling the model to generate and execute code rather than invoking tools directly.

Challenges of Direct MCP Tool Integration

MCP is an open protocol designed to enable AI systems to interface seamlessly with external services via MCP servers that expose various tools. These tools allow models to query databases, interact with APIs, or manipulate files through a standardized interface.

Traditionally, agents load multiple tool definitions into the model’s context, each containing detailed schema and metadata. Additionally, intermediate results from tool invocations are fed back into the context, allowing the model to determine subsequent actions.
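As a sketch of what "detailed schema and metadata" means in practice, a single tool definition of the kind loaded into context might look like the following. The field names follow MCP's tool-definition shape; the specific tool and the rough four-characters-per-token estimate are illustrative assumptions:

```typescript
// A hypothetical MCP tool definition of the kind an agent loads into the
// model's context. Every tool on every connected server contributes a block
// like this, and every invocation's result is fed back in as well.
const getDocumentTool = {
  name: "getDocument",
  description: "Fetch the full contents of a document from Google Drive.",
  inputSchema: {
    type: "object",
    properties: {
      documentId: { type: "string", description: "Drive document ID" },
    },
    required: ["documentId"],
  },
};

// A rough sense of the context cost of just one definition: serialize it
// and estimate tokens at ~4 characters per token.
const approxTokens = Math.ceil(JSON.stringify(getDocumentTool).length / 4);
console.log(`~${approxTokens} tokens for a single tool definition`);
```

Multiply that by dozens of tools across several servers, before any work has been done, and the context pressure described above follows directly.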

For instance, consider an agent that retrieves a lengthy sales meeting transcript from a Google Drive MCP server and then updates a Salesforce record with that transcript via another MCP server. The entire transcript is passed through the model twice: once when fetched and again when sent to Salesforce. The result is tens of thousands of redundant tokens that do not affect the task's logic.

As the number of MCP servers and tools grows, this approach becomes unsustainable. The model must process extensive tool catalogs and shuttle large data payloads between tools, causing latency to spike, costs to soar, and context window limits to become a bottleneck.

Reimagining MCP Servers as Code APIs

Anthropic proposes embedding MCP within a code execution loop. Instead of direct tool calls, the MCP client exposes each server as a collection of code modules within a virtual filesystem. The model then writes TypeScript scripts that import and orchestrate these modules, executing the code in a secure, sandboxed environment.

This approach follows three key steps:

  1. The MCP client generates a directory structure (e.g., servers) that mirrors the available MCP servers and their tools.
  2. For each MCP tool, a lightweight wrapper function is created as a source file (such as servers/google-drive/getDocument.ts) that internally invokes the MCP tool with strongly typed parameters.
  3. The model is tasked with authoring TypeScript code that imports these wrappers, manages control flow, and handles data processing within the execution environment.
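Step 2 above can be sketched as follows. The `callMCPTool` helper is an assumption standing in for whatever bridge the MCP client exposes from the sandbox back to its servers; it is stubbed here with a canned response so the sketch runs standalone:

```typescript
// Hypothetical bridge from sandboxed code back to the MCP client. A real
// harness would forward the call over the MCP transport; this stub returns
// canned data so the sketch is self-contained.
async function callMCPTool<T>(tool: string, input: object): Promise<T> {
  if (tool === "google_drive__get_document") {
    return { content: "…full transcript text…" } as T;
  }
  throw new Error(`unknown tool: ${tool}`);
}

// servers/google-drive/getDocument.ts — the thin, strongly typed wrapper
// the model imports instead of holding the tool's schema in context.
interface GetDocumentInput {
  documentId: string;
}
interface GetDocumentResponse {
  content: string;
}

export async function getDocument(
  input: GetDocumentInput
): Promise<GetDocumentResponse> {
  return callMCPTool<GetDocumentResponse>("google_drive__get_document", input);
}
```

Because the wrapper is an ordinary typed function, the model only needs to read its signature, not the full tool schema, to call it correctly.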

Returning to the earlier example, the Google Drive and Salesforce workflow becomes a concise script. The script fetches the transcript once via the Google Drive wrapper, processes or inspects the data locally, and then calls the Salesforce wrapper. Crucially, the full transcript never passes through the model's context; only summaries or status updates do.
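A minimal sketch of such a script, with hypothetical `getDocument` and `updateRecord` wrappers stubbed in place of the generated modules so it runs on its own:

```typescript
// Stand-ins for the generated wrapper modules
// (servers/google-drive/getDocument.ts, servers/salesforce/updateRecord.ts),
// stubbed with canned data for this self-contained sketch.
async function getDocument(input: { documentId: string }) {
  return { content: "…tens of thousands of tokens of transcript…" };
}
async function updateRecord(input: {
  objectType: string;
  recordId: string;
  data: { Notes: string };
}) {
  return { success: true };
}

async function attachTranscript(docId: string, recordId: string) {
  // The full transcript stays inside the execution environment…
  const transcript = (await getDocument({ documentId: docId })).content;
  const result = await updateRecord({
    objectType: "SalesMeeting",
    recordId,
    data: { Notes: transcript },
  });
  // …and only a short status line is surfaced to the model.
  return result.success
    ? `Transcript (${transcript.length} chars) attached to ${recordId}`
    : "Update failed";
}
```

The return value of `attachTranscript` is all the model ever sees: a one-line status instead of two full copies of the transcript.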

Cloudflare’s ‘Code Mode’ on its Workers platform employs a similar strategy, converting MCP tools into TypeScript APIs and running model-generated code inside isolated environments with limited bindings.

Remarkable Reduction in Token Consumption

Anthropic shares a compelling case study: a workflow that originally consumed approximately 150,000 tokens when passing tools and intermediate data directly through the model was reengineered using code execution and filesystem-based MCP APIs. This redesign cut token usage to around 2,000 tokens, a 98.7% decrease, yielding significant cost savings and faster response times.

Advantages for Developers Crafting Intelligent Agents

Integrating code execution with MCP offers multiple practical benefits for engineers building AI agents:

  • Incremental Tool Discovery: Agents no longer require all tool definitions upfront. They can dynamically explore the generated filesystem, enumerate available servers, and load specific tool modules on demand. This shifts the burden of tool catalogs from the model’s context to code, conserving tokens by focusing only on relevant interfaces.
  • Efficient Data Management: Large datasets remain within the execution environment. For example, TypeScript code can retrieve a massive spreadsheet via an MCP tool, filter rows, compute aggregates, and return only concise summaries or samples to the model. This approach ensures the model processes a distilled view of the data while heavy computations occur externally.
  • Enhanced Privacy Controls: Sensitive information such as emails or phone numbers can be tokenized within the execution environment. The model interacts with placeholders, while the MCP client securely maps and restores actual values when invoking downstream tools. This method enables data exchange between MCP servers without exposing raw identifiers to the model.
  • Persistent State and Modular Skills: The filesystem allows agents to save intermediate files and reusable scripts. For instance, a helper script that converts a spreadsheet into a report can be stored in a skills directory and imported in future sessions. This concept aligns with Anthropic’s vision of “Claude Skills,” where collections of scripts and metadata define advanced capabilities.
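The data-management point above can be made concrete. In this sketch, a hypothetical `getSheet` wrapper (stubbed with generated rows) pulls a large spreadsheet into the sandbox, and the script reduces it to a few aggregates before anything reaches the model:

```typescript
interface OrderRow {
  region: string;
  amount: number;
}

// Hypothetical wrapper for a spreadsheet MCP tool, stubbed with generated
// data so the sketch is self-contained.
async function getSheet(sheetId: string): Promise<OrderRow[]> {
  const regions = ["EMEA", "APAC", "AMER"];
  return Array.from({ length: 10_000 }, (_, i) => ({
    region: regions[i % regions.length],
    amount: (i % 500) + 1,
  }));
}

// All 10,000 rows stay inside the execution environment; the model only
// ever sees this handful of numbers.
async function summarizeOrders(sheetId: string) {
  const rows = await getSheet(sheetId);
  const totals = new Map<string, number>();
  for (const row of rows) {
    totals.set(row.region, (totals.get(row.region) ?? 0) + row.amount);
  }
  return {
    rowCount: rows.length,
    totalsByRegion: Object.fromEntries(totals),
  };
}
```

The same shape applies to the other bullets: the privacy pattern swaps real identifiers for placeholders inside functions like these, and persistent skills are simply such helpers saved to the filesystem for reuse.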

Conclusion: A Paradigm Shift in MCP Agent Design

Anthropic’s ‘code execution with MCP’ methodology represents a significant evolution for MCP-powered agents. By addressing the token overhead of loading tool definitions and routing bulky intermediate data through the model context, this approach transforms MCP servers into executable API surfaces. Offloading processing to a sandboxed TypeScript runtime not only boosts efficiency but also necessitates rigorous attention to code execution security. This innovation paves the way for more scalable, cost-effective, and privacy-conscious AI agents.
