MCP in Production: Power, Peril, and 138 CVEs
MCP is reshaping multi-agent architecture fast. But OpenClaw's 138 CVEs prove the stack is cracking. Where does your real exposure live?
Summary
The agentic AI stack is maturing fast and cracking under pressure simultaneously. MCP is becoming the connective tissue of production agent systems while OpenClaw's 138 CVEs expose what happens when you deploy that stack without treating security as a first-class concern. This piece maps the technical terrain from medical dosimetry to telecom to hardware infrastructure, and tells you where the real exposure lives.
The MCP Moment Is Real, and So Is the Attack Surface
Model Context Protocol (MCP) has quietly become the default plumbing for multi-agent systems. Not long ago it was only a spec. Today it is the integration layer inside DosimeTron, a production agent system that automates Monte Carlo radiation dosimetry on PET/CT scans, using GPT-5.2 as its reasoning engine across 23 tools distributed over four MCP servers. It is also the integration layer inside Gemini Enterprise deployments scaling past the prototype stage. And it is, apparently, the integration layer inside OpenClaw, which has accumulated 138 known CVEs, including critical exploits that allow full system control.
That last fact deserves to sit in your working memory before you read anything else here.
MCP Is a Trust Boundary, Not Just a Protocol
The architectural appeal of MCP is real. It gives you a standardized way to expose tools to a reasoning model, manage context windows across server boundaries, and compose complex capability graphs without writing custom glue code for every integration. DosimeTron's architecture demonstrates this concretely: DICOM metadata extraction, image preprocessing, and dosimetric reporting are handled as discrete tool calls across a structured server topology, not as monolithic pipeline code. Its organ-level dose estimates achieve Pearson's r between 0.965 and 1.000, with mean absolute percentage difference below 5% for 19 of 22 organs. These are peer-reviewed numbers on a public PSMA-PET/CT dataset of 597 studies. When MCP-based architecture works, it works.
But MCP servers are also trust boundaries. Every tool endpoint is a potential injection surface. Every server-to-server handoff is a place where a compromised context payload can propagate laterally. The OpenClaw vulnerability count is not a coincidence. It is what happens when teams treat MCP as a convenience layer and skip the access control, input validation, and sandboxing work that the protocol does not provide for you.
What OpenClaw Tells Us About the Entire Agentic Stack
138 CVEs on a single agentic platform is not a patch management problem. It is a signal about how the category was built. The agentic AI ecosystem grew faster than its security posture. Frameworks optimized for developer velocity and demo-ability, not for the threat model that emerges when an LLM has tool access to filesystems, APIs, and internal services.
The Attack Model Is Different From Traditional Software
In a traditional application, the attack surface is bounded by the inputs your code explicitly handles. In an agentic system with MCP, the attack surface includes the model's interpretation of natural language tool descriptions, the content of documents retrieved into context, the outputs of upstream tool calls that become inputs to downstream ones, and the reasoning chain that determines what gets executed next.
Prompt injection through retrieved content is the canonical example: an attacker embeds instructions in a document that the agent retrieves and processes, causing the model to treat adversarial text as legitimate directives. This is not a hypothetical. It has been demonstrated against multiple deployed systems. In a multi-agent architecture with MCP, a successful injection at one server can propagate instructions to other servers in the graph because the contaminated reasoning output becomes a trusted input to the next tool call.
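The propagation problem described above can be made concrete with a small taint-tracking sketch. This is an illustration under assumptions, not part of MCP or of any named system: `Labeled`, `from_retrieval`, `from_operator`, `derive`, and `call_privileged_tool` are hypothetical names. The idea is that anything derived from retrieved content stays marked as untrusted, so it cannot silently become a trusted input to the next tool call.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Labeled:
    """A value plus a flag recording whether untrusted content influenced it."""
    value: str
    tainted: bool


def from_retrieval(text: str) -> Labeled:
    # Anything fetched from documents, web pages, or email starts out tainted.
    return Labeled(text, tainted=True)


def from_operator(text: str) -> Labeled:
    # Text supplied by an authenticated operator starts out clean.
    return Labeled(text, tainted=False)


def derive(result: str, *inputs: Labeled) -> Labeled:
    # A derived value (for example a model summary) is tainted if any input was.
    return Labeled(result, tainted=any(i.tainted for i in inputs))


class TaintedInputError(PermissionError):
    """Raised when a privileged tool receives an argument of untrusted origin."""


def call_privileged_tool(name: str, argument: Labeled) -> str:
    # Hypothetical dispatcher: privileged tools refuse tainted arguments.
    if argument.tainted:
        raise TaintedInputError(f"{name}: argument derived from untrusted content")
    return f"{name} executed with {argument.value!r}"
```

A summary built with `derive("...", from_retrieval(doc), from_operator(task))` carries `tainted=True`, so passing it to `call_privileged_tool` raises instead of executing. Real systems would need finer-grained policy than a single boolean, but the label has to travel with the data for any policy to work.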
Your Attack Surface Is Now the Model's Mind
Because the model's interpretation of its context is part of the attack surface, hardening an MCP-based system requires more than patching OpenClaw CVEs. It requires four things:
Input Validation at Every Tool Boundary
Treat every tool input as untrusted, even when it comes from another agent in your own system. Validate schema and reject payloads that contain instruction-like natural language in structured fields.
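A minimal sketch of that validation step, assuming a hypothetical tool that takes a `study_id`, a `series_index`, and a free-text `label`. The schema, the field names, and the regular expressions are all illustrative assumptions, and the instruction-detection patterns are a deliberately incomplete heuristic, not a reliable injection filter.

```python
import re

# Hypothetical schema for an illustrative tool; field names are assumptions.
SCHEMA = {"study_id": str, "series_index": int, "label": str}
STUDY_ID_RE = re.compile(r"[A-Za-z0-9._-]{1,64}")
MAX_LABEL_LEN = 120

# Heuristic, deliberately incomplete patterns for instruction-like text.
INSTRUCTION_PATTERNS = [
    re.compile(r"\b(ignore|disregard|forget)\b.{0,40}\b(instructions?|rules|prompt)\b", re.I),
    re.compile(r"\byou (are|must|should) now\b", re.I),
    re.compile(r"\bsystem prompt\b", re.I),
]


class ToolInputError(ValueError):
    """Raised when a tool payload fails validation."""


def validate_tool_input(payload: dict) -> dict:
    # Reject unknown or missing fields instead of silently ignoring them.
    if set(payload) != set(SCHEMA):
        diff = sorted(map(str, set(payload) ^ set(SCHEMA)))
        raise ToolInputError(f"unexpected or missing fields: {diff}")
    for key, expected_type in SCHEMA.items():
        value = payload[key]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected_type):
            raise ToolInputError(f"{key}: expected {expected_type.__name__}")
    if not STUDY_ID_RE.fullmatch(payload["study_id"]):
        raise ToolInputError("study_id: disallowed characters or length")
    if payload["series_index"] < 0:
        raise ToolInputError("series_index: must be non-negative")
    label = payload["label"]
    if len(label) > MAX_LABEL_LEN:
        raise ToolInputError("label: too long")
    if any(p.search(label) for p in INSTRUCTION_PATTERNS):
        raise ToolInputError("label: instruction-like text in a structured field")
    return payload
```

The strict allowlist on `study_id` does most of the work here. Pattern matching on free text is a backstop that an attacker can evade, which is why the narrower the field types, the better.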
Sandboxed Tool Execution
MCP servers that touch filesystems, execute code, or make outbound network calls should run in isolated environments with minimal permissions. The principle of least privilege applies to agent tools the same way it applies to service accounts.
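As one small piece of least privilege, the sketch below confines a file-reading tool to a single allowlisted directory. It is not a sandbox and does not replace OS-level isolation such as containers or restricted service accounts; it only shows the application-level check that should sit inside one. The function names and the size cap are assumptions made for illustration.

```python
from pathlib import Path


class PathNotAllowed(PermissionError):
    """Raised when a requested path resolves outside the allowed root."""


def resolve_inside(root: Path, requested: str) -> Path:
    # Resolve symlinks and ".." segments before comparing against the root.
    root = root.resolve()
    candidate = (root / requested).resolve()
    if candidate != root and root not in candidate.parents:
        raise PathNotAllowed(f"{requested!r} resolves outside {root}")
    return candidate


def read_tool_file(root: Path, requested: str, max_bytes: int = 1_000_000) -> bytes:
    # Hypothetical tool body: read at most max_bytes from a file under root.
    path = resolve_inside(root, requested)
    with path.open("rb") as handle:
        return handle.read(max_bytes)
```

Requests such as `../secret.txt` or an absolute path are rejected before any file is opened. A check-then-open sequence like this still leaves a race window if an attacker can modify the directory, which is one reason the surrounding process should also run with minimal filesystem permissions.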
Context Provenance Tracking
Log not just what was executed but what context payload triggered execution. When an exploit occurs, you need to reconstruct the reasoning chain, not just the API call log.
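A minimal sketch of what such a log could look like, assuming an in-memory store and hypothetical names (`ProvenanceLog`, `ProvenanceRecord`). Each record stores a SHA-256 digest of the context that triggered the call and a pointer to the record that produced that context, so the chain can be walked backwards after an incident. The full context payload would need to be kept elsewhere, keyed by the digest.

```python
import hashlib
import json
import time
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class ProvenanceRecord:
    record_id: str
    parent_id: Optional[str]   # record whose output fed this call, if any
    tool: str
    arguments_json: str        # arguments serialized for an immutable record
    context_sha256: str        # digest of the context payload that triggered the call
    timestamp: float


class ProvenanceLog:
    def __init__(self) -> None:
        self._records: Dict[str, ProvenanceRecord] = {}

    def record(self, tool: str, arguments: dict, context: str,
               parent_id: Optional[str] = None) -> str:
        if parent_id is not None and parent_id not in self._records:
            raise KeyError(f"unknown parent record: {parent_id}")
        digest = hashlib.sha256(context.encode("utf-8")).hexdigest()
        # Sequence prefix keeps ids unique even when the same context repeats.
        record_id = f"{len(self._records) + 1:06d}-{digest[:12]}"
        self._records[record_id] = ProvenanceRecord(
            record_id=record_id,
            parent_id=parent_id,
            tool=tool,
            arguments_json=json.dumps(arguments, sort_keys=True),
            context_sha256=digest,
            timestamp=time.time(),
        )
        return record_id

    def chain(self, record_id: str) -> List[ProvenanceRecord]:
        # Walk parent pointers back to the root, then return oldest first.
        out: List[ProvenanceRecord] = []
        current: Optional[str] = record_id
        while current is not None:
            rec = self._records[current]
            out.append(rec)
            current = rec.parent_id
        return list(reversed(out))
```

Calling `log.chain(last_id)` after a suspicious write returns every tool call that led to it, oldest first, along with the digest of the context each one saw.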
Human-in-the-Loop Gates for Irreversible Actions
Any tool call that writes data, deletes data, or sends data outside your environment should require explicit confirmation before execution. This is not UX friction. It is the difference between a contained incident and a full compromise.
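One way to sketch such a gate is a decorator that asks a confirmation callback before running the tool. Everything here is hypothetical: `require_confirmation`, `console_confirm`, the `delete_study` tool, and the in-memory `STUDIES` store are stand-ins, and a production gate would route the request to an authenticated reviewer, not a console prompt.

```python
import functools
from typing import Callable, Dict

Confirm = Callable[[str, dict], bool]


class ConfirmationDenied(RuntimeError):
    """Raised when a gated tool call is not approved."""


def require_confirmation(confirm: Confirm):
    # Wrap a tool so it only runs after confirm(tool_name, arguments) returns True.
    def decorator(tool):
        @functools.wraps(tool)
        def gated(**kwargs):
            if not confirm(tool.__name__, kwargs):
                raise ConfirmationDenied(f"{tool.__name__} was not approved")
            return tool(**kwargs)
        return gated
    return decorator


def console_confirm(tool_name: str, arguments: dict) -> bool:
    # Simplest possible reviewer: a person at a terminal. Default is "no".
    answer = input(f"Allow {tool_name} with {arguments}? [y/N] ")
    return answer.strip().lower() == "y"


# Hypothetical in-memory store and destructive tool, for illustration only.
STUDIES: Dict[str, str] = {"study-001": "report"}


def delete_study(study_id: str) -> bool:
    return STUDIES.pop(study_id, None) is not None
```

Wrapping with `require_confirmation(console_confirm)(delete_study)` yields a tool the model can request but cannot complete on its own. The gate fails closed: anything other than an explicit approval raises before the underlying function runs.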
The Gemini Enterprise Transition Validates the Risk Model
The framing that "the chat-only LLM wrapper era is ending" is accurate and understated. Gemini Enterprise positioning MCP as the scaling path from prototype to production means the protocol is being pushed into enterprise environments with existing security review processes, compliance requirements, and blast radius concerns that consumer-grade deployments never had to care about.
If you are scaling an agentic system on Gemini Enterprise with MCP today, your security review needs to include the MCP server topology explicitly. Not the LLM. The servers. The tools. The trust relationships between them.
The vulnerability is not in the model. It is in the assumption that tool calls are safe because the model decided to make them.
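A topology review of this kind can start with something as simple as a reachability query over the trust relationships. The sketch below is illustrative only: the server names and edges are invented, and an edge from A to B is assumed to mean that output from server A can arrive as input at server B.

```python
from collections import deque
from typing import Dict, List, Set

# Hypothetical topology; none of these server names come from a real deployment.
TOPOLOGY: Dict[str, List[str]] = {
    "retrieval": ["reasoning"],
    "reasoning": ["filesystem", "reporting"],
    "filesystem": [],
    "reporting": ["email"],
    "email": [],
}


def blast_radius(topology: Dict[str, List[str]], start: str) -> Set[str]:
    # Breadth-first search: every server a compromised payload at `start` can reach.
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in topology.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    seen.discard(start)
    return seen
```

In this made-up graph, a payload entering at `retrieval` can reach every other server, while one entering at `reporting` reaches only `email`. Servers with a large blast radius are the ones whose inputs deserve the strictest validation.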
The Hardware Layer Is Catching Up to the Inference Demands
While the security conversation is urgent, the infrastructure conversation is also moving. Intel and SambaNova have announced a heterogeneous compute architecture that splits the inference workload across Intel Xeon 6 processors, GPUs, and SambaNova RDUs. The claimed division of labor: Xeon 6 handles host and action CPU tasks, GPUs handle prefill, RDUs handle decode.
This is architecturally coherent for agentic workloads specifically. Agentic inference patterns are not uniform. Prefill is compute-intensive and parallelizable. Decode is memory-bandwidth-bound and benefits from specialized silicon. Action execution is latency-sensitive but not compute-heavy. The heterogeneous split maps onto these different phases reasonably well.
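The phase-to-hardware mapping can be written down as a toy routing table. This is only a restatement of the announced division of labor in code form, not a description of how Intel or SambaNova schedule work: the `Phase` enum, the device labels, and `plan_agent_step` are assumptions made for illustration.

```python
from enum import Enum
from typing import List, Tuple


class Phase(Enum):
    PREFILL = "prefill"   # process the prompt: compute-intensive, parallelizable
    DECODE = "decode"     # generate tokens: memory-bandwidth-bound
    ACTION = "action"     # run tool calls: latency-sensitive, light on compute


# Mapping as described in the announcement; the string labels are illustrative.
PHASE_TO_DEVICE = {
    Phase.PREFILL: "gpu",
    Phase.DECODE: "rdu",
    Phase.ACTION: "xeon6",
}


def plan_agent_step(prompt_tokens: int, output_tokens: int,
                    tool_calls: int) -> List[Tuple[str, str, int]]:
    # Return (phase, device, work units) for each phase that has work to do.
    work = [
        (Phase.PREFILL, prompt_tokens),
        (Phase.DECODE, output_tokens),
        (Phase.ACTION, tool_calls),
    ]
    return [(phase.value, PHASE_TO_DEVICE[phase], units)
            for phase, units in work if units > 0]
```

A step with a long prompt, a short completion, and two tool calls touches all three device classes in sequence, which is why hand-off latency between them matters as much as per-device throughput.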
Bold Claims Deserve Harder Questions
That said, the performance claims attached to this architecture come from a partnership announcement, not a published benchmark. Faster than what? Under which agentic workload configurations? Measured how? Browser-agent and computer-use scenarios have highly variable inference patterns that depend heavily on tool latency, not just model throughput. Until there are reproducible numbers on realistic agentic task distributions, treat the performance positioning as directionally interesting and empirically unverified.
SoundHound in Telecom Is a Deployment Pattern, Not a Breakthrough
The SoundHound and Associated Carrier Group partnership follows a well-established playbook: take a voice AI platform with existing telephony integrations and position it as agentic customer service. The "agentic" framing here means the system can handle multi-turn service interactions with some degree of goal-directed behavior, not that it is running complex plan-and-execute loops over enterprise tool graphs.
This matters because the security and reliability requirements for a telecom customer service agent are categorically different from those of DosimeTron or a Gemini Enterprise deployment. Conflating them under the same "agentic AI" label obscures the engineering work each one actually requires.
What To Do With This Right Now
If you are running any MCP-based system in production, the OpenClaw CVE count should trigger an audit of your own server configurations, even if you are not using OpenClaw directly. The vulnerability classes that accumulate on one platform tend to exist in similar form across the ecosystem because the underlying design patterns are shared.
If you are evaluating Gemini Enterprise for agentic scale-up, push your vendor contact for the MCP security documentation before you push for the Gemini 2.5 Pro integration. The model capability question is mostly answered. The tool security question is not.
DosimeTron's Variance Reveals Your Pipeline's Hidden Risk
If you are building with DosimeTron's architecture as a reference, the 32.3-minute mean per-study processing time with a 6.0-minute standard deviation tells you something important: variance in agentic pipeline execution is real and needs to be budgeted for in SLA design, not smoothed over in demo conditions.
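To make the budgeting concrete, here is a rough calculation that treats per-study time as normally distributed with the reported mean and standard deviation. The normality assumption is mine, not the source's; real pipeline durations are often right-skewed, so these figures are likely optimistic for the tail.

```python
from statistics import NormalDist

MEAN_MIN = 32.3   # reported mean per-study processing time, in minutes
SD_MIN = 6.0      # reported standard deviation, in minutes

# Assumption: durations are roughly normal. This is a simplification.
DURATION = NormalDist(mu=MEAN_MIN, sigma=SD_MIN)


def duration_quantile(p: float) -> float:
    # Processing time (minutes) that a fraction p of studies finish within.
    return DURATION.inv_cdf(p)


def breach_probability(budget_min: float) -> float:
    # Fraction of studies expected to exceed a given time budget.
    return 1.0 - DURATION.cdf(budget_min)
```

Under these assumptions the 95th percentile is about 42.2 minutes and the 99th about 46.3 minutes, so an SLA set at the 32.3-minute mean would be missed by roughly half of all studies.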
The Bottom Line
- OpenClaw's 138 CVEs are a category warning, not just a product problem: audit your MCP server topology now
- Every MCP tool boundary is a trust boundary that requires explicit input validation and sandboxed execution
- DosimeTron is the clearest published evidence that agentic AI with MCP can reach production-grade accuracy when built rigorously
- Intel/SambaNova's heterogeneous inference architecture is architecturally coherent but empirically unverified for agentic workloads
- The security review for your agentic system needs to cover tools and context propagation, not just the model
Sources: Medium: AI Agents (April 10, 2026), Medium: Agentic AI (April 10, 2026), ArXiv CS.AI (April 10, 2026), NewsAPI (April 9, 2026)