The Control Plane Just Collapsed. 40 Years of Security Assumptions Are Gone.
Traditional security separates control plane from data plane. AI agents merge them -- instructions and data are both text tokens. Every firewall, IDS, and WAF assumes this separation still exists. It does not.
The Assumption That Built Modern Security
Every security architecture built since the 1980s rests on one assumption: control and data travel on separate channels. In networking, the control plane (routing protocols, management traffic) is isolated from the data plane (user packets). In operating systems, kernel instructions are separated from user data via privilege rings. In web applications, SQL queries are parameterized to separate code from input. In every case, the principle is the same: the thing that gives orders must be physically or logically separated from the thing that carries cargo.
AI Agents Broke It
When an LLM processes a request, everything is tokens. The system prompt is tokens. The user message is tokens. The tool response is tokens. A piece of data returned from a web scraper occupies the same channel, the same format, and the same attention mechanism as the instruction telling the model what to do next. There is no separation. A carefully crafted string inside a CSV file, a web page, or an email body can rewrite the model behavior just as effectively as a system prompt change.
This is not a bug. It is the architecture. Transformers process all input through the same attention layers. There is no privileged instruction register. There is no kernel/user boundary. Everything is context, and context is influence.
Why Existing Tools Cannot Help
Firewalls filter packets based on headers and ports -- they cannot inspect whether a text token inside an HTTP response body will become an instruction. Intrusion detection systems match known attack signatures in network traffic -- they have no model for "this sentence will cause an LLM to ignore its system prompt." Web application firewalls sanitize SQL and XSS in HTTP parameters -- they do not understand that `<IMPORTANT>Ignore all previous instructions</IMPORTANT>` inside a JSON response is an attack.
These tools were built for a world where data cannot become code spontaneously. In the LLM context window, data becomes code constantly. Every external input -- every tool response, every file read, every API call result -- is a potential control plane injection.
The Proof Is in the Numbers
Research across 20 state-of-the-art LLMs shows an average tool-layer prompt injection success rate of 36.5%. The best-performing model against these attacks still fails 14.2% of the time. Roleplay-based attacks achieve 89.6% success rates against models that show only 4.7% vulnerability on static injection benchmarks. The gap between static benchmarks and real-world attack success is enormous because static benchmarks test the data plane in isolation. Real attacks exploit the collapsed control plane.
What a New Security Architecture Looks Like
If the control plane cannot be separated from the data plane inside the model, it must be separated outside the model. This means: 1. Pre-execution scanning: Every tool description, every skill file, every MCP manifest is scanned for injection patterns before the LLM ever sees it. ATR does this with 71 detection rules. 2. Runtime monitoring: Every tool invocation is logged and analyzed for anomalous behavior patterns -- unexpected file access, unauthorized network calls, privilege escalation. Guard does this. 3. Output validation: Every action the LLM proposes is validated against a policy before execution. The model can suggest; it cannot act unilaterally. 4. Threat intelligence: Attack patterns discovered on one machine are shared across the network. Threat Cloud does this.
The Uncomfortable Truth
We cannot fix this at the model layer. Anthropic, OpenAI, Google, and every other lab are working on alignment and instruction hierarchy. These efforts help. They do not solve the problem. As long as the architecture processes instructions and data through the same mechanism, the control plane is collapsed. The defense must be external.
This is why ATR exists. Not because we think regex can catch every attack -- it cannot. But because the security industry needs detection rules that operate outside the model, at the interface layer where skills meet agents. That is the new perimeter. And right now, almost nobody is defending it.