Threat Intelligence

The Email That Waited 6 Weeks to Steal Your Data

Adam LinMarch 22, 20269 min

ZombieAgent plants a false memory via email that activates weeks later. MINJA achieves 98.2% memory injection success. AiXBT lost $106,200 to a 2 AM prompt injection. Delayed attacks are the next frontier of AI agent security.

The Delayed Attack Model

Traditional prompt injection is immediate: inject an instruction, get a result. Delayed attacks are different. The attacker plants a payload that sits dormant in the agent memory or context for days or weeks, then activates when conditions are met. The time gap between injection and activation makes these attacks extremely difficult to detect because there is no suspicious activity at the time of injection.

ZombieAgent: Memory Poisoning

ZombieAgent is a research attack demonstrated against AI agents with persistent memory (RAG-based systems, agents with conversation history, agents that learn from interactions). The attack works in three phases: 1. Injection: An attacker sends a carefully crafted email to a target. The email contains benign-looking text with an embedded instruction designed to be stored in the agent memory. 2. Dormancy: The instruction sits in the agent RAG store for days or weeks. It does nothing. There is no anomalous behavior to detect. 3. Activation: When the user asks the agent a related question weeks later, the agent retrieves the poisoned memory and follows the embedded instruction -- typically exfiltrating data or modifying its behavior.

MINJA: 98.2% Memory Injection Success

MINJA (Memory INJection Attack) formalized this attack class with rigorous evaluation. The results are alarming: 98.2% memory injection success rate across tested RAG systems. Over 70% downstream attack success -- meaning that when the poisoned memory is retrieved, it successfully manipulates the agent behavior more than 70% of the time. The attack works because RAG systems optimize for retrieval relevance, not retrieval safety. A well-crafted injection scores high on semantic similarity to legitimate queries, ensuring it gets retrieved at exactly the right moment.

AiXBT: $106,200 Stolen at 2 AM

This one is not a research paper. It happened. AiXBT is an AI agent managing cryptocurrency trades. In a real-world attack, an adversary used prompt injection to manipulate the agent into executing unauthorized transactions at 2 AM, when human oversight was minimal. The agent transferred 55.5 ETH ($106,200 at the time) to an attacker-controlled wallet. The attack exploited a queued prompt injection -- the malicious instruction was placed in the agent task queue during business hours but designed to execute during off-hours when approval thresholds were lower.

Why This Changes the Threat Model

Security scanning that runs at install time catches immediate attacks. It does not catch delayed payloads planted in agent memory through normal interactions. Consider the attack surface: every email your AI agent reads, every document it summarizes, every web page it scrapes, every API response it processes -- any of these can contain a dormant instruction that activates later.

The defender has to be right at every retrieval. The attacker has to be right once, and they have weeks to refine the payload.

How ATR Addresses Delayed Attacks

ATR includes rules specifically designed for delayed attack patterns: - ATR-PI-007: Detects instruction-like content embedded in data fields (email bodies, document text, API responses) -- the injection phase - ATR-MA-001: Flags operations that write to agent memory stores or RAG databases from external sources -- the persistence phase - ATR-PI-013: Detects conditional activation patterns ("when the user asks about X, do Y") -- the trigger phase

These rules operate at the skill level, not the model level. They scan the tools and data pipelines that feed information into the agent context window. They cannot prevent every delayed attack -- a sufficiently sophisticated payload can evade regex-based detection. But they raise the cost of attack significantly and catch the commodity-level delayed injections that will become the most common attack class as AI agents proliferate.

Recommendations

If you run AI agents with persistent memory: 1. Audit your RAG pipeline: What data sources feed into agent memory? Each one is an injection surface. 2. Implement memory hygiene: Periodically scan stored memories for instruction-like patterns. 3. Separate data from instructions: Use structured formats that distinguish retrieved context from actionable instructions. 4. Monitor off-hours activity: The AiXBT attack deliberately targeted low-oversight periods. Set up alerts for high-impact actions outside business hours. 5. Scan with ATR: Run pga scan against your MCP configurations. ATR catches injection patterns in tool definitions and data pipelines.