OPEN STANDARD

Agent Threat Rules (ATR)

The first open detection standard for AI agent threats. Machine-readable, community-driven, and designed for AI agent security.

Contribute on GitHub Get Started

Standardization

OASIS Open Project proposal in preparation

Proposal-stage scaffolding now public: 9-seat TSC charter, OpenTelemetry-compatible event format, conformance corpus with threshold Ed25519 signing, DCO contribution model, and reference implementation interface contracts in TypeScript, Python, and Go. All marked PROPOSED, not yet ratified. Existing ATR rule format and engine API are unchanged.

Full status matrix

THE PROBLEM

AI agents need purpose-built detection

Traditional security rules were designed for network packets and file hashes. They cannot understand prompt flows, tool calls, or multi-turn agent conversations.

AI agents introduce a new attack surface: prompt injection, tool poisoning, context exfiltration, skill compromise. These threats live in the semantic layer -- invisible to legacy detection tools.

ATR is the detection standard built for the AI agent era.

Traditional Rules

Log-based IOCs. No awareness of prompt context or tool interactions.

File Scanners

File-level byte patterns. Cannot inspect agent conversation flows.

ATR Rules

Semantic-layer detection. Built for prompts, tools, and agent behavior.

WHY ATR

Three standards. Three eras.

ATR fills the gap that traditional detection tools leave open for AI agent threats.

Capability

Traditional Rules

File Scanners

ATR

Detection target

Log events

Files / memory

Agent behavior

Prompt injection detection

Tool call monitoring

Multi-turn conversation analysis

Semantic-layer matching

OWASP Agentic Top 10 mapping

MITRE ATT&CK / ATLAS references

Machine-readable format

Automated response actions

Community rule repository

RULE CATEGORIES

10 categories. 419 rules. 920+ patterns.

Covering the full AI agent attack surface, mapped to OWASP Agentic Top 10 (10/10) and MITRE ATLAS.

115 rules

Prompt Injection

Direct and indirect injection, jailbreaks, system prompt override, multi-turn attacks, encoding evasion, CJK social engineering

22 rules

Tool Poisoning

Malicious MCP responses, tool output injection, unauthorized tool calls, SSRF via tools, response piggyback

33 rules

Context Exfiltration

System prompt leaks, API key exposure, credential theft, SSH key access, environment variable harvesting

105 rules

Agent Manipulation

Cross-agent attacks, goal hijacking, inter-agent message spoofing, human trust exploitation, persona hijacking

11 rules

Privilege Escalation

Tool permission escalation, scope creep, admin function access, cross-agent privilege escalation

6 rules

Excessive Autonomy

Runaway agent loops, resource exhaustion, cascading pipeline failures

40 rules

Skill Compromise

Supply chain poisoning, skill impersonation, hidden capabilities, chain attacks, description-behavior mismatch, rug pull, squatting

1 rule

Data Poisoning

RAG retrieval poisoning, knowledge base contamination

3 rules

Model Security

Model behavior extraction, malicious fine-tuning data detection

8 rules

Model Abuse

Adversarial prompting against model safeguards, jailbreak corpora, model behaviour extraction

INTEGRATION

Where ATR fits in the stack

ATR rules are evaluated at the semantic layer -- between the LLM and the tools it invokes.

User Input

Prompt text, uploaded files, conversation context

ATR Engine

419 rules evaluated in <1ms per event. Block, alert, or escalate.

LLM / Agent

Claude, GPT, Gemini, local models -- any provider

Tools & Skills

MCP servers, OpenClaw skills, file system, shell, APIs

ATR intercepts at the semantic layer -- before malicious instructions reach the agent, and before compromised outputs reach the tools.

HOW IT WORKS

YAML rules. Real-time engine.

Write human-readable rules. The ATR engine matches them against live agent telemetry in milliseconds.

Define detection logic

Each rule specifies conditions on agent fields: user_input, tool_calls, model_output, context. Supports regex, keyword, and semantic operators.

Map to frameworks

Rules link to OWASP LLM Top 10 and MITRE ATLAS references, providing compliance coverage and threat context.

Engine evaluates in real-time

The ATR engine loads rules and matches them against agent events as they occur. Sub-millisecond evaluation per rule.

Automated response

When a rule triggers, configurable actions fire: block_input, alert, snapshot, escalate. Threshold-based auto-response prevents false positive fatigue.

Example: Direct Prompt Injection Detection

title: "Direct Prompt Injection via User Input"
id: ATR-2026-001
status: experimental
severity: high

references:
  owasp_llm:
    - "LLM01:2025 - Prompt Injection"

detection:
  conditions:
    - field: user_input
      operator: regex
      value: "(?i)(ignore|disregard)\\s+previous\\s+instructions"
  condition: any

response:
  actions:
    - block_input
    - alert
    - snapshot

RULE EXAMPLES

Rules for real threats

Each rule targets a specific attack pattern observed in production AI agent deployments.

ATR-2026-008

critical

Tool Poisoning via MCP

Tool Poisoning

title: "Direct Prompt Injection via User Input"
id: ATR-2026-001
status: experimental
severity: high

references:
  owasp_llm:
    - "LLM01:2025 - Prompt Injection"

detection:
  conditions:
    - field: user_input
      operator: regex
      value: "(?i)(ignore|disregard)\\s+previous\\s+instructions"
  condition: any

response:
  actions:
    - block_input
    - alert
    - snapshot

ATR-2026-012

high

Context Exfiltration via Markdown

Context Exfiltration

title: "Tool Poisoning via MCP Response"
id: ATR-2026-008
status: experimental
severity: critical

references:
  owasp_llm:
    - "LLM02:2025 - Tool Misuse"

detection:
  conditions:
    - field: tool_output
      operator: regex
      value: "(eval|exec|child_process|__import__|subprocess\\.run)\\("
    - field: tool_output
      operator: contains
      value: "import os"
  condition: any

response:
  actions:
    - block_output
    - alert
    - block_tool

ATR-2026-020

medium

Excessive Agent Autonomy Loop

Excessive Autonomy

title: "Context Exfiltration via Markdown"
id: ATR-2026-012
status: experimental
severity: high

detection:
  conditions:
    - field: model_output
      operator: regex
      value: "!\\[.*\\]\\(https?://[^)]+\\?.*="
    - field: model_output
      operator: regex
      value: "(api_key|secret|token|password|credential)"
  condition: all

response:
  actions:
    - block_output
    - alert
    - snapshot

COMPLIANCE MAPPING

OWASP Agentic Top 10 coverage

Every ATR rule maps to the OWASP Top 10 for Agentic Applications, providing structured coverage of the most critical AI agent security risks.

ASI01Agent Goal Hijack

ASI02Tool Misuse & Exploitation

ASI03Identity & Privilege Abuse

ASI04Agentic Supply Chain Vulnerabilities

ASI05Unexpected Code Execution

ASI06Memory & Context Poisoning

ASI07Insecure Inter-Agent Communication

ASI08Cascading Failures

ASI09Human-Agent Trust Exploitation

ASI10Rogue Agents

ECOSYSTEM

Open standard. Community-driven growth.

ATR follows the proven playbook of open standards -- open governance, community contributions, and vendor-neutral design.

419

Detection rules

770

Detection patterns

10/10

OWASP Agentic coverage

100%

SKILL.md recall

CONTRIBUTION FLOW

Identify a threat pattern

Observe a new attack vector in production, research, or CTF. Document the behavior.

Write an ATR rule

Define detection conditions in YAML. Map to OWASP and MITRE references. Add test cases.

Submit a pull request

The community reviews, tests, and merges. Rules ship to all ATR users automatically.

Collective defense

Every new rule strengthens the entire ecosystem. One contributor protects thousands of deployments.

ROADMAP

The standard evolves

v2.0Current

Open Standard

419 rules, 920+ patterns across 10 categories
RFC-001 v1.1 quality standard published
Maturity levels: draft / experimental / stable
Cisco AI Defense ships 34 ATR rules
OWASP Agentic Top 10: 10/10 coverage

v2.xNext

Collective Defense

Threat Cloud crystallization pipeline
GitHub Action for CI/CD scanning
Hermes Agent integration (76K stars)
RFC-002: Behavioral detection types
RFC-003: Collective defense protocol

v3.0

Enterprise Standard

RFC-004: Enterprise deployment guidance
EU AI Act compliance mapping
Private rule feeds for enterprises
Multi-agent fleet visibility
Vendor certification program

Join the ATR community

ATR is open source and community-driven. Contribute rules, report new threat patterns, or integrate ATR into your own agent security stack.

Contribute on GitHub Get Started

title: "Direct Prompt Injection via User Input" id: ATR-2026-001 status: experimental severity: high references: owasp_llm: - "LLM01:2025 - Prompt Injection" detection: conditions: - field: user_input operator: regex value: "(?i)(ignore|disregard)\\s+previous\\s+instructions" condition: any response: actions: - block_input - alert - snapshot

title: "Tool Poisoning via MCP Response" id: ATR-2026-008 status: experimental severity: critical references: owasp_llm: - "LLM02:2025 - Tool Misuse" detection: conditions: - field: tool_output operator: regex value: "(eval|exec|child_process|__import__|subprocess\\.run)\\(" - field: tool_output operator: contains value: "import os" condition: any response: actions: - block_output - alert - block_tool

title: "Context Exfiltration via Markdown" id: ATR-2026-012 status: experimental severity: high detection: conditions: - field: model_output operator: regex value: "!\\[.*\\]\\(https?://[^)]+\\?.*=" - field: model_output operator: regex value: "(api_key|secret|token|password|credential)" condition: all response: actions: - block_output - alert - snapshot