Threat Intelligence

Four Hours. The New Disclosure-to-Exploit Window for AI Agent CVEs.

2026-05-18 · 7 min

PraisonAI was scanned for exploitation 3 hours 44 minutes after disclosure. Microsoft Copilot opened a regression test against Agent Threat Rules (ATR) for the Semantic Kernel CVEs, and we shipped the matching rules in 2 hours 16 minutes. Content-layer detection rules are the only detection layer operating in that timeframe.

The figures in this article (rule counts, benchmarks) are as of the publication date. For the latest figures, see /research/benchmarks.

The two data points

Point one: 3 hours 44 minutes

On 2026-05-16 Sysdig Threat Research published an exploitation-timeline writeup for CVE-2026-44338, an authentication bypass in PraisonAI (Sysdig blog). The GHSA was published as GHSA-6rmh-7xcm-cpxj. The root cause was a hardcoded AUTH_ENABLED=False default on the agent control plane — the Flask API was reachable on every public deployment without authentication.
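The general fix for this class of bug is a default that fails closed: if the setting is missing, authentication stays on. The sketch below illustrates that idea using only the Python standard library. The environment variable names AUTH_ENABLED and API_TOKEN and both function names are assumptions made for illustration; this is not PraisonAI's actual configuration code or its patch.

```python
import os

# Hypothetical setting names, chosen for illustration only.
_FALSE_VALUES = {"0", "false", "no", "off"}


def auth_enabled(env=None):
    """Return True unless the operator explicitly opted out of authentication."""
    env = os.environ if env is None else env
    raw = env.get("AUTH_ENABLED")
    if raw is None:
        return True  # secure default: a missing setting keeps auth on
    return raw.strip().lower() not in _FALSE_VALUES


def check_startup(env=None):
    """Refuse to start when auth is on but no token has been configured."""
    env = os.environ if env is None else env
    enabled = auth_enabled(env)
    if enabled and not env.get("API_TOKEN"):
        raise RuntimeError("auth is enabled but API_TOKEN is not set")
    return enabled
```

With this shape, an operator who deploys without touching the configuration gets a startup error rather than an open control plane.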

Sysdig observed a CVE-Detector/1.0 scanner probing public-internet PraisonAI instances 3 hours 44 minutes after the disclosure. CSO Online picked it up the next day (coverage). The patch was in 4.6.34. Most operators were not at 4.6.34.

Point two: 2 hours 16 minutes

On 2026-05-07 MSRC published two CVEs against Microsoft Semantic Kernel: CVE-2026-26030 and CVE-2026-25592. They were prompt-injection vectors against the SK orchestrator.

On 2026-05-11 at 06:07 UTC, Microsoft Copilot SWE Agent (operating inside Microsoft Agent Governance Toolkit) opened AGT issue #1981, a regression-test issue presuming a coverage gap in Agent Threat Rules (ATR) for those two CVEs. The autonomous agent dropped four test fixtures and asked for matching rule IDs.

Agent Threat Rules v2.1.2 was published to npm and GitHub at 08:24 UTC with two new rules (ATR-2026-00440 and ATR-2026-00441) plus a redactMatchedValue helper for credential-leak paths. End to end, that took 2 hours 16 minutes, with no human routing on Microsoft's side. Two of the four Copilot regression fixtures matched the canonical regex shape; the other two were partial matches. We continue to track the partial-match gap in the rule's evasion_tests block.
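To show why a redaction helper matters on credential-leak paths, here is a minimal Python sketch of the idea: when a rule matches a secret, the alert should not echo the secret back into logs. The pattern, the function name redact_matched_value, and the keep parameter are all assumptions for illustration; this is not the redactMatchedValue implementation shipped in ATR.

```python
import re

# Illustrative pattern only: matches forms like "api_key=<value>" or "token: <value>".
SECRET_PATTERN = re.compile(r"(?i)\b(api[_-]?key|token|secret)\b(\s*[:=]\s*)(\S+)")


def redact_matched_value(text, keep=4):
    """Mask each matched secret value, leaving its first `keep` characters visible."""

    def _mask(match):
        label, separator, value = match.group(1), match.group(2), match.group(3)
        visible = value[:keep]
        hidden = "*" * (len(value) - len(visible))
        return f"{label}{separator}{visible}{hidden}"

    return SECRET_PATTERN.sub(_mask, text)
```

Keeping a short visible prefix lets a responder tell which credential leaked without the log line becoming a second leak.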

Neither timeline is unique. Microsoft published "Configuration Becomes a Vulnerability" on 2026-05-14, documenting exploitable misconfigurations in publicly exposed AI services. Defender for Cloud telemetry shows the same compressed window. Every team running an AI agent control plane is operating inside a four-hour exploit cycle now.

What signature AV cannot do in four hours

Traditional AV vendors operate on multi-day signature-pack release cycles. Network IPS vendors lag CVE publication by 24-72 hours on average. SIEM rule packs ship monthly. Every one of those cycles is longer than the disclosure-to-exploit window for an AI agent CVE in 2026.

Content-layer detection rules are different. They live in source control. They ship as npm packages or via MISP feeds. They are loaded by the agent runtime or its sidecar gate without a vendor pipeline in the way. A maintainer who sees a CVE at 06:00 can have rules in production at 08:00 if the disclosure included a usable PoC. Microsoft Copilot demonstrated this end-to-end on 2026-05-11.

The Microsoft Copilot loop is the existence proof. It is also the limit. Without a public PoC the maintainer is guessing at the regex shape. Two of the four OpenClaw "Claw Chain" CVEs disclosed by Cyera Research on 2026-05-15 are content-layer-adjacent (CVE-2026-44115 unquoted-heredoc env expansion, CVE-2026-44118 spoofable senderIsOwner flag), but Cyera withheld the PoC payloads. The honest action is to track and hold the rule promotion until a PoC surfaces.

What the open-rule layer looks like in production

ATR ships as a single npm package. v2.2.1 has 419 rules across 10 categories. Independent benchmarks: 99.7 percent precision and 63.9 percent recall on PINT (850 samples), 97.1 percent recall on NVIDIA Garak, 100 percent precision and 89.7 percent recall on the 341-sample self-test corpus. Sub-millisecond per-rule latency.
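For readers comparing these figures, the sketch below shows how precision and recall are conventionally computed from true positives, false positives, and false negatives. The counts in the example are made up for illustration and are not taken from PINT, Garak, or the self-test corpus.

```python
def precision(true_pos, false_pos):
    """Share of flagged samples that were actually malicious."""
    flagged = true_pos + false_pos
    return true_pos / flagged if flagged else 0.0


def recall(true_pos, false_neg):
    """Share of malicious samples that were flagged."""
    malicious = true_pos + false_neg
    return true_pos / malicious if malicious else 0.0


# Made-up counts: 90 attacks caught, 1 benign sample flagged, 10 attacks missed.
example_precision = precision(90, 1)
example_recall = recall(90, 10)
```

High precision with lower recall means the rules rarely raise false alarms but still miss some attacks, which is the trade-off to keep in mind when reading the numbers above.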

The integrations are not proposals. They are merged:

● Microsoft Agent Governance Toolkit: PRs #908 and #1277, in production with a weekly auto-sync workflow that pulls ATR updates.
● Cisco AI Defense skill-scanner: PRs #79 and #99 merged, full 336-rule pack in production.
● MISP galaxy and taxonomies: merged 2026-05-10. EU national CERTs consume the rule-ID vocabulary directly.
● OWASP Agent Security Reference Hub: PR #74 merged 2026-05-11.
● Gen Digital Sage (Norton / Avast parent): PR #33 merged.
● Five Eyes joint guidance on AI system security referenced ATR on 2026-05-01.

10,046 npm + PyPI downloads in the trailing thirty days as of 2026-05-26, summed across @panguard-ai/* (7,190) + agent-threat-rules (2,654) + pyatr (202).

Honest scope

ATR is a content-layer detection rule library. It runs against prompts, tool-call arguments, tool responses, and skill manifests. It does not run against the operating system, the network stack, the container runtime, or the file system. The two OpenClaw TOCTOU sandbox-escape CVEs (CVE-2026-44112, CVE-2026-44113) are runtime concerns and belong with Falco, Velociraptor, or eBPF probes — not with ATR rules.

The four-hour window question is not "which detection layer is fastest." It is "is there a detection layer that can ship in the same timeframe the attacker is operating in." For content-layer agent threats the answer is now demonstrably yes. For runtime sandbox escapes, that layer is Falco. For supply-chain compromise, that layer is Sigstore + npm provenance. Each one operates on its own cadence. The agent operator needs all of them.

What we will not do is pretend ATR catches things it does not catch. A rule library that overclaims its scope loses contributor trust within one quarter. The detection-rule layer earned its position by being honest about boundaries.

What teams running agent control planes should do this week

If you operate an agent control plane reachable from the internet, take three concrete actions, in order:

First, verify your agent runtime accepts content-layer rule input. If you run Microsoft AGT, Cisco AI Defense, or any MISP-fed pipeline, you already do. If you run a roll-your-own loader, point it at the ATR npm package and add a single function call at the boundary between resolved tool config and execution.
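For a roll-your-own loader, the "single function call at the boundary" can be sketched as follows. This is a minimal Python illustration under stated assumptions: the Rule type, the placeholder rule EXAMPLE-0001, and the functions gate_tool_call and execute_tool are hypothetical and are not the ATR package's API. A real integration would load compiled rules from the rule pack instead of the inline list.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    rule_id: str         # hypothetical ID, not a real ATR rule
    pattern: re.Pattern  # compiled regex applied to string arguments


# Placeholder rule for illustration; real rules would come from the rule pack.
RULES = [
    Rule("EXAMPLE-0001", re.compile(r"(?i)ignore (all )?previous instructions")),
]


class BlockedToolCall(Exception):
    """Raised when a tool call matches a detection rule."""


def _strings(value):
    """Yield every string value found in nested dicts, lists, and tuples."""
    if isinstance(value, str):
        yield value
    elif isinstance(value, dict):
        for item in value.values():
            yield from _strings(item)
    elif isinstance(value, (list, tuple)):
        for item in value:
            yield from _strings(item)


def gate_tool_call(tool_name, arguments, rules=RULES):
    """Scan resolved arguments and raise before execution if any rule matches."""
    for text in _strings(arguments):
        for rule in rules:
            if rule.pattern.search(text):
                raise BlockedToolCall(f"{tool_name}: matched {rule.rule_id}")


def execute_tool(tool_name, arguments, handler):
    gate_tool_call(tool_name, arguments)  # the single call at the boundary
    return handler(**arguments)
```

Placing the gate after tool config is resolved but before the handler runs means the rules see the exact arguments the tool would receive.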

Second, subscribe to GHSA notifications for the AI agent packages you depend on. Not because you will read every advisory the moment it lands — you will not — but because the four-hour window is real and your runbook needs to assume the disclosure could be the start of an active exploit, not the start of a patching window.
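Beyond email notifications, a runbook can poll for fresh advisories and flag any that are still inside the four-hour window. The sketch below assumes GitHub's global security advisories REST endpoint with its ecosystem and affects query parameters, and the ghsa_id and published_at response fields; verify these against the current API documentation before relying on them. The function names and the package name passed in are illustrative.

```python
import json
import urllib.parse
import urllib.request
from datetime import datetime, timedelta, timezone

API = "https://api.github.com/advisories"


def advisory_url(ecosystem, package):
    """Build the query URL for advisories affecting one package."""
    query = urllib.parse.urlencode({"ecosystem": ecosystem, "affects": package})
    return f"{API}?{query}"


def fetch_advisories(ecosystem, package):
    """Fetch advisories as a list of dicts (network call, unauthenticated)."""
    request = urllib.request.Request(
        advisory_url(ecosystem, package),
        headers={"Accept": "application/vnd.github+json"},
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)


def inside_window(advisories, now, window=timedelta(hours=4)):
    """Return the GHSA IDs published within `window` before `now`."""
    hot = []
    for advisory in advisories:
        # Normalize a trailing "Z" so older Python versions can parse the timestamp.
        stamp = advisory["published_at"].replace("Z", "+00:00")
        age = now - datetime.fromisoformat(stamp)
        if timedelta(0) <= age <= window:
            hot.append(advisory["ghsa_id"])
    return hot
```

A scheduled job could call inside_window(fetch_advisories("pip", "praisonai"), datetime.now(timezone.utc)) and page the on-call engineer whenever the result is non-empty.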

Third, contribute. The open-rule layer scales with the number of contributors writing rules from real PoCs. If you see a CVE land for an AI agent framework you use, write the rule. The MIT license means there is no procurement conversation in the way. PRs land in hours, not weeks.

What we will publish next

Two follow-ups are queued:

● The five-step PoC-to-rule path. Concrete worked example from CVE-2026-44338 (PraisonAI) and CVE-2026-26030 (Semantic Kernel), with the actual diff and test fixtures.
● The eight evasion patterns that show up in more than 90 percent of the 751 malware samples we identified across the agent-skill marketplace. Regex shapes, why LLM-as-judge filters miss them, and the rule IDs that catch each.

Both will land before the end of May.

The detection layer is finally moving as fast as the threat. Now it has to stay there.

References

Sysdig — CVE-2026-44338 PraisonAI Authentication Bypass in Under 4 Hours · CSO Online coverage · Microsoft — Configuration Becomes a Vulnerability · Cyera Research — Claw Chain · Agent Threat Rules repo