Microsoft Published Semantic Kernel RCE on May 7. Microsoft Copilot Wrote Test Fixtures Against Our Rules on May 11. We Shipped the Detection Same Day.
On 2026-05-07 Microsoft Security disclosed two critical RCE CVEs in Semantic Kernel — CVE-2026-26030 (In-Memory Vector Store lambda+eval injection) and CVE-2026-25592 (SessionsPythonPlugin file write to autostart paths). On 2026-05-11 06:07 UTC, Microsoft Copilot SWE Agent opened agent-governance-toolkit#1981 adding regression test fixtures against the ATR community-rules pack, presuming ATR would have detection. Eight hours later, ATR v2.1.2 shipped on npm with rules ATR-2026-00440 and ATR-2026-00441 covering both CVEs — plus a credential-leak redact helper for any integration that logs ATRMatch.matchedPatterns. This is the layer-0 detection-standard flywheel running end-to-end in a single workday.
The trigger event
On 2026-05-07, Microsoft Security Response Center published *When prompts become shells: RCE vulnerabilities in AI agent frameworks*. Two CVEs in Microsoft Semantic Kernel:
- ●CVE-2026-26030 — In-Memory Vector Store filter functions interpolate a user-controlled filter expression and evaluate it as a lambda. Attacker traverses Python class hierarchy via
tuple().__class__.__mro__to reachBuiltinImporterand executeos.system(). Unauthenticated RCE on the SK host. - ●CVE-2026-25592 — SessionsPythonPlugin
DownloadFileAsyncaccepts unvalidated paths. Combined with the plugin'sExecuteCodecapability, attacker writes payloads to host Startup / cron / systemd-user paths and gains persistence outside the sandbox. Full kill chain:ExecuteCode → generate payload → DownloadFileAsync → autostart write → reboot triggers RCE.
Microsoft's mitigation: upgrade semantic-kernel Python to ≥1.39.4 or .NET to ≥1.71.0. The patch adds AST allowlisting, path canonicalisation, and removes dangerous function exposure from model-accessible plugin descriptors.
What Microsoft did NOT publish: detection rules. No Sigma, no YARA, no Elastic, no MITRE D3FEND mapping. Just KQL queries for their own Defender stack.
What happened four days later
On 2026-05-11 06:07 UTC, the GitHub user app/copilot-swe-agent — Microsoft Copilot's SWE Agent persona — opened microsoft/agent-governance-toolkit#1981. Title: "Add Semantic Kernel CVE regression coverage to ATR tests and align CVE fixtures with policy fields".
The PR adds four test fixtures in examples/atr-community-rules/test_atr_policy.py:
- ●
test_semantic_kernel_lambda_eval_interpolation—user_input: "lambda row: eval(filter_expr)" - ●
test_semantic_kernel_dynamic_import_traversal—user_input: "klass = importlib.import_module(module_name)" - ●
test_semantic_kernel_startup_persistence_write—user_input: schtasks /create /sc onlogon /tn updater /tr "...Start Menu\\Programs\\Startup\\updater.bat" - ●
test_semantic_kernel_overprivileged_file_write_tool—tool_description: "allowed_tools: Bash(*) and write to Startup folder for persistence"
Each fixture asserts evaluator.evaluate(context).action == "deny". That presumes ATR rules deny these payloads. At the time of PR open, ATR did not have rules for these specific CVEs. Microsoft Copilot built the test suite ahead of the detection — handing us the test contract.
This is what layer-0 protocol ownership looks like in practice. Microsoft's ATR-community-rules integration is wired up such that someone at Microsoft asked Copilot to add CVE regression tests, and Copilot pointed at the ATR-tracked test file — not at a Microsoft-internal detection corpus.
What we shipped same day
ATR v2.1.2 went live on npm at 2026-05-11 08:23 UTC, ~2 hours after the AGT PR opened. Two new rules + an opt-in hardening helper:
ATR-2026-00440 (agent-manipulation) — CVE-2026-26030 detection. Six conditions cover the lambda+eval interpolation shape, lambda+__import__ chains, AST-traversal-via-mro primitives (tuple().__class__.__mro__, ().__class__.__bases__, __subclasses__()), BuiltinImporter reflective access, .NET/JS Function-constructor variants, and the user-input filter-expression injection surface. 8 true-positive test cases, 5 true-negative, zero false positives across the full 466-sample benign skill-benchmark corpus.
ATR-2026-00441 (privilege-escalation) — CVE-2026-25592 detection. Five conditions cover file-write tool args targeting OS-level autostart paths (Windows Start Menu Startup, XDG autostart, systemd-user, cron, macOS LaunchAgents/Daemons), SK-specific tool descriptors (SessionsPythonPlugin, fully-qualified namespace variants), over-privileged tool descriptions that explicitly advertise arbitrary-path writes, direct DownloadFileAsync / fs.writeFile call sites whose destination is an autostart path, and Windows Registry Run-key persistence. 7 true-positive, 5 true-negative, zero false positives on the same benign corpus.
Both rules carry full compliance metadata: EU AI Act Articles 9 / 14 / 15, NIST AI RMF subcategories MP.5.1 / MG.2.3 / MG.4.1, ISO/IEC 42001 clauses 6.2 / 8.6, Colorado AI Act §6-1-1703. A consumer downstream of ATR (PanGuard, Vanta, Drata, Lakera) can now produce traceable evidence: detection event → ATR rule ID → compliance framework article. The evidence module no other vendor in this space can produce structurally — because no other vendor publishes the rules.
The credential-leak hardening helper
Bundled in the same release: a new module src/redact.ts exporting redactMatchedValue() and redactMatchedValues(). ATR-consuming integrations should run each entry of ATRMatch.matchedPatterns through this helper before including it in log lines, error messages, or telemetry payloads.
Motivation: a 2026-05-09 third-party security review on an ATR-LangChain integration PR flagged that the integration's ThreatDetectionError embedded the first 64 characters of the raw regex match (e.g., AKIA* AWS access keys) in its error message. Root cause: ATRMatch.matchedPatterns returns the raw matched substring, and downstream consumers had no redaction primitive. So a rule that fires on a leaked credential would re-leak the credential in its error log.
The helper recognises AWS access keys, GitHub PATs / OAuth / server / user / refresh tokens, Slack tokens, OpenAI / Anthropic API keys, Bearer credentials, JWTs, and PEM private keys. Each is replaced with a triage-safe summary containing only the recognised class, the leading 4 bytes, and the original length. Example: an AWS access key AKIAIOSFODNN7EXAMPLE becomes [redacted:aws_access_key_id head="AKIA" len=20]. The output is bounded to 80 characters.
API is additive: redactMatchedValue, redactMatchedValues, and the RedactOptions type are now exported alongside ATREngine. No existing API changes. Adoption is opt-in.
The mechanism
Four properties of the layer-0 detection-standard play make this kind of same-day loop closure possible:
1. Microsoft already ships our rules in production. AGT PR #1277 (merged 2026-04-26) installs ATR v2.0.12 with 287 rules + a weekly auto-sync workflow. PR #1207 on MISP/misp-galaxy (merged 2026-05-10) puts the 336-rule cluster into the global threat-intel sharing layer. PR #323 on MISP/misp-taxonomies puts the rule-ID tagging vocabulary alongside. When Microsoft Copilot writes a regression test against ATRPolicyEvaluator, it's not testing our intent — it's testing the contract that's already running in their production pipeline.
2. Test fixtures precede rules without breaking the protocol. Copilot wrote AGT fixtures presuming ATR coverage that didn't yet exist. In a closed-source detection vendor model that would be a broken test suite. In a layer-0 open-standard model it's an issue tracker — Copilot just told us, in machine-readable form, which exploit shapes Microsoft expects ATR to cover. We shipped the rules same day.
3. Compliance metadata is fungible. Rules carrying EU AI Act + NIST + ISO + Colorado mappings mean any consumer of the rule corpus inherits the evidence trail. Microsoft AGT, Cisco AI Defense, NVIDIA Garak, IBM mcp-context-forge — all of them get four-framework compliance traceability for free. That's the layer above the rules that the closed-source detection vendors structurally cannot reach.
4. The redact helper closes the loop on detection re-exposing what it detected. Discovered through third-party review on a closed PR. Fixed in the same release as the new CVE rules. Available to every downstream integration as an additive opt-in. This is what the open-standard model gets you that no closed detection vendor can — your downstream finds a real defect, you patch the upstream the same release cycle.
What this PR does NOT claim
We didn't catch Semantic Kernel RCE in the wild before Microsoft's disclosure. We shipped detection rules in response to Microsoft's public disclosure and to Microsoft Copilot's regression-test contract.
We didn't coordinate with the Microsoft Security Response Center or AGT team on the rule shape. The detection patterns derive from the public disclosure blog post and the public Copilot PR test fixtures.
Two of the four Copilot fixtures use attack variants (importlib dynamic import; schtasks /tr startup write) that ATR v2.1.2 doesn't catch with the canonical regex shape. We commented on AGT PR #1981 proposing either fixture re-shaping to the canonical exploit patterns from Microsoft's blog, or a follow-up v2.1.3 ATR release that extends coverage to those variants. Either path closes the loop; we let the AGT maintainers pick.
Acceptance evidence
| Test | Result |
|---|
|---|---|
| Full vitest suite | 374 / 374 passing (was 361 / 361 pre-release) |
|---|
| ATR-2026-00440 TP / TN | 8 / 8 + 0 / 5 |
|---|
| ATR-2026-00441 TP / TN | 7 / 7 + 0 / 5 |
|---|
| New rules FP on full 466-sample benign corpus | 0 |
|---|
redact.ts unit tests | 13 / 13 passing |
|---|
| npm publish | live as [email protected] at 2026-05-11 08:23 UTC |
|---|
| Microsoft AGT weekly auto-sync | picks up v2.1.2 on next scheduled run |
|---|
Repository links
- ●ATR v2.1.2 release: https://github.com/Agent-Threat-Rule/agent-threat-rules/releases/tag/v2.1.2
- ●PR #50 (merged): https://github.com/Agent-Threat-Rule/agent-threat-rules/pull/50
- ●Microsoft AGT PR #1981 (open): https://github.com/microsoft/agent-governance-toolkit/pull/1981
- ●Microsoft Security Blog (2026-05-07): https://www.microsoft.com/en-us/security/blog/2026/05/07/prompts-become-shells-rce-vulnerabilities-ai-agent-frameworks/
- ●npm: https://www.npmjs.com/package/agent-threat-rules
When a closed-source AI agent platform vendor publishes an RCE, the response window has historically been measured in weeks for the open community. The layer-0 detection-standard play compresses that window to the same workday. That's the contract.