Threat Intelligence

What Anthropic's Own Skills Docs Recommend (And What We Found in 96K Skills)

ATR ResearchMay 11, 20266 min

Anthropic explicitly recommends auditing Skills for unusual patterns. ATR's 96K-skill wild scan is that audit at ecosystem scale, with 751 confirmed malicious surfaced.

Anthropic's own Skills documentation contains a security note that, in our reading, defines the inspection contract for any Skills marketplace operator. It says: malicious Skills can direct Claude to invoke tools or execute code in ways that don't match the Skill's stated purpose, and recommends auditing thoroughly by reviewing all files and looking for unusual patterns like unexpected network calls, file access, or operations that don't match the Skill's stated purpose.

That is exactly what ATR's 96K-skill wild scan does, at ecosystem scale.

The Scope That Matches the Recommendation

The Anthropic docs frame the audit at the per-Skill install gate. We extended the same audit lens across four public registries — OpenClaw, ClawHub, Skills.sh, Hermes — for a total of 96,096 SKILL.md files scanned with ATR v2.1.1. The output was 751 confirmed malicious instances, clustering into three systematic campaigns rather than scattered noise.

The 37 Rules Doing the Audit Work

ATR has 37 dedicated skill-compromise rules. The ones that fired most often against the malicious cluster:

●ATR-2026-00060 — namespace impersonation (Skill claims to be a popular maintainer's package)
●ATR-2026-00124 — name-squatting variant of the same family
●ATR-2026-00064 — over-permissioned Skill (requests scopes far beyond stated purpose)
●ATR-2026-00204 — stealth execution persistence (hidden autorun on install)
●ATR-2026-00134 / 00147 — fork-claim impersonation
●ATR-2026-00126 — Skill rug-pull setup (benign on day 1, malicious on update)
●ATR-2026-00061 — description-vs-behaviour mismatch (the canonical "doesn't match stated purpose" pattern Anthropic's doc names directly)

Each one is a deterministic YAML rule with a condition tree, MIT-licensed, with test cases and FP measurement on the 432-skill benign corpus.

We Shared the Data Upstream

On 2026-05 we commented on anthropics/skills#492 with the corpus statistics, the three campaign clusters, and pointers to the rule IDs that surfaced each cluster. The intent is not to claim credit — it is to make the data available to the team running the audit at the install gate, and to any other Skills marketplace operator who wants a second layer of defence beyond install-time UI vigilance.

Why This Is Defence-In-Depth, Not Replacement

Install-time UI prompts catch some attacks. Manual review catches others. A 96K-Skill regex of runtime-behaviour rules catches the systematic campaigns where the same C2 endpoint or exfil host shows up in hundreds of Skills.

No single layer is sufficient. The Anthropic docs already say this. ATR is one of the layers that sits inside the audit recommendation and does the pattern-matching work at scale.

If you maintain a Skills marketplace, the corpus + rule pack is the cheapest second layer available. MIT licence, deterministic format, no vendor strings.

Anthropic Skills overview · Upstream comment · ATR rule corpus