Protect Your AI Agents: How to Audit Skills Before Installing
AI agent skills can carry prompt injection, reverse shells, and credential theft. Learn how Panguard Skill Auditor scans SKILL.md files in under 1 second and returns a 0-100 risk score.
The Problem: AI Agent Skills Are Untrusted Code
OpenClaw, AgentSkills, MCP tools -- the AI agent ecosystem is exploding. Developers install dozens of skills to make their agents more capable. But every skill is essentially an instruction set that can manipulate your agent into doing anything: exfiltrating credentials, opening reverse shells, or quietly modifying files.
The current approach? A human reads the SKILL.md and "looks for red flags." That scales to about 5 skills before fatigue sets in.
What Is Panguard Skill Auditor
Panguard Skill Auditor is an automated security scanner purpose-built for AI agent skills. It runs 7 checks in under 1 second and produces a quantitative risk score (0-100) instead of a subjective "looks fine."
Install it:
bash
curl -fsSL https://panguard.ai/api/install | bashAudit any skill:
bash
panguard audit skill ./path/to/skillThe 7 Security Checks
### 1. Manifest Validation
Verifies the SKILL.md frontmatter has required fields (name, description), valid YAML structure, and proper metadata formatting. A malformed manifest is often the first sign of a hastily constructed malicious skill.
### 2. Prompt Injection Detection
We maintain 11 regex patterns that catch the most common prompt injection techniques:
- --**Identity override**: "you are now", "act as", "pretend to be"
- --**Instruction hijacking**: "ignore previous instructions", "disregard system prompt"
- --**Jailbreak patterns**: "DAN", "do anything now", "bypass safety"
- --**Hidden directives**: HTML comments containing "ignore", "override", "inject"
- --**System prompt manipulation**: attempts to inject `<|system|>` or `<<SYS>>` tokens
### 3. Hidden Unicode Detection
This is the attack most humans miss entirely. Zero-width characters (U+200B, U+200C, U+200D), right-to-left overrides (U+202E), and other invisible Unicode can hide malicious instructions that are literally invisible when reading the file.
// This looks innocent:
Hello world
// But actually contains:
Hello\u200B\u200D[hidden payload]\u200BworldPanguard detects all 15 categories of hidden Unicode characters and reports their exact position.
### 4. Encoded Payload Detection
Skills sometimes hide malicious code in Base64 blocks. Panguard extracts all Base64 strings longer than 40 characters, decodes them, and checks for suspicious keywords: `eval`, `exec`, `subprocess`, `child_process`, `curl`, `wget`.
### 5. Tool Poisoning Detection
Scans for dangerous command patterns:
- --**Privilege escalation**: `sudo`, `chmod 777`, `chmod u+s`
- --**Reverse shells**: `nc -e`, `bash -i >& /dev/tcp/`, `mkfifo`, `socat exec`
- --**Remote code execution**: `curl ... | bash`, `wget ... | sh`
- --**Credential theft**: `printenv | curl`, accessing `~/.ssh`, `.env`, `.aws/`
- --**Destructive operations**: `rm -rf /`, `rm -rf ~`
### 6. Code Security (SAST + Secrets)
Beyond the SKILL.md itself, the auditor scans all files in the skill directory using Panguard Scan's SAST engine. This catches hardcoded API keys, AWS credentials, private keys, and common code vulnerabilities.
### 7. Permission & Dependency Analysis
Evaluates declared permissions against the skill's stated purpose. A weather skill requesting filesystem write access? That is a red flag. Dependencies are cross-referenced for known security issues.
Risk Scoring
Each finding carries a weight based on severity:
| Severity | Weight | Example |
|----------|--------|---------|
| Critical | 25 | Reverse shell, prompt injection with system prompt |
| High | 15 | Privilege escalation, credential theft |
| Medium | 5 | Suspicious but ambiguous patterns |
| Low | 1 | Minor style issues |
Weights are summed and capped at 100. The final score maps to a risk level:
- --**0-14 LOW**: Safe to install after quick review
- --**15-39 MEDIUM**: Review findings before installing
- --**40-69 HIGH**: Requires thorough manual review
- --**70-100 CRITICAL**: Do NOT install
How to Integrate with Your Agent
You can use Panguard Skill Auditor as a pre-install gate in your agent pipeline:
bash
# Bash: block if HIGH or CRITICAL
RISK=$(panguard audit skill "$SKILL_PATH" --json | jq -r '.riskLevel')
if [ "$RISK" = "HIGH" ] || [ "$RISK" = "CRITICAL" ]; then
echo "Blocked: $RISK risk skill"
exit 1
fiOr use it programmatically in TypeScript:
typescript
import { auditSkill } from '@panguard-ai/panguard-skill-auditor';
const report = await auditSkill('./skills/untrusted-skill');
if (report.riskLevel === 'CRITICAL' || report.riskLevel === 'HIGH') {
console.error(`Blocked: ${report.riskScore}/100 risk`);
process.exit(1);
}
console.log(`Safe: ${report.riskScore}/100`);Real Example: Catching a Malicious Skill
Here is what a scan looks like when it catches something:
PANGUARD SKILL AUDIT REPORT
============================
Skill: suspicious-helper
Risk Score: 72/100
Risk Level: CRITICAL
FINDINGS:
[CRITICAL] Prompt injection: ignore previous instructions
SKILL.md:42
[CRITICAL] Reverse shell pattern detected
SKILL.md:87 - "bash -i >& /dev/tcp/..."
[HIGH] Environment variable exfiltration
SKILL.md:23 - "printenv | curl..."
VERDICT: DO NOT INSTALLAvailable on OpenClaw Marketplace
Panguard Skill Auditor is available as an OpenClaw skill. Install it directly into your agent:
bash
# From OpenClaw marketplace
claw install panguard-ai/panguard-skill-auditorOr use the standalone CLI for CI/CD pipelines.
What Is Next
We are working on AI-powered analysis (using LLM reasoning to catch novel attack patterns), community threat feeds (crowdsourced malicious skill signatures), and a hosted API so you can audit skills without installing anything locally.
Get Started
bash
curl -fsSL https://panguard.ai/api/install | bash
panguard audit skill ./my-skillFree forever for the Community plan. Full API access starts at $9/month.