Skill Auditor


A Claude Code skill that audits other Claude Code skills. Give it any SKILL.md and get a scored quality report with specific fixes — not just flags.


🎯 What it does

Scores a skill across 7 dimensions (0–10 each), identifies critical issues and improvements, and offers to apply fixes directly.

| Score | Verdict |
|---------|---------|
| > 8.0 | ✅ Production-ready |
| 6.0–8.0 | ⚠️ Reliable, improvements optional |
| < 6.0 | ❌ Needs significant work before use |
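The banding above is simple to reproduce; as an illustration only (the `verdict` helper and the use of `awk` for the float comparison are this sketch's assumptions, not part of the skill), it might look like:

```shell
#!/bin/sh
# Illustrative only: map a mean dimension score to the verdict bands above.
verdict() {
  awk -v s="$1" 'BEGIN {
    if (s > 8.0)       print "Production-ready"
    else if (s >= 6.0) print "Reliable, improvements optional"
    else               print "Needs significant work before use"
  }'
}

verdict 8.2   # Production-ready
```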

🚀 Quick Start

Installation

mkdir -p ~/.claude/skills/skill-auditor
curl -o ~/.claude/skills/skill-auditor/SKILL.md \
  https://raw.githubusercontent.com/b1rdmania/claude-skill-auditor/main/SKILL.md

Usage

Just ask Claude to audit a skill:

"Audit this skill: ~/.claude/skills/my-skill/SKILL.md"
"Check my skill before I publish it"
"Will Claude follow this skill?"
"Score this skill"

After the audit, Claude will ask if you want the fixes applied directly to the file.


📊 How it works

flowchart TD
    A([Give it a skill]) --> A1{Input type}
    A1 -->|File path| B[Read SKILL.md]
    A1 -->|Skill name| B2[Look up in ~/.claude/skills/]
    A1 -->|Pasted content| B3[Use directly]
    B & B2 & B3 --> C

    C[Run 7-dimension audit] --> D

    subgraph D [7 Dimensions]
        D1[1. Frontmatter Quality]
        D2[2. Instruction Followability]
        D3[3. DRY — No Repetition]
        D4[4. KISS — No Over-Engineering]
        D5[5. Dead Content]
        D6[6. Tool & Dependency Risk]
        D7[7. Structure & Navigation]
    end

    D --> E[Score each 0–10\nCalculate mean]
    E --> F[Output: table + critical issues\n+ improvements + what's working]
    F --> G{Overall score}
    G -->|above 8| H[Production-ready]
    G -->|6–8| I[Reliable, fixes optional]
    G -->|below 6| J[Needs significant work]
    F --> K([Offer to apply fixes])

📋 The 7 Audit Dimensions

radar
    title Audit Dimensions
    "Frontmatter" : 10
    "Followability" : 10
    "DRY" : 10
    "KISS" : 10
    "Dead Content" : 10
    "Dependency Risk" : 10
    "Structure" : 10
| Dimension | What it checks | Deductions |
|-----------|----------------|------------|
| 1. Frontmatter Quality | name, description completeness, trigger phrase count (need 5+) | -2 per missing check |
| 2. Instruction Followability | Imperative language, passive deferrals, code-fence output traps, ambiguous placeholders | -2 per deferral, -3 per code-fence trap, -1 per placeholder |
| 3. DRY — No Repetition | Repeated instructions, paraphrased re-statements, redundant sections | -1 per repeat, -2 per section |
| 4. KISS — No Over-Engineering | Mode count (>4 = split), conditional complexity, premature flexibility | -2 per excess mode, -1 per complex tree |
| 5. Dead Content | Unreferenced sections, placeholder content, tips that restate criteria | -1 per dead section |
| 6. Tool & Dependency Risk | MCP requirements, shell binary assumptions, hardcoded URLs, assumed context | -2 per critical risk, -1 per warning |
| 7. Structure & Navigation | Section headings, tables vs prose, linear flow, wall-of-text detection | -1 per structure issue |
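To make the deduction style concrete, here is a rough heuristic sketch of one dimension-1 check (the sample description and the comma-split counting are this example's assumptions, not the auditor's actual parsing logic):

```shell
#!/bin/sh
# Rough heuristic, not the skill's real logic: count comma-separated trigger
# phrases in a frontmatter description and apply the "need 5+" rule.
description='audit this skill, check my skill, score this skill, review SKILL.md, will Claude follow this'
count=$(printf '%s\n' "$description" | tr ',' '\n' | grep -c .)
if [ "$count" -ge 5 ]; then
  echo "trigger phrases: $count (ok)"
else
  echo "trigger phrases: $count (-2 deduction)"
fi
```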

💡 Example Output

# Skill Audit Report: income-opportunity-scorer

Overall Score: 8.2/10 ✅ Production-ready

## Scores by Dimension

| Dimension | Score | Status |
|-----------|-------|--------|
| Frontmatter Quality | 10/10 | ✅ Perfect |
| Instruction Followability | 7/10 | ⚠️ Issues found |
| DRY — No Repetition | 9/10 | ✅ Clean |
| KISS — No Over-Engineering | 8/10 | ✅ Good |
| Dead Content | 9/10 | ✅ Minimal waste |
| Tool & Dependency Risk | 7/10 | ⚠️ Warnings |
| Structure & Navigation | 8/10 | ✅ Well-organized |

## 🔴 Critical Issues (Fix before publishing)

1. **Code-fence output trap** (Line 87): Output template wrapped in triple backticks — Claude will render it as a code block instead of formatted output. Remove the code fence.

2. **Passive deferral** (Line 45): "Ask the user which mode to use" — skill should infer mode from context, not defer to user.

## 🟡 Improvements (Optional)

1. **Repeated instruction** (Lines 23, 67): "Explain the criterion in context" stated twice — merge into a single note.

2. **Dependency assumption** (Line 102): Assumes Read tool is available without checking — add fallback or document requirement.

## ✅ What's Working

- Strong frontmatter with 8 trigger phrases
- Clear imperative language throughout
- Good use of tables for scoring criteria
- Minimal mode complexity (3 modes, well-justified)
- Linear flow, easy to navigate

## Next Steps

Apply fixes? [Y/n]

πŸ› οΈ What it catches

The most common skill failure modes:

1. Code-fence output traps

❌ BAD: Output template in code fence

Output format:
```
Name: {name}
Score: {score}
```

Claude renders this as a literal code block instead of filling in the variables.

✅ GOOD: Output template without code fence

Output format:

Name: {name}
Score: {score}

2. Passive deferrals

❌ BAD: Ask the user which mode to use
✅ GOOD: Infer mode from context — single idea = auto-score, multiple = batch

3. Duplicate modes

❌ BAD: "portfolio audit" and "batch" as separate modes (same thing, different framing)
✅ GOOD: One "batch" mode with examples for both use cases

4. Repeated instructions

❌ BAD: Same rule in "Overview" and "Execution steps"
✅ GOOD: State once, reference from other sections

5. Ambiguous placeholders

❌ BAD: {variable} or [placeholder] syntax that Claude renders literally
✅ GOOD: Clear template format or explicit variable substitution instructions
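As a hypothetical pre-check for trap #5, you could grep a SKILL.md for placeholder syntax before running a full audit. The `check_placeholders` name and the regex are this sketch's own crude heuristic (it will also match ordinary markdown link text), not the auditor's detection logic:

```shell
#!/bin/sh
# Hypothetical heuristic: flag {curly} or [bracket] placeholders, with line
# numbers, in the file given as the first argument.
check_placeholders() {
  grep -nE '[{][A-Za-z_]+[}]|\[[a-z_]+\]' "$1"
}

# Usage: check_placeholders ~/.claude/skills/my-skill/SKILL.md
```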

🎨 Features

  • ✅ 7-dimension framework — Covers frontmatter, followability, DRY, KISS, dead content, dependencies, structure
  • ✅ Explicit scoring rules — Deduction per issue type, not subjective judgment
  • ✅ Actionable fixes — Specific line numbers and replacement suggestions
  • ✅ Auto-apply option — Claude can fix issues directly if you approve
  • ✅ Three input modes — File path, skill name lookup, or pasted content
  • ✅ Production-grade threshold — 8.0+ = ready to publish, 6.0–8.0 = good enough, <6.0 = needs work

📚 Use Cases

  • Before publishing — Audit your skill before sharing it on GitHub or a skill marketplace
  • Quality gate — Check skills before installing from untrusted sources
  • Learning tool — Understand what makes a skill followable vs ignored
  • Bulk audit — Score all installed skills to find low-quality ones
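The bulk-audit case amounts to feeding every installed SKILL.md to Claude. A minimal sketch (the `list_audit_prompts` helper is illustrative, and the path layout assumes the default `~/.claude/skills/<name>/SKILL.md` convention):

```shell
#!/bin/sh
# Illustrative only: print one audit prompt per installed skill.
list_audit_prompts() {
  for f in "$1"/*/SKILL.md; do
    [ -e "$f" ] || continue    # skip if the glob matched nothing
    echo "Audit this skill: $f"
  done
}

list_audit_prompts "$HOME/.claude/skills"
```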

🤝 Contributing

Found a common skill failure mode not covered? Have ideas for new dimensions or scoring improvements?

  1. Fork the repo
  2. Create a feature branch (git checkout -b feature/new-dimension)
  3. Add your changes to SKILL.md
  4. Open a PR with examples

📄 License

MIT License — see LICENSE for details.



Built with Claude Code 🚀

About

Claude Code skill for auditing and scoring other Claude Code skills. Yeah I know it's a little bit human centipede.
