Summary
Add a command (waza coverage or waza eval coverage) that generates a coverage matrix showing which skills have evals, which graders are used, and where gaps exist.
Motivation
As the number of skills grows, it becomes hard to track which skills have adequate eval coverage and which are untested. A coverage grid provides a quick visual summary — like a test coverage report but for eval suites.
Proposed Implementation
Command: waza coverage [--format text|json|markdown]
Scan the project for:
- All skills (from skills/ and .github/skills/)
- All eval.yaml files (from evals/ and --discover paths)
- Map skills → eval tasks → graders
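The scan-and-map step could look roughly like the sketch below. It is not the proposed implementation: it assumes each skill is a subdirectory of skills/ or .github/skills/, and it assumes eval.yaml files have already been parsed into dicts with hypothetical keys "skill", "tasks", and "graders" (the real eval.yaml schema is not specified in this issue). The function names are illustrative as well.

```python
from pathlib import Path

# Skill locations named in the proposal.
SKILL_DIRS = ("skills", ".github/skills")


def discover_skills(root):
    """Return sorted skill names, assuming one subdirectory per skill."""
    names = set()
    for rel in SKILL_DIRS:
        base = Path(root) / rel
        if base.is_dir():
            names.update(p.name for p in base.iterdir() if p.is_dir())
    return sorted(names)


def map_skills_to_graders(skills, evals):
    """Map each skill to its task count and the grader types it uses.

    `evals` is a list of already-parsed eval.yaml documents. The keys
    "skill", "tasks", and "graders" are a hypothetical schema.
    """
    coverage = {name: {"tasks": 0, "graders": set()} for name in skills}
    for doc in evals:
        entry = coverage.get(doc.get("skill"))
        if entry is None:
            continue  # eval targets a skill that was not discovered
        for task in doc.get("tasks", []):
            entry["tasks"] += 1
            entry["graders"].update(task.get("graders", []))
    return coverage
```

Skills that appear in no eval keep a task count of 0 and an empty grader set, which is what lets the report flag them as gaps.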
Output (markdown format)
📊 Eval Coverage Grid
| Skill | Tasks | Graders | Coverage |
|-------|-------|---------|----------|
| azure-deploy | 5 | prompt, file, text | ✅ Full |
| azure-search | 2 | prompt | ⚠️ Partial |
| azure-storage | 0 | — | ❌ None |
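A grid like the one above could be rendered from a skill-to-coverage map as sketched here. The issue does not define what separates Full from Partial, so the rule below (at least two distinct grader types) is purely an assumption for illustration, as are the function names and the input shape.

```python
def classify(tasks, graders, min_graders=2):
    """Return a coverage label. The min_graders threshold is hypothetical."""
    if tasks == 0:
        return "❌ None"
    if len(graders) >= min_graders:
        return "✅ Full"
    return "⚠️ Partial"


def render_markdown(coverage):
    """Render a {skill: {"tasks": int, "graders": set}} map as a table."""
    lines = [
        "| Skill | Tasks | Graders | Coverage |",
        "|-------|-------|---------|----------|",
    ]
    for skill in sorted(coverage):
        entry = coverage[skill]
        # Sort grader names so the output is deterministic.
        graders = ", ".join(sorted(entry["graders"])) or "—"
        label = classify(entry["tasks"], entry["graders"])
        lines.append(f"| {skill} | {entry['tasks']} | {graders} | {label} |")
    return "\n".join(lines)
```

Grader names are sorted alphabetically here, so the column order can differ from the hand-written example above.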
JSON output for CI integration
{
  "total_skills": 24,
  "covered": 18,
  "partial": 4,
  "uncovered": 2,
  "coverage_pct": 75.0
}
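The summary fields can be derived from per-skill statuses, as in this minimal sketch. In the example above, coverage_pct equals covered / total_skills (18 / 24 = 75.0), so the sketch counts only fully covered skills toward the percentage. The status labels "full", "partial", and "none" and the function name are assumptions, not part of the proposal.

```python
import json


def summarize(statuses):
    """Build the JSON summary from a list of "full"/"partial"/"none" labels."""
    total = len(statuses)
    covered = statuses.count("full")
    return {
        "total_skills": total,
        "covered": covered,
        "partial": statuses.count("partial"),
        "uncovered": statuses.count("none"),
        # Only fully covered skills count toward the percentage.
        "coverage_pct": round(100.0 * covered / total, 1) if total else 0.0,
    }


# Reproduce the numbers from the example output.
statuses = ["full"] * 18 + ["partial"] * 4 + ["none"] * 2
print(json.dumps(summarize(statuses), indent=2))
```

A CI job could read coverage_pct from this output and fail the build when it drops below a chosen threshold.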
Acceptance Criteria
- waza coverage command scans skills and evals
- Supports --discover to find evals across project layouts