feat: Eval coverage grid generator #82

Description

@spboyer

Summary

Add a command (waza coverage or waza eval coverage) that generates a coverage matrix showing which skills have evals, which graders are used, and where gaps exist.

Motivation

As the number of skills grows, it becomes hard to track which skills have adequate eval coverage and which are untested. A coverage grid provides a quick visual summary — like a test coverage report but for eval suites.

Proposed Implementation

Command: waza coverage [--format text|json|markdown]
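As a rough illustration of the command shape above, here is a minimal argument-parsing sketch in Python. It is not waza's actual CLI code: the `build_parser` helper is hypothetical, and defaulting to `text` is an assumption (the issue lists the three formats but does not name a default).

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring `waza coverage [--format text|json|markdown]`."""
    parser = argparse.ArgumentParser(prog="waza")
    subcommands = parser.add_subparsers(dest="command", required=True)

    coverage = subcommands.add_parser(
        "coverage", help="show which skills have evals and where gaps exist"
    )
    coverage.add_argument(
        "--format",
        choices=("text", "json", "markdown"),
        default="text",  # assumption: the issue does not state a default
        help="output format",
    )
    return parser


args = build_parser().parse_args(["coverage", "--format", "markdown"])
print(args.command, args.format)
```

Restricting `--format` with `choices` means an unsupported value is rejected at parse time rather than deep inside the renderer.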

Scan the project and build the mapping:

  1. Find all skills (from skills/ and .github/skills/)
  2. Find all eval.yaml files (from evals/ and --discover paths)
  3. Map skills → eval tasks → graders
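The three steps above could be sketched as follows. This is an illustration under stated assumptions, not the real implementation: it assumes each skill is a subdirectory of a skill root, treats `--discover` values as extra directories to search, and reads already-parsed eval specs whose keys (`skill`, `tasks`, `graders`, `type`) are hypothetical stand-ins for the actual eval.yaml schema.

```python
from pathlib import Path

# Skill roots named in the issue. Assumption: one subdirectory per skill.
SKILL_ROOTS = ("skills", ".github/skills")


def discover_skills(project: Path) -> list:
    """Return the sorted names of the subdirectories under each skill root."""
    names = set()
    for root in SKILL_ROOTS:
        base = project / root
        if base.is_dir():
            names.update(p.name for p in base.iterdir() if p.is_dir())
    return sorted(names)


def find_eval_files(project: Path, discover_paths=()) -> list:
    """Find eval.yaml files under evals/ plus any extra discovery directories."""
    found = set()
    for rel in ("evals", *discover_paths):
        base = project / rel
        if base.is_dir():
            found.update(base.rglob("eval.yaml"))
    return sorted(found)


def map_coverage(skills, eval_specs) -> dict:
    """Map each skill to its task count and the set of grader types used.

    `eval_specs` is a list of dicts parsed from eval.yaml files. The keys
    read here are hypothetical, not the real schema.
    """
    coverage = {name: {"tasks": 0, "graders": set()} for name in skills}
    for spec in eval_specs:
        entry = coverage.get(spec.get("skill"))
        if entry is None:
            continue  # the eval targets a skill that was not discovered
        entry["tasks"] += len(spec.get("tasks", []))
        entry["graders"].update(g["type"] for g in spec.get("graders", []))
    return coverage
```

Seeding the map with every discovered skill before reading any eval is what lets untested skills show up with zero tasks instead of being silently absent.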

Output (markdown format)

📊 Eval Coverage Grid
| Skill | Tasks | Graders | Coverage |
|-------|-------|---------|----------|
| azure-deploy | 5 | prompt, file, text | ✅ Full |
| azure-search | 2 | prompt | ⚠️ Partial |
| azure-storage | 0 | — | ❌ None |
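A table like the one above could be produced by a small renderer. The sketch below is only illustrative: `render_markdown`, the row tuple layout, and the `full`/`partial`/`none` status keys are hypothetical names, and the coverage status is taken as an input rather than computed here.

```python
# Labels copied from the example table in the issue.
STATUS_LABELS = {"full": "✅ Full", "partial": "⚠️ Partial", "none": "❌ None"}
NO_GRADERS = "\u2014"  # the dash shown for a skill with no graders


def render_markdown(rows) -> str:
    """Render (skill, task_count, grader_types, status) rows as a markdown table."""
    lines = [
        "| Skill | Tasks | Graders | Coverage |",
        "|-------|-------|---------|----------|",
    ]
    for skill, task_count, grader_types, status in rows:
        graders = ", ".join(grader_types) or NO_GRADERS
        lines.append(
            f"| {skill} | {task_count} | {graders} | {STATUS_LABELS[status]} |"
        )
    return "\n".join(lines)


rows = [
    ("azure-deploy", 5, ["prompt", "file", "text"], "full"),
    ("azure-search", 2, ["prompt"], "partial"),
    ("azure-storage", 0, [], "none"),
]
print(render_markdown(rows))
```

Keeping the grader types as an ordered list, rather than a set, makes the rendered column stable from run to run.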

JSON output for CI integration

{
  "total_skills": 24,
  "covered": 18,
  "partial": 4,
  "uncovered": 2,
  "coverage_pct": 75.0
}
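The summary fields could be derived from per-skill statuses roughly as below. In the example, 18 of 24 skills are covered and `coverage_pct` is 75.0, so this sketch assumes the percentage counts only fully covered skills; that reading is inferred from the example, not stated in the issue. The `summarize` helper and the status strings are hypothetical.

```python
import json


def summarize(statuses: dict) -> dict:
    """Build the CI summary from a {skill_name: status} mapping.

    Statuses are "full", "partial", or "none" (hypothetical names).
    """
    counts = {"full": 0, "partial": 0, "none": 0}
    for status in statuses.values():
        counts[status] += 1
    total = len(statuses)
    # Assumption: only fully covered skills count toward the percentage.
    pct = round(100.0 * counts["full"] / total, 1) if total else 0.0
    return {
        "total_skills": total,
        "covered": counts["full"],
        "partial": counts["partial"],
        "uncovered": counts["none"],
        "coverage_pct": pct,
    }


# Reproduce the example numbers: 18 full, 4 partial, 2 uncovered.
example = {f"skill-{i}": "full" for i in range(18)}
example.update({f"skill-{i}": "partial" for i in range(18, 22)})
example.update({f"skill-{i}": "none" for i in range(22, 24)})
print(json.dumps(summarize(example), indent=2))
```

A CI job could then fail the build when `coverage_pct` drops below a chosen threshold.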

Acceptance Criteria

  • waza coverage command scans skills and evals
  • Text, JSON, and markdown output formats
  • Identifies uncovered skills, partial coverage (few graders), and full coverage
  • Works with --discover to find evals across project layouts
  • Tests covering: no evals, partial, full coverage scenarios
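The three test scenarios in the last criterion could look something like this. The issue describes partial coverage only as "few graders", so the threshold of two distinct grader types is an assumption chosen to match the example table (three types is Full, one is Partial); `classify` and `FULL_MIN_GRADER_TYPES` are hypothetical names.

```python
import unittest

# Assumed threshold: the issue says only "few graders" for partial coverage.
FULL_MIN_GRADER_TYPES = 2


def classify(task_count: int, grader_types) -> str:
    """Return "none", "partial", or "full" for a single skill."""
    if task_count == 0:
        return "none"
    if len(set(grader_types)) < FULL_MIN_GRADER_TYPES:
        return "partial"
    return "full"


class ClassifyTests(unittest.TestCase):
    def test_no_evals(self):
        self.assertEqual(classify(0, []), "none")

    def test_partial_coverage(self):
        self.assertEqual(classify(2, ["prompt"]), "partial")

    def test_full_coverage(self):
        self.assertEqual(classify(5, ["prompt", "file", "text"]), "full")
```

Saved as a module, these cases can be run with `python -m unittest`.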

Metadata

Labels

  • enhancement (New feature or request)
  • priority:p1 (This sprint)
  • squad:linus (Assigned to Linus, Backend Developer)