feat: spec verify command (closes #361) #385

Merged: spboyer merged 3 commits into main from spboyer-spec-verify-command on Jun 28, 2026

Conversation

@spboyer (Member) commented Jun 28, 2026

Summary

  • Add waza spec verify with deterministic SKILL.md requirement extraction and eval task coverage reporting
  • Add opt-in semantic matching, CI fail/warn modes, and human/JSON/GitHub Actions output
  • Document the workflow in README, PRD, and site docs with CI examples
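
For readers unfamiliar with the idea, deterministic requirement extraction can be sketched roughly as follows. This is only an illustration, not the parser in internal/specverify/parse.go: the `Requirement` struct, the `REQ-NNN` ID scheme, and the keyword rule are all assumptions made for this example.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// Requirement is a hypothetical shape for one extracted requirement.
// The real types live in internal/specverify/types.go and may differ.
type Requirement struct {
	ID   string // ID derived from document order, e.g. "REQ-001"
	Text string // requirement text with the list marker stripped
	Line int    // 1-based source line in SKILL.md
}

// extractRequirements treats every Markdown list item that contains a
// normative keyword (substring match) as a requirement. The rule is an
// assumption for illustration. Because it depends only on the input text,
// the same SKILL.md always yields the same IDs and line numbers.
func extractRequirements(skillMD string) []Requirement {
	keywords := []string{"must", "should", "never", "always"}
	var reqs []Requirement
	sc := bufio.NewScanner(strings.NewReader(skillMD))
	for line := 1; sc.Scan(); line++ {
		text := strings.TrimSpace(sc.Text())
		if !strings.HasPrefix(text, "- ") && !strings.HasPrefix(text, "* ") {
			continue // not a list item
		}
		text = strings.TrimSpace(text[2:])
		lower := strings.ToLower(text)
		for _, kw := range keywords {
			if strings.Contains(lower, kw) {
				reqs = append(reqs, Requirement{
					ID:   fmt.Sprintf("REQ-%03d", len(reqs)+1),
					Text: text,
					Line: line,
				})
				break
			}
		}
	}
	return reqs
}

func main() {
	doc := "# Code Explainer\n\n" +
		"- Must explain the code in plain language\n" +
		"- Supports Go and Python\n" +
		"- Should never modify the input file\n"
	for _, r := range extractRequirements(doc) {
		fmt.Printf("%s (line %d): %s\n", r.ID, r.Line, r.Text)
	}
}
```

Keeping the line number with each requirement is what lets a report point back at the exact place in SKILL.md that lacks coverage.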

Closes #361

Validation

  • /opt/homebrew/bin/go test ./...
  • /opt/homebrew/bin/golangci-lint run
  • cd site && PATH=/opt/homebrew/bin:$PATH npm run build
  • /opt/homebrew/bin/go run ./cmd/waza spec verify examples/code-explainer/SKILL.md examples/code-explainer/eval.yaml --format human
  • /opt/homebrew/bin/go run ./cmd/waza spec verify examples/code-explainer/SKILL.md examples/code-explainer/eval.yaml --format json
  • /opt/homebrew/bin/go run ./cmd/waza spec verify examples/code-explainer/SKILL.md examples/code-explainer/eval.yaml --format github-actions --fail

Copilot AI review requested due to automatic review settings June 28, 2026 11:17

Copilot AI (Contributor) left a comment

Pull request overview

This PR adds a new waza spec verify CLI workflow that deterministically extracts requirements from SKILL.md, computes eval task coverage (with optional LLM-assisted semantic matching), and reports results in human/JSON/GitHub Actions formats. It extends the evaluation tooling in waza by making skill-contract drift visible and CI-gateable, aligning with issue #361’s “spec-to-test” verification goal.
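
A minimal sketch of what deterministic requirement-to-task coverage with a CI gate could look like is shown here. It assumes a simple shared-keyword rule; the helper names (`tokens`, `coverage`, `uncovered`, `exitCode`) and the matching threshold are hypothetical and are not taken from internal/specverify/coverage.go.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// tokens lowercases s and returns its distinct ASCII alphanumeric words
// longer than three characters.
func tokens(s string) map[string]bool {
	isSep := func(r rune) bool {
		return !(r >= 'a' && r <= 'z') && !(r >= '0' && r <= '9')
	}
	set := map[string]bool{}
	for _, w := range strings.FieldsFunc(strings.ToLower(s), isSep) {
		if len(w) > 3 {
			set[w] = true
		}
	}
	return set
}

// coverage maps each requirement ID to the sorted names of eval tasks whose
// description shares at least minShared tokens with the requirement text.
// The keyword-overlap rule is an assumption made for this sketch.
func coverage(reqs, tasks map[string]string, minShared int) map[string][]string {
	out := make(map[string][]string, len(reqs))
	for id, text := range reqs {
		reqTok := tokens(text)
		var matched []string
		for name, desc := range tasks {
			shared := 0
			for w := range tokens(desc) {
				if reqTok[w] {
					shared++
				}
			}
			if shared >= minShared {
				matched = append(matched, name)
			}
		}
		sort.Strings(matched)
		out[id] = matched
	}
	return out
}

// uncovered returns the sorted IDs of requirements with no matching task.
func uncovered(cov map[string][]string) []string {
	var ids []string
	for id, names := range cov {
		if len(names) == 0 {
			ids = append(ids, id)
		}
	}
	sort.Strings(ids)
	return ids
}

// exitCode models a fail/warn switch: gaps fail the run only in fail mode.
func exitCode(missing []string, fail bool) int {
	if fail && len(missing) > 0 {
		return 1
	}
	return 0
}

func main() {
	reqs := map[string]string{
		"REQ-001": "Must explain the code in plain language",
		"REQ-002": "Should never modify the input file",
	}
	tasks := map[string]string{
		"explain-basic": "Explain a short function using plain language",
	}
	missing := uncovered(coverage(reqs, tasks, 2))
	fmt.Println("uncovered:", missing, "exit code in fail mode:", exitCode(missing, true))
}
```

Sorting both the task names and the uncovered IDs keeps the report stable across runs even though Go map iteration order is randomized.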

Changes:

  • Add spec verify command and internal/specverify package for parsing SKILL.md requirements and mapping them to eval task coverage (deterministic + optional semantic).
  • Add tests covering parsing, deterministic coverage, CSV-backed tasks, and CLI behaviors.
  • Update README, PRD, and site docs to document the new command and CI usage.
| File | Description |
| --- | --- |
| site/src/content/docs/reference/cli.mdx | Adds CLI reference docs for waza spec verify flags and examples. |
| site/src/content/docs/guides/spec-verify.mdx | New guide explaining spec verification, worked example, and CI snippet. |
| site/src/content/docs/guides/ci-cd.mdx | Adds a CI/CD section describing spec coverage checks with GitHub Actions annotations. |
| site/astro.config.mjs | Adds “Spec Verification” to the site navigation. |
| README.md | Documents waza spec verify usage and flags in the main README. |
| docs/PRD.md | Adds PRD entry for verifying eval coverage against SKILL.md requirements. |
| internal/specverify/types.go | Defines report/requirement/task types for spec verification output. |
| internal/specverify/parse.go | Implements deterministic SKILL.md parsing into requirement IDs + source spans. |
| internal/specverify/semantic.go | Adds optional semantic matcher using an execution engine as judge. |
| internal/specverify/coverage.go | Implements eval task loading and requirement-to-task coverage computation. |
| internal/specverify/parse_test.go | Tests deterministic extraction + spans and validates against existing corpus files when present. |
| internal/specverify/coverage_test.go | Tests deterministic coverage, semantic response parsing, and CSV dataset task loading. |
| cmd/waza/root.go | Wires the new spec command into the root CLI. |
| cmd/waza/cmd_spec.go | Implements waza spec verify command, flags, output rendering, and semantic engine wiring. |
| cmd/waza/cmd_spec_test.go | Adds CLI-level tests for presence, JSON output, and fail mode behavior. |
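
As a rough illustration of the GitHub Actions output mentioned above, an uncovered requirement could be turned into a workflow-command annotation along these lines. The `::error`/`::warning` syntax and the `%25`, `%0D`, `%0A` message escaping are GitHub's documented workflow-command format; the function names, the message wording, and the mapping of fail mode to `error` are assumptions, not the rendering in cmd/waza/cmd_spec.go.

```go
package main

import (
	"fmt"
	"strings"
)

// escapeData escapes a workflow-command message so that percent signs and
// line breaks survive GitHub's single-line command parsing.
func escapeData(s string) string {
	return strings.NewReplacer("%", "%25", "\r", "%0D", "\n", "%0A").Replace(s)
}

// annotation renders one uncovered requirement as a workflow command.
// Assumption for this sketch: fail mode emits "error", otherwise "warning".
// The file path is assumed to contain no ',' or ':' characters, which would
// need property escaping.
func annotation(fail bool, file string, line int, id, text string) string {
	level := "warning"
	if fail {
		level = "error"
	}
	msg := fmt.Sprintf("%s has no covering eval task: %s", id, text)
	return fmt.Sprintf("::%s file=%s,line=%d::%s", level, file, line, escapeData(msg))
}

func main() {
	// Hypothetical requirement used only to demonstrate the output shape.
	fmt.Println(annotation(false, "examples/code-explainer/SKILL.md", 5,
		"REQ-002", "Should never modify the input file"))
}
```

When a workflow step prints such a line to stdout, GitHub attaches the message to the given file and line in the pull request view.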

Review details

  • Files reviewed: 15/15 changed files
  • Comments generated: 2
  • Review effort level: Low

Comment thread internal/specverify/coverage.go Outdated
Comment thread cmd/waza/cmd_spec.go
Copilot AI added 2 commits June 28, 2026 07:34
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@spboyer force-pushed the spboyer-spec-verify-command branch from 14543f2 to d84d807 on June 28, 2026 11:35
Copilot AI review requested due to automatic review settings June 28, 2026 11:35

Copilot AI (Contributor) left a comment

Review details

  • Files reviewed: 15/15 changed files
  • Comments generated: 4
  • Review effort level: Low

Comment thread internal/specverify/coverage.go
Comment thread cmd/waza/cmd_spec.go
Comment thread cmd/waza/cmd_spec.go
Comment thread cmd/waza/cmd_spec.go Outdated
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@spboyer merged commit f0770af into main on Jun 28, 2026
9 checks passed
@spboyer deleted the spboyer-spec-verify-command branch on June 28, 2026 11:48
Development

Successfully merging this pull request may close these issues.

feat: Natural-language requirements → executable evals (spec-to-test)

3 participants