feat: Schema versioning policy and migration tooling for public artifacts #368

Description

@spboyer

Problem

Multiple in-flight gap features mutate the public shapes of eval.yaml, results.json, transcripts, dashboard APIs, and (newly) snapshot.json. Without an explicit schema-version policy, every feature ships with a hidden risk of breaking older evals, baselines, and CI configs.

Examples already on the table:

If we don't pin this down now, comparing a baseline results.json against a current one across a waza version bump will silently give wrong answers instead of failing loudly.

Proposal

Adopt a single, documented schema-version policy for all public artifacts.

Versioned artifacts

| Artifact | Field | Owner |
|---|---|---|
| eval.yaml | schemaVersion (top-level) | internal/models/spec.go |
| results.json | schemaVersion | internal/models/outcome.go |
| snapshot.json | schemaVersion | new (#367) |
| Dashboard/SSE event envelope | schemaVersion | web/, #178 |
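
As a rough illustration of how a reader could look at the version before committing to a full parse, here is a minimal sketch for the JSON artifacts. It assumes schemaVersion is stored as a string; `PeekSchemaVersion`, `versionHeader`, and `ErrNoSchemaVersion` are hypothetical names, not existing waza code, and the `runs` field in the sample payload is made up.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// versionHeader captures only the version field. encoding/json skips every
// other field, so the rest of the payload shape does not matter here.
type versionHeader struct {
	SchemaVersion string `json:"schemaVersion"`
}

// ErrNoSchemaVersion is a hypothetical sentinel for artifacts written
// before the field existed.
var ErrNoSchemaVersion = errors.New("artifact has no schemaVersion field")

// PeekSchemaVersion returns the top-level schemaVersion of a JSON artifact.
func PeekSchemaVersion(data []byte) (string, error) {
	var h versionHeader
	if err := json.Unmarshal(data, &h); err != nil {
		return "", fmt.Errorf("reading schemaVersion: %w", err)
	}
	if h.SchemaVersion == "" {
		return "", ErrNoSchemaVersion
	}
	return h.SchemaVersion, nil
}

func main() {
	v, err := PeekSchemaVersion([]byte(`{"schemaVersion": "1.2", "runs": []}`))
	fmt.Println(v, err)
}
```

How a missing field should be treated (implicit 1.0, or a hard error) is a policy choice this sketch leaves to the caller.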

Versioning rules

  • Semver-shaped: MAJOR.MINOR (no patch — schemas don't have hotfixes).
  • MINOR bumps are backward-compatible additions (new optional fields). Readers ignore unknown fields.
  • MAJOR bumps are breaking. Readers must refuse with a clear error and a pointer to a migration command.
  • Unknown-field warnings (not errors) are on by default for MINOR drift, i.e. when a file carries fields from a newer MINOR than the reader knows.
  • waza migrate <file> command for explicit migrations across MAJOR boundaries.
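
The rules above could be enforced by a small check along these lines. This is a sketch under assumptions: `Version`, `ParseVersion`, and `Check` are hypothetical names, and the message wording is illustrative only.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// Version is a MAJOR.MINOR schema version (no patch component).
type Version struct{ Major, Minor int }

// ParseVersion parses strings such as "1.3" and rejects anything else,
// including three-part versions like "1.2.3".
func ParseVersion(s string) (Version, error) {
	majorStr, minorStr, ok := strings.Cut(s, ".")
	if !ok {
		return Version{}, fmt.Errorf("schemaVersion %q: want MAJOR.MINOR", s)
	}
	major, err1 := strconv.Atoi(majorStr)
	minor, err2 := strconv.Atoi(minorStr)
	if err1 != nil || err2 != nil || major < 0 || minor < 0 {
		return Version{}, fmt.Errorf("schemaVersion %q: want MAJOR.MINOR", s)
	}
	return Version{Major: major, Minor: minor}, nil
}

// Check applies the policy: refuse across MAJOR, warn on a newer MINOR.
// The warning is empty when the reader fully understands the file.
func Check(reader, file Version, path string) (warning string, err error) {
	if file.Major != reader.Major {
		return "", fmt.Errorf(
			"%s: schema %d.%d is incompatible with reader %d.%d; run `waza migrate %s`",
			path, file.Major, file.Minor, reader.Major, reader.Minor, path)
	}
	if file.Minor > reader.Minor {
		return fmt.Sprintf(
			"%s: schema %d.%d is newer than reader %d.%d; unknown fields are ignored",
			path, file.Major, file.Minor, reader.Major, reader.Minor), nil
	}
	return "", nil
}

func main() {
	reader := Version{Major: 1, Minor: 2}
	for _, s := range []string{"1.0", "1.3", "2.0"} {
		file, err := ParseVersion(s)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		warning, err := Check(reader, file, "eval.yaml")
		fmt.Printf("file %s: warning=%q err=%v\n", s, warning, err)
	}
}
```

Returning the warning as a value rather than logging it keeps the check pure, so callers decide whether to print it, count it, or suppress it.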

Compatibility tests

  • internal/validation/ ships golden fixtures for each prior schema version.
  • CI test: every reader must parse every prior MINOR within the same MAJOR.
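
One possible shape for that CI check is a helper that runs a reader over every fixture and collects the failures. The sketch below uses an in-memory `fstest.MapFS` so it runs standalone. The directory layout (`v1.0/results.json`), the `durationMs` field, and the names `Reader`, `CheckFixtures`, `readResultsV1`, and `sampleFixtures` are all assumptions for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io/fs"
	"strings"
	"testing/fstest"
)

// Reader is a stand-in for a real artifact parser.
type Reader func(data []byte) error

// CheckFixtures runs reader over every file in fsys and returns the
// failures keyed by path. An empty map means every fixture parsed.
func CheckFixtures(fsys fs.FS, reader Reader) (map[string]error, error) {
	failures := map[string]error{}
	err := fs.WalkDir(fsys, ".", func(path string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		if d.IsDir() {
			return nil
		}
		data, err := fs.ReadFile(fsys, path)
		if err != nil {
			return err
		}
		if rerr := reader(data); rerr != nil {
			failures[path] = rerr
		}
		return nil
	})
	return failures, err
}

// readResultsV1 is a toy reader that accepts any MAJOR 1 results file.
func readResultsV1(data []byte) error {
	var doc struct {
		SchemaVersion string `json:"schemaVersion"`
	}
	if err := json.Unmarshal(data, &doc); err != nil {
		return err
	}
	if !strings.HasPrefix(doc.SchemaVersion, "1.") {
		return fmt.Errorf("unsupported schemaVersion %q", doc.SchemaVersion)
	}
	return nil
}

// sampleFixtures mimics one golden fixture per prior MINOR.
func sampleFixtures() fstest.MapFS {
	return fstest.MapFS{
		"v1.0/results.json": &fstest.MapFile{Data: []byte(`{"schemaVersion":"1.0"}`)},
		"v1.1/results.json": &fstest.MapFile{Data: []byte(`{"schemaVersion":"1.1","durationMs":12}`)},
	}
}

func main() {
	failures, err := CheckFixtures(sampleFixtures(), readResultsV1)
	fmt.Println(len(failures), err)
}
```

In a real `_test.go` file the same helper could be pointed at the fixtures on disk with `os.DirFS`, failing the test once per entry in the returned map.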

Why this matters

Eval suites are long-lived. Authors check in eval.yaml and baseline results.json and expect them to keep working. Without versioning, a routine waza upgrade silently changes meaning — the worst kind of break.

Acceptance criteria

  • schemaVersion field added to eval.yaml, results.json, and any new artifact (snapshot.json, gate output, SSE envelope).
  • Reader logic emits warnings on unknown fields within a MAJOR, errors across MAJOR.
  • waza migrate command stubbed (no-op for v1 → v1; real migration when first MAJOR bump happens).
  • Golden fixtures for each prior MINOR live in internal/validation/testdata/.
  • Compatibility tests in CI assert every reader handles every prior MINOR.
  • Policy documented in site/ with a "schema changes" changelog page.
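
The "stubbed" migrate criterion could be met with a step registry that is empty today and grows one entry per MAJOR bump. This is only a sketch: `Migration`, `migrations`, and `Migrate` are hypothetical names, and operating on a decoded `map[string]any` is an assumption rather than a decision made in this issue.

```go
package main

import "fmt"

// Migration upgrades a decoded artifact from one MAJOR to the next.
type Migration func(doc map[string]any) (map[string]any, error)

// migrations maps a source MAJOR to the step that produces MAJOR+1.
// It stays empty until the first breaking change ships.
var migrations = map[int]Migration{}

// Migrate walks doc from fromMajor up to toMajor one step at a time.
// When the two are equal the loop never runs, so v1 -> v1 is a no-op.
func Migrate(doc map[string]any, fromMajor, toMajor int) (map[string]any, error) {
	if fromMajor > toMajor {
		return nil, fmt.Errorf("cannot downgrade schema from v%d to v%d", fromMajor, toMajor)
	}
	for m := fromMajor; m < toMajor; m++ {
		step, ok := migrations[m]
		if !ok {
			return nil, fmt.Errorf("no migration registered for v%d -> v%d", m, m+1)
		}
		var err error
		if doc, err = step(doc); err != nil {
			return nil, fmt.Errorf("migrating v%d -> v%d: %w", m, m+1, err)
		}
	}
	return doc, nil
}

func main() {
	doc := map[string]any{"schemaVersion": "1.0"}
	out, err := Migrate(doc, 1, 1)
	fmt.Println(out, err)
}
```

Chaining single-MAJOR steps means a future v1 to v3 migration needs no dedicated code, only the v1 to v2 and v2 to v3 entries.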

Related
