Skip to content

feat(aprender-contracts): add actually_verified evidence field for kani_harnesses (lifts pv score D3 0.9 → 1.0) #1595

@noahgift

Description

@noahgift

Context

pv score currently caps the D3 (Kani) axis at 0.9 for harnesses with strategy: bounded_int, regardless of whether they actually pass cargo kani. The 0.9 figure is a static-readiness signal ("harness declared and implemented"); the runtime witness is whatever CI does with cargo kani.

apr-cookbook just landed PR #421 wiring kani-gate into CI for all 108 cookbook harnesses (39 base contracts + 69 architecture-demos). The gate runs cargo kani on every PR and fails red if any harness fails. But pv score doesn't read CI evidence, so the score still reads 0.9 even when the gate is green.

Proposed change

Add an actually_verified: bool field on each kani_harnesses[] entry, sourced from a kani-results.json artifact emitted by cargo kani. When actually_verified: true for every harness in a contract, pv score bumps D3 to 1.0 (or whatever the exhaustive strategy weight is, since "verified by Kani" is functionally exhaustive over the bounded domain).

Schema sketch

kani_harnesses:
  - id: KANI-PIPELINE-001
    obligation: "Pipeline composition is total"
    bound: 4
    strategy: bounded_int
    harness: "kani_harnesses::arch_resolution_pipeline_total"
    actually_verified: true   # NEW — set by `pv kani-evidence import`
    verified_at: "2026-05-09T13:42:00Z"  # NEW — optional, for staleness checks

Reading the evidence

cargo kani --output-format=json (or the --cbmc-args / --coverage outputs) emits a structured result per harness. A new pv kani-evidence import <kani-results.json> would update the YAML in-place, setting actually_verified: true for harnesses that passed.

Score change

In crates/aprender-contracts/src/scoring/mod.rs, compute_kani_coverage would short-circuit:

if h.actually_verified {
    return 1.0;  // or the existing exhaustive weight
}
match h.strategy.as_ref() {
    Some(KaniStrategy::Exhaustive) => 1.0,
    Some(KaniStrategy::BoundedInt) => 0.9,
    ...
}

Acceptance criteria

  • Schema field actually_verified: bool added to kani_harnesses[] entries (default false)
  • pv kani-evidence import <kani-results.json> command (or equivalent) updates contracts in-place
  • pv score reads actually_verified and bumps D3 to 1.0 when set
  • Documentation explains the static-readiness vs runtime-verification split (cookbook's docs/specifications/kani-ci.md describes this framing)
  • cookbook can remove the "0.9 D3 ceiling" backlog note from architecture-demos/tickets.md once consumed

Unblock impact

  • All 24 architecture-demos contracts (108 cookbook contracts total) move from 0.98 Grade A to 1.0 perfect score, honestly attributed to actual Kani runtime verification
  • Provides a recipe for other repos that ship Kani harnesses — write the harnesses, wire CI, import evidence, score reflects reality

Cookbook reference

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions