Skip to content

Evaluate Hypothesis for vera test property-based input generation #170

@aallan

Description

@aallan

Summary

Bookmark — evaluate adopting Hypothesis as an input-generation backend for vera test, complementing the Z3-based generator that already ships (#79, v0.0.47). This issue tracks the decision about which Hypothesis-shaped integration to adopt; the active-enhancement work it would block on (input shrinking, cross-function scenarios, coverage-guided generation) is split out as #562.

What Hypothesis would add

hypothesis is already a dev dependency but not yet used. It would complement the Z3-based approach in three places where Z3 is structurally weak:

  • Hypothesis strategies for types Z3 cannot encodeString, Array<T>, ADT, nested structures. Z3's string and sequence theories are limited; Hypothesis's strategy combinators handle these cleanly.
  • Custom strategies derived from requires() clauses — a requires(string_length(@String.0) > 0 && string_length(@String.0) < 100) clause translates naturally to a text(min_size=1, max_size=99) strategy.
  • Automatic shrinking — Hypothesis's shrinking is mature and would address one half of vera test: advanced testing features (input shrinking, cross-function scenarios, coverage-guided generation) #562's "input shrinking" feature directly.

Why this is bookmarked

The decision is gated on:

  1. Reaching the Z3 ceiling. As long as Vera programs under test are dominated by Int/Nat/Bool arithmetic, Z3 input generation is fine. The motivating case is when contracts on String / Array / nested ADTs become common enough that Z3's blind spots show up in vera test output.
  2. Picking the integration shape. Three plausible shapes: (a) replace Z3 wholesale with Hypothesis; (b) Hypothesis for non-encodable types only, Z3 for the rest; (c) layered — Z3 for the requires() constraint, Hypothesis for the type. (b) and (c) are both reasonable; (a) is probably wrong because Z3's counterexamples for Int arithmetic are excellent.
  3. Project-health check. Hypothesis is mature and well-maintained, but worth re-checking at adoption time (current dev-dep version is pinned in pyproject.toml).

Action when the time comes

When vera test's Z3 backend hits its ceiling on a real Vera program (sustained "cannot generate inputs" warnings on String/Array/ADT contracts):

  • Re-check Hypothesis project health
  • Pick integration shape (a / b / c above)
  • Spike: add Hypothesis strategy for one type (e.g. String) behind a feature flag
  • Compare counterexample quality vs Z3 on existing contract failures
  • If wins are meaningful, expand to other types and roll behind a CLI flag (vera test --backend hypothesis)

Follow-up from / Related

Credit

Original issue created as a single tracker covering both the Hypothesis-integration decision and the active-feature work that would consume it. Split during a 2026-05-01 issue-label audit so the two pieces could be tracked at their respective priorities. The active-feature half is now #562.

Metadata

Metadata

Assignees

No one assigned

    Labels

    C8.5C8.5 — CompletenessbookmarkSaving a link to another projectenhancementNew feature or request

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions