vera shape — function-archetype histogram per module, inspired by Aver's aver shape proposal #698

@aallan

Summary

New CLI subcommand vera shape <file-or-dir> that classifies every function in scope into an archetype (match dispatcher / pipeline / orchestrator / pure helper / etc.) and emits a per-module histogram. The histogram makes a module's architectural layer legible at a glance — domain code, parsers, commands, AI/strategy, infra, and handlers each have a different archetype shape.

Inspired by Aver's creator Szymon Teżewski (@jasisz1) who observed on 24 May 2026 that Aver's syntactic constraints don't just shape syntax — they produce measurable function archetypes, proposing a future aver shape command that would ask "does this module look like the layer it claims to be?"

Vera has a richer static information surface than Aver (mandatory contracts, declared effect rows, typed slot references, exhaustive match) so the archetype classifier can run with more confidence and finer granularity. This is the third Aver-inspired addition (after #523 vera context and the broader naming convention) — credit upstream.

Why this is interesting in Vera

Aver derives archetypes indirectly from what its grammar forbids (no if/else → match dispatchers). Vera permits if/else, loops, and lambdas, but adds three static signals Aver doesn't have:

  1. Declared effect rows. Every function's effects(...) clause is part of its signature. An effect orchestrator isn't just "a function that does side effects" — it's "a function whose effect row is non-pure AND whose body is mostly effect calls." The first is structural, the second is body-shape; combined they're a much sharper signature.
  2. Typed slot references. @T.n makes argument flow legible to mechanical analysis — structural-recursion detection is "function calls itself with the same @T.k for some k strictly smaller than its own arity" rather than "function calls itself with a smaller argument," which in Aver requires parameter-name matching.
  3. Mandatory contracts. Contract bodies are themselves analyzable. A "contract-rich" archetype (functions where the requires/ensures clauses encode a lot of the function's semantics) is a real category that doesn't exist in languages without mandatory contracts.

Proposed Vera archetype set

Different from Aver's because Vera's constraint set differs. Sketch — final list emerges from empirical classification of examples/ and tests/conformance/:

Archetype               Detection signal                                                      Layer affinity
Contract-pure helper    effects(pure) + single-expression body, no recursion                  domain
Match dispatcher        body is primarily match over an ADT (>50% of body lines)              domain, AI/strategy
Structural recursion    function calls itself with structurally-smaller slot args             domain
Effect orchestrator     non-pure effect row + body is mostly effect calls                     command, infra
Pipeline                body uses |> + Result/Option combinators (>3 stages)                  parse, command
HOF wrapper             body primarily an array_map/array_fold/array_filter with closure arg  domain
Effect handler          function body uses handle[Effect] block                               command, infra
Inference orchestrator  effect row contains <Inference>                                       AI/strategy
Http orchestrator       effect row contains <Http>                                            command, infra

Empirical phase: classify the 35 examples + 88 conformance programs + the compiler's own modules to discover what shapes Vera-style code actually takes. Then expected-shape-per-layer tables can be populated from observation rather than guessed.
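To make the layer-affinity column concrete, here is a minimal Python sketch that treats the table as a lookup and scores layers from a histogram. The short labels follow the ones used in the expected output where it gives them; effect-handler and http-orch are assumed spellings, score_layers is a hypothetical helper, and summing shares per layer is only one plausible inference rule, a placeholder until the empirical phase produces real tables.

```python
from collections import defaultdict

# Layer affinities copied from the archetype table.
# "effect-handler" and "http-orch" are assumed short labels.
LAYER_AFFINITY = {
    "pure-helper": ("domain",),
    "match-dispatcher": ("domain", "AI/strategy"),
    "structural-rec": ("domain",),
    "effect-orch": ("command", "infra"),
    "pipeline": ("parse", "command"),
    "hof-wrapper": ("domain",),
    "effect-handler": ("command", "infra"),
    "inference-orch": ("AI/strategy",),
    "http-orch": ("command", "infra"),
}


def score_layers(histogram):
    """Add each archetype's share to every layer it has affinity with.

    `histogram` maps archetype label -> share in percent.
    Returns (layer, score) pairs, highest score first.
    """
    scores = defaultdict(float)
    for archetype, share in histogram.items():
        # Unknown labels contribute to no layer.
        for layer in LAYER_AFFINITY.get(archetype, ()):
            scores[layer] += share
    # Ties are broken by layer name so the result is deterministic.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
```

With the "60% match-dispatcher + 30% pure-helper" module mentioned under "Why this is useful", this scoring ranks domain first, which agrees with the reading given there.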

Expected output

$ vera shape examples/inference.vera
examples/inference.vera (1 function)
  match-dispatcher: 0%  pipeline: 100%  inference-orch: 100%  pure-helper: 0%  ...
  → layer: AI/strategy (inferred from shape)

$ vera shape examples/
examples/ — per-file histogram + roll-up
  ...
  Roll-up:
    match-dispatcher  ████████ 32%
    pure-helper       ██████ 24%
    pipeline          █████ 18%
    structural-rec    ████ 14%
    effect-orch       ███ 8%
    hof-wrapper       █ 4%
  → mix consistent with "domain" + "AI/strategy" layers
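The roll-up bars could be rendered along these lines. This is a Python sketch, not the proposed implementation: render_rollup is a hypothetical name, the input is assumed to be a count of primary labels per archetype, and the scale of roughly one block per four percentage points is an assumption, since the sample output does not pin one down.

```python
def render_rollup(counts, label_width=18):
    """Return one text line per archetype, largest share first.

    `counts` maps archetype label -> number of functions with that
    primary label. Archetypes with a zero count are omitted.
    """
    total = sum(counts.values())
    if total == 0:
        return []
    lines = []
    # Sort by count descending, then by label for deterministic output.
    for label, count in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        if count == 0:
            continue
        # Integer percent, rounded to the nearest whole number.
        pct = (100 * count + total // 2) // total
        # Assumed scale: about one block per four points, at least one block.
        bar = "█" * max(1, (pct + 2) // 4)
        lines.append(f"{label:<{label_width}}{bar} {pct}%")
    return lines
```

Returning lines rather than printing keeps the renderer easy to reuse for both the per-file output and the roll-up.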

Variants:

  • vera shape --by-layer — group functions by inferred layer instead of by file
  • vera shape --threshold 20% — flag modules whose shape deviates >20% from expected for the layer they declare (forward-looking; needs module Layer = "X" declarations or naming-convention inference)
  • vera shape --json — machine-readable for tooling
  • vera shape --explain <function> — show why a function was classified into a given archetype
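One possible reading of the --threshold check, sketched in Python: treat deviation as the largest per-archetype gap, in percentage points, between the observed histogram and the expected one for the declared layer. The function names, the sample "domain" shape in the usage note, and the deviation metric itself are all assumptions; the proposal leaves the metric open.

```python
def max_deviation(observed, expected):
    """Largest absolute gap in percentage points across all archetypes.

    Both arguments map archetype label -> share in percent. A label
    missing from one side is treated as 0% on that side.
    """
    all_labels = set(observed) | set(expected)
    return max(
        (abs(observed.get(label, 0) - expected.get(label, 0)) for label in all_labels),
        default=0,
    )


def deviates(observed, expected, threshold=20):
    """True when the module's shape is more than `threshold` points off."""
    return max_deviation(observed, expected) > threshold
```

For example, against a hypothetical expected domain shape of 60% match-dispatcher, 30% pure-helper, and 10% structural-rec, a module that is 35% effect-orch would be flagged at the default threshold, while one that only shifts 5 points between match-dispatcher and pure-helper would not.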

Why this is useful

  1. For agents writing Vera code, the histogram is a one-shot architectural fingerprint of an existing module — "this module's shape is 60% match-dispatcher + 30% pure-helper, so it's domain code; new functions here should match that shape." Pairs with vera context.
  2. For human reviewers, the histogram surfaces architectural drift — when domain code starts growing effect orchestrators, that's an architectural smell visible at a glance.
  3. For onboarding, the histogram is a map of where to look first: high-pipeline modules are good for "understand the data flow"; high-match-dispatcher modules are good for "understand the domain rules."
  4. For the Vera test suite, the histogram can be a gate: tests/conformance/ch07_io_*.vera programs should have similar shapes; a sudden deviation suggests test drift.

Implementation sketch

The classifier walks the AST of each function declaration:

  1. Read effect row from signature (already in AST).
  2. Walk body, counting node types: MatchExpr, IfExpr, LetBinding, Call, PipeExpr, etc.
  3. For Call nodes, classify by callee name: effect calls (IO.print, Http.get), array combinators (array_map, array_fold), Inference.complete, etc.
  4. Apply rules:
    • Body is ≥50% match expressions → match dispatcher
    • Body is ≥3-stage pipe → pipeline
    • Effect row non-pure AND body is ≥40% effect calls → effect orchestrator, specialised by which effect is dominant
    • Function calls itself with strictly-smaller slot arg → structural recursion
    • Body is single array_* call → HOF wrapper
    • Body uses handle[Effect] → effect handler
    • effects(pure) + single-expression body → contract-pure helper
  5. A function can have multiple labels (a structural-recursion match dispatcher is common). The histogram counts the primary label by rule precedence.
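Steps 4 and 5 can be sketched in Python over a per-function summary. FunctionSummary and its field names are hypothetical stand-ins for whatever the AST walk in steps 1 to 3 produces, "unclassified" is an assumed fallback, and the precedence used here (the order in which the rules are listed above) is an assumption, since the proposal does not fix one.

```python
from dataclasses import dataclass, field


@dataclass
class FunctionSummary:
    """Hypothetical per-function counts gathered by the AST walk."""
    name: str
    effects: tuple = ()                 # empty tuple stands for effects(pure)
    body_nodes: int = 0                 # total nodes counted in the body
    match_nodes: int = 0                # nodes belonging to match expressions
    pipe_stages: int = 0                # stages in the longest |> chain
    effect_calls: dict = field(default_factory=dict)  # effect name -> call count
    shrinking_self_call: bool = False   # self-call with a strictly smaller slot arg
    single_array_call: bool = False     # body is a single array_* call
    uses_handle: bool = False           # body contains a handle[Effect] block
    single_expression: bool = False


# Effect orchestrators are specialised by their dominant effect.
SPECIALISED = {"Inference": "inference-orch", "Http": "http-orch"}


def labels(fn):
    """Return every matching label, in the assumed precedence order."""
    found = []
    total = max(fn.body_nodes, 1)  # avoid dividing by zero on empty bodies
    if fn.match_nodes / total >= 0.5:
        found.append("match-dispatcher")
    if fn.pipe_stages >= 3:
        found.append("pipeline")
    effect_total = sum(fn.effect_calls.values())
    if fn.effects and effect_total / total >= 0.4:
        dominant = max(fn.effect_calls, key=fn.effect_calls.get)
        found.append(SPECIALISED.get(dominant, "effect-orch"))
    if fn.shrinking_self_call:
        found.append("structural-rec")
    if fn.single_array_call:
        found.append("hof-wrapper")
    if fn.uses_handle:
        found.append("effect-handler")
    if not fn.effects and fn.single_expression:
        found.append("pure-helper")
    return found


def primary_label(fn):
    """The label the histogram counts: the first match by precedence."""
    found = labels(fn)
    return found[0] if found else "unclassified"
```

Keeping labels() separate from primary_label() means --explain could report every rule that fired while the histogram still counts one label per function.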

Most of this is already in the AST and the checker's environment; no new analysis infrastructure required.

Estimated effort: 2–3 days for v1 (classifier + per-module histogram + --json output + basic --explain), longer for the layer-comparison feature (which needs the empirical baseline).

Phasing

  • Phase 1: classifier + per-file histogram, no expected-layer comparison. Lands as a working subcommand.
  • Phase 2: empirical baseline pass over examples/ and tests/conformance/ to populate expected-layer shape tables.
  • Phase 3: --threshold flag for deviation detection. Needs Phase 2 first.

Related

  • #523 (vera context): the other Aver-inspired CLI for LLM-agent project navigation. vera shape is the structural complement: context says what's here, shape says what shape it is.
  • #222 (LSP server): the natural surface to expose archetype labels as hover information.
  • #539 (vera builtins/effects/errors --json): same "compiler introspection subcommand" family.

Credit

Szymon Teżewski (@jasisz1), creator of Aver, for the function-archetype-as-layer-fingerprint observation and the aver shape proposal that this directly steals. Aver and Vera's constraint sets differ, so the specific archetypes differ — but the central idea (constraints produce measurable archetypes that map to architectural layers) is portable and Aver got there first.
