Context
Tier 5 problems test effect handling (State, Exn, IO). In Vera, this means algebraic effect handlers. But every other language solves these problems with native idioms:
- Python: try/except, mutable variables
- TypeScript: try/catch, closures
- Aver: Result<T,E>, pure recursion
This means T5 run_correct isn't comparing the same capability across languages — Vera is testing effect-handler wiring while others are testing general error handling / state management.
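To make the mismatch concrete, here is a rough sketch of how a T5-style problem tends to look in Python. The task itself (a counter driven by a list of operations) and the names `run_counter` and `safe_run` are hypothetical and not taken from the benchmark; the point is only that state and failure are handled with a local variable and try/except, with no handler wiring involved.

```python
def run_counter(ops):
    """Apply (op, value) pairs to a counter; hypothetical T5-style task."""
    total = 0  # a plain mutable variable plays the role of a State effect
    for op, value in ops:
        if op == "add":
            total += value
        elif op == "div":
            total //= value  # raises ZeroDivisionError when value == 0
        else:
            raise ValueError(f"unknown op: {op}")
    return total


def safe_run(ops):
    """Convert exceptions into a tagged result; stands in for an Exn effect."""
    try:
        return ("ok", run_counter(ops))
    except (ZeroDivisionError, ValueError) as exc:
        return ("error", type(exc).__name__)
```

A passing solution here exercises general error handling and local mutation, which is a different skill from installing and composing effect handlers.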
Proposal
Report T1-T4 aggregate as the primary cross-language headline score, with T5 reported separately as "functional equivalents" that test language-specific mechanisms.
This was discussed in PR #48 (Aver support) with @jasisz, who offered to submit a follow-up PR for the reporting changes.
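As a starting point for the follow-up PR, the split could look roughly like the sketch below. It assumes each result is a dict with `language`, `tier`, and a boolean `run_correct` field; that record shape, and the names `HEADLINE_TIERS`, `run_correct_rate`, and `summarize`, are illustrative assumptions and not the existing vera_bench API.

```python
from collections import defaultdict

# Assumption: tiers are stored as integers 1..5 on each result record.
HEADLINE_TIERS = frozenset({1, 2, 3, 4})
T5_TIERS = frozenset({5})


def run_correct_rate(results, tiers):
    """Return {language: fraction of run_correct results} restricted to `tiers`."""
    passed = defaultdict(int)
    total = defaultdict(int)
    for result in results:
        if result["tier"] in tiers:
            total[result["language"]] += 1
            passed[result["language"]] += int(result["run_correct"])
    # Languages with no results in the selected tiers are simply omitted.
    return {lang: passed[lang] / total[lang] for lang in total}


def summarize(results):
    """Headline score over T1-T4, with T5 reported as a separate entry."""
    return {
        "headline_t1_t4": run_correct_rate(results, HEADLINE_TIERS),
        "t5_functional_equivalents": run_correct_rate(results, T5_TIERS),
    }
```

Keeping the tier filter as a parameter means the existing all-tier row can still be produced by passing all five tiers, so the report can show both rows from the same function.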
What needs changing
- vera_bench/report.py: add a T1-T4 aggregate row alongside the existing all-tier row
- vera_bench/metrics.py: add tier-filtered metric computation
- scripts/plot_results.py: update charts to show T1-T4 and T5 separately
Relates to
- #48: Aver support (where this was identified)
- #21: Go support (will have the same T5 mismatch)
- #49: MoonBit support (same)