Context
From the failure analysis in #6, several problems pass vera verify but produce wrong output because the postconditions are too weak to catch logic bugs. Specifically, slot-swap errors (computing b % a instead of a % b) satisfy ensures(@Nat.result >= 0) trivially.
Problems to strengthen
- VB-T4-002 (GCD): Currently
ensures(@Nat.result >= 0). Could add ensures(@Nat.result == 0 || @Nat.1 % @Nat.result == 0) to verify the result actually divides both inputs.
- VB-T4-004 (power): Currently
ensures(@Nat.result >= 0). Could add a stronger postcondition relating result to base and exponent.
- VB-T5-003 (safe_div): Currently
ensures(true). Could add ensures(@Int.0 != 0 ==> @Int.result == @Int.1 / @Int.0) to verify correct division when divisor is non-zero.
Expected impact
Stronger postconditions would cause vera verify to reject slot-swapped implementations, turning run_correct failures into verify failures. This gives earlier feedback and makes the verify@1 metric more meaningful — currently it's inflated because weak contracts pass trivially.
Trade-off
Stronger contracts make problems harder for the LLM in full-spec mode (more constraints to satisfy) and may affect Z3 verification times. Need to verify each strengthened contract actually passes vera verify on the canonical solution before committing.
Context
From the failure analysis in #6, several problems pass
vera verifybut produce wrong output because the postconditions are too weak to catch logic bugs. Specifically, slot-swap errors (computingb % ainstead ofa % b) satisfyensures(@Nat.result >= 0)trivially.Problems to strengthen
ensures(@Nat.result >= 0). Could addensures(@Nat.result == 0 || @Nat.1 % @Nat.result == 0)to verify the result actually divides both inputs.ensures(@Nat.result >= 0). Could add a stronger postcondition relating result to base and exponent.ensures(true). Could addensures(@Int.0 != 0 ==> @Int.result == @Int.1 / @Int.0)to verify correct division when divisor is non-zero.Expected impact
Stronger postconditions would cause
vera verifyto reject slot-swapped implementations, turning run_correct failures into verify failures. This gives earlier feedback and makes theverify@1metric more meaningful — currently it's inflated because weak contracts pass trivially.Trade-off
Stronger contracts make problems harder for the LLM in full-spec mode (more constraints to satisfy) and may affect Z3 verification times. Need to verify each strengthened contract actually passes
vera verifyon the canonical solution before committing.