chore(deps): Bump criterion from 0.5.1 to 0.7.0 #52
Labels
The following labels could not be found: Please fix the above issues or remove invalid values from
581239d to 00fd374
Dependabot couldn't fetch all your path-based dependencies. Because of this, Dependabot cannot update this pull request.
b80e8f7 to b4df57b
Bumps [criterion](https://github.com/bheisler/criterion.rs) from 0.5.1 to 0.7.0.
- [Changelog](https://github.com/bheisler/criterion.rs/blob/master/CHANGELOG.md)
- [Commits](bheisler/criterion.rs@0.5.1...0.7.0)

---
updated-dependencies:
- dependency-name: criterion
  dependency-version: 0.7.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
b4df57b to f22c487
Superseded by #107.
…RGED via 3/3 §68-trio flips (PMAT-CODE-SHIP-TWO-SECTION-70) (#1636)

§69 (PR #1633) enumerated 4 candidate root causes for the apr eval HumanEval false-failure. §70 reports the empirical disambiguation on gx10 via the diagnostic surface (PR #1634), the 1-PR fix (PR #1635), and the discharge proof.

70.1 RC disambiguation on gx10 (canonical 7B Q4K APR teacher):
- RC1 (state leak) : FALSIFIED - coherent 1031-byte response
- RC2 (false-negative) : FALSIFIED - python3 actually exited 1
- RC3 (format!() bug) : CONFIRMED - imports stripped
- RC4 (max_tokens trunc) : FALSIFIED - 524-char completion present

70.2 Why §68 was wrong: §68's R1+R2 0/3 flip rate on the known-failed trio was correct evidence; the inference ("Class B sampling/quantization") was a leap. The TRUE class was Class C (harness-RC3), invisible to R1+R2 because R1+R2 doesn't touch the format!() at line 400.

70.3 The fix (PR #1635): new `extract_prompt_preamble(prompt, entry)` helper + ChatML-branch prepend in run_humaneval_inference. 7 unit tests cover the helper + RC3 falsifier.

70.4 Discharge proof - 3/3 §68 trio flip:
| Task | §68 pre-fix | §68 R1+R2-only | §70 RC3-fix |
| HumanEval/1 | FAIL | FAIL | PASS |
| HumanEval/3 | FAIL | FAIL | PASS |
| HumanEval/6 | FAIL | FAIL | PASS |
Flip rate: 100%.

70.5 SHIP-005 path: 164-run dispatched on gx10 (commit b7e69bf); ~5h CPU wall. Discharge condition: post-fix pass@1 >= 84.80%.

70.6 Methodology lesson #17 NEW: pre-fix RED smoke can mask the bug class. A 0/N flip rate in a smoke proves only that the candidate fix doesn't move the needle, NOT that any specific failure class is responsible. The class must be identified via diagnostic instrumentation (APR_EVAL_DEBUG=1), not inferred from a flip rate.

70.7 Cumulative methodology lessons through §70 (lesson #17 added).

70.8 Ship-% movement: MODEL-1 stays 94% pending 164-run completion; path to 95% is single rerun + verdict check, no further code changes. MODEL-2 unchanged at 57%.

Spec version: 3.14.0 → **3.16.0** (also reapplies §69 banner at v3.15.0 since PR #1633 has not yet landed on main; when #1633 lands, the §69 section will exist; this commit's banner stack accommodates that).

Refs:
- contracts/apr-eval-humaneval-harness-invariant-v1.yaml v1.1.0
- evidence/section-70-rc3-fix-2026-05-12/findings.json
- /tmp/apr_eval_debug_HumanEval_{1,3,6}.json (gx10 evidence)
- PR #1633 (§69 spec), PR #1634 (diagnostic surface), PR #1635 (RC3 fix)

Closes task #52 (PMAT-CODE-SHIP-TWO-SECTION-70).

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
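As a rough illustration of the 70.5 discharge condition, the sketch below turns the 84.80% pass@1 threshold into a minimum pass count. It assumes the 164-run means one sample for each of HumanEval's 164 tasks, so pass@1 is simply passed/164. The function names `min_passes` and `meets_threshold` are hypothetical and do not come from the apr codebase; integer basis points are used to avoid floating-point rounding at the boundary.

```rust
/// Hypothetical helper (not from the apr codebase): smallest number of
/// passing tasks whose pass rate is at least `threshold_bp` basis points
/// (1 bp = 0.01%) out of `total` tasks. Computes ceil(total * bp / 10_000).
fn min_passes(total: u64, threshold_bp: u64) -> u64 {
    (total * threshold_bp + 9_999) / 10_000
}

/// Hypothetical check for the discharge condition, assuming
/// pass@1 = passed / total with one sample per task.
fn meets_threshold(passed: u64, total: u64, threshold_bp: u64) -> bool {
    passed * 10_000 >= total * threshold_bp
}
```

Under these assumptions, `min_passes(164, 8480)` evaluates to 140: 139 passes give about 84.76%, just under the threshold, while 140 passes give about 85.37%.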
Bumps criterion from 0.5.1 to 0.7.0.
Changelog
Sourced from criterion's changelog.
Commits
- 567405d release: bump criterion and criterion-plot versions (#878)
- ccccbcc fix: deal with throughput in bits (#861)
- deb0eb0 feat: support throughput reports in bits (#833)
- d4fd7cc Add CI job checking library builds with oldest allowed dependencies (#854)
- 43bf90a release version 0.6.0 (#860)
- 92696e4 deps: unpin clap (#858)
- 5756a5d chore: bump MSRV to 1.80 (#859)
- 9d887c0 Fixed typo in faq.md (#852)
- 59b791a ci: test against MSRV and 1.87 (#857)
- ace1cc9 Fix warnings from clippy (rust 1.87.0) (#856)

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting `@dependabot rebase`.

Dependabot commands and options
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)