Commit b35e4da

aallan and claude committed
Address CR: timing table — mark K2 Turbo as historical, add K2.6 TBD row
The 'Timing expectations' table in scripts/README.md still showed 'Moonshot K2 Turbo | ~1.5 h' after the migration, which mismatched the sweep recipe that loops over moonshot/kimi-k2.6. Both rows now appear: K2.6 with TBD pending the first sweep, and K2 Turbo kept with an explicit 'historical; SKU deprecated 2026-05-25' annotation so the prior real-world data point isn't lost.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
1 parent 3c97621 commit b35e4da

1 file changed

Lines changed: 5 additions & 3 deletions

scripts/README.md

@@ -109,10 +109,12 @@ Rough per-model totals observed on v0.0.9 (60 problems, 2026-04):
 | Claude Sonnet 4 | ~15 min |
 | GPT-4.1 / GPT-4o | ~10–12 min |
 | Moonshot K2.5 | ~3.5 h (slow provider; Aver especially) |
-| Moonshot K2 Turbo | ~1.5 h |
+| Moonshot K2.6 | TBD (no sweep against this model yet — see #68) |
+| Moonshot K2 Turbo *(historical; SKU deprecated 2026-05-25)* | ~1.5 h |
 
-The full six-model sweep is dominated by the two Moonshot models. Expect
-5–8 hours end-to-end.
+The Moonshot models dominate the sweep wall-clock; expect 5–8 hours
+end-to-end. K2.6 timings will be filled in after the first full sweep
+once we have data to attribute.
 
 ### Output files
 