[Non-record] Quantization Findings: SWA Reversal + Int5 Failure #238
Open
kellyvv wants to merge 1 commit into openai:main from
Conversation
mrdavtan added a commit to mrdavtan/parameter-golf that referenced this pull request on Mar 21, 2026:
- Add leaderboard table: jfprincz 1.1271 is the new target; mohosy racing the same stack
- Add Reptile meta-TTT finding (PR openai#296): 10x better than naive TTT with SmearGate; error-guided TTT is negative; 13L crossover point identified
- Add SWA checkpoint count finding (PR openai#238): 84 checkpoints reverses the quant gap; explains why our WD=1200 SWA showed no effect
- Update jfprincz entry to include PR openai#287 results (1.1271)
- Add meta-lessons 10 and 11
Two quantization findings
Finding 1: SWA reverses the quantization gap
After averaging 84 checkpoints, the int6+zstd roundtrip BPB is **lower** than the pre-quantization BPB (1.5164 vs. 1.5536, gap = -0.037). SWA smoothing appears to eliminate the quantization-sensitive outliers.
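For context, a minimal sketch of the experiment in PyTorch is below. The checkpoint paths, the `eval_bpb` helper, and the symmetric per-tensor int6 quantizer are assumptions for illustration, not this repo's actual code; zstd is lossless, so only the int6 roundtrip can affect BPB.

```python
# Minimal sketch: uniform SWA over N checkpoints, then an int6 roundtrip,
# then compare BPB before/after. Paths and eval helper are placeholders.
import torch

def swa_average(ckpt_paths):
    """Uniformly average the float weights of all checkpoints."""
    avg = None
    for path in ckpt_paths:
        sd = torch.load(path, map_location="cpu")
        if avg is None:
            avg = {k: v.float().clone() for k, v in sd.items()}
        else:
            for k in avg:
                avg[k] += sd[k].float()
    n = len(ckpt_paths)
    return {k: v / n for k, v in avg.items()}

def int_roundtrip(t, bits=6):
    """Symmetric per-tensor int quantization followed by dequantization."""
    qmax = 2 ** (bits - 1) - 1                      # 31 levels each side for int6
    scale = t.abs().max().clamp(min=1e-12) / qmax
    q = torch.clamp(torch.round(t / scale), -qmax, qmax)
    return q * scale

# Quantization gap = BPB(roundtripped weights) - BPB(original weights).
# The finding: after SWA over 84 checkpoints, this gap goes negative.
avg_sd = swa_average([f"ckpt_{i:03d}.pt" for i in range(84)])    # placeholder paths
quant_sd = {k: int_roundtrip(v, bits=6) for k, v in avg_sd.items()}
# gap = eval_bpb(quant_sd) - eval_bpb(avg_sd)       # hypothetical eval helper
```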
Finding 2: Int5 is catastrophic for undertrained models
With mixed int5/int6 quantization, the quant gap explodes from 0.3 to 1.4 BPB (4.5×). This directly contradicts the viability of int5 for compute-constrained training.
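As a rough illustration of why dropping one bit hurts, here is a sketch of symmetric per-tensor int5 vs. int6 roundtrips. Which tensors received int5 in the actual mixed run is not specified here, so `int5_keys` is a hypothetical selector, and the RMS comparison only shows the roughly 2× increase in per-weight error, not the much larger BPB blow-up reported above.

```python
# Sketch: mixed int5/int6 roundtrip and the per-weight error cost of int5.
import torch

def int_roundtrip(t, bits):
    """Symmetric per-tensor quantize/dequantize at the given bit width."""
    qmax = 2 ** (bits - 1) - 1                      # 15 for int5, 31 for int6
    scale = t.abs().max().clamp(min=1e-12) / qmax
    return torch.clamp(torch.round(t / scale), -qmax, qmax) * scale

def mixed_roundtrip(state_dict, int5_keys):
    """Quantize the listed tensors at int5 and everything else at int6."""
    return {k: int_roundtrip(v, 5 if k in int5_keys else 6)
            for k, v in state_dict.items()}

# Step size scales as 1/(2**(bits-1) - 1), so RMS weight error roughly
# doubles from int6 to int5 (31/15 ≈ 2.07); the observed BPB gap grew far more.
w = torch.randn(4096, 4096)
for bits in (6, 5):
    err = (int_roundtrip(w, bits) - w).pow(2).mean().sqrt()
    print(f"int{bits} RMS roundtrip error: {err.item():.5f}")
```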
See README for detailed analysis and reproduction steps.