
Non-record: QAT & EMA negative results on SOTA stack (val_bpb=1.1426) #360

Open
MultiFe22 wants to merge 1 commit into openai:main from MultiFe22:first-try

Conversation

@MultiFe22

Summary

Results (8xH100 SXM, 600s)

| Config | Steps | val_bpb | Artifact | Delta |
|---|---|---|---|---|
| Baseline (PR #180 repro) | 6,684 | 1.1426 | 15.99 MB | |
| + QAT (warmup=500) | 6,143 | 1.1473 | 15.69 MB | +0.005 (worse) |
| + QAT + EMA | 4,546 | 1.1606 | 16.89 MB | +0.018 (worse) |
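The PR does not include its QAT code; a minimal sketch of the kind of fake int8 quantization that "QAT (warmup=500)" presumably enables after warmup is below. The symmetric per-tensor scaling and function name are assumptions for illustration, not the actual implementation:

```python
import numpy as np

def fake_quant_int8(w: np.ndarray) -> np.ndarray:
    """QAT-style fake quantization: round weights onto a symmetric
    int8 grid, then dequantize. Training sees quantization error in
    the forward pass; in a real framework gradients pass straight
    through. Scaling scheme is illustrative, not the PR's code.
    """
    max_abs = np.abs(w).max()
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = np.clip(np.round(w / scale), -127, 127)
    return q * scale

w = np.array([0.5, -1.27, 0.003])
wq = fake_quant_int8(w)  # small values snap to zero; large ones survive
```

The extra round/clip/dequant work per step is consistent with the step-count drop (6,143 vs 6,684) under a fixed wall clock.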

Key findings

  • QAT: better compression (15.69 MB vs 15.99 MB), but 8% fewer steps in the fixed wall clock; net negative on val_bpb
  • EMA: `.cpu().clone()` on every step forces a device-to-host copy, costing 32% of throughput; catastrophic under a wall-clock budget
  • Implication: techniques that trade training steps for inference quality are counterproductive under the 10-minute budget
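The `.cpu().clone()` finding concerns where the shadow weights live, not the EMA math itself. A minimal numpy sketch of the update follows (the decay value is hypothetical); in a torch loop the fix would be updating a preallocated device-side buffer in place rather than cloning every parameter to host each step:

```python
import numpy as np

def ema_update(shadow: np.ndarray, param: np.ndarray, decay: float = 0.999) -> None:
    """In-place EMA update: shadow <- decay*shadow + (1-decay)*param.

    The math is identical whether the shadow lives on GPU or CPU.
    The reported 32% throughput loss comes from materializing a fresh
    host copy per step (torch's `.cpu().clone()`), which serializes a
    device-to-host transfer into the training loop; an in-place update
    of a preallocated buffer avoids that.
    """
    shadow *= decay
    shadow += (1.0 - decay) * param

shadow = np.zeros(4)
param = np.ones(4)
for _ in range(3):
    ema_update(shadow, param, decay=0.5)
# shadow moves geometrically toward param: 0.5, 0.75, 0.875
```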

Test plan

  • Baseline reproduction matches published SOTA (1.1426 vs 1.1428)
  • QAT ablation on 8xH100
  • QAT+EMA ablation on 8xH100
  • All logs included

sahiee-dev added a commit to sahiee-dev/parameter-golf that referenced this pull request Mar 22, 2026
Dropped QAT: 8% throughput penalty kills 600s budget (per PR openai#360).

Three novel additions on thwu1 SOTA base (1.1428):
- TrigramHash(20480, dim=32): trigram embedding signal, bigram 10240->4096
- XSA: orthogonal self-value removal, last 4 layers, from PR openai#287
- TTT: 3-epoch SGD on val tokens before eval, all ranks, ~47s budget
  Fixed rank bug: TTT runs on all 8 ranks independently (not rank 0 only)

Artifact: ~15.64MB. Smoke tests passing. H100 validation pending.
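The commit above gives only the `TrigramHash(20480, dim=32)` signature; a hashed-trigram embedding of that shape might look like the following sketch. The hash constants and the per-position lookup are assumptions for illustration, not sahiee-dev's code:

```python
import numpy as np

def trigram_embed(token_ids, table, n_buckets=20480):
    """Hashed trigram embedding: hash each consecutive 3-token window
    into one of n_buckets rows of a learned table and emit that row at
    the window's final position. Bucket count and dim follow the commit
    (20480, 32); the hash mixing constants are illustrative.
    """
    out = np.zeros((len(token_ids), table.shape[1]))
    for i in range(2, len(token_ids)):
        a, b, c = token_ids[i - 2], token_ids[i - 1], token_ids[i]
        h = (a * 1000003 + b * 8191 + c) % n_buckets  # deterministic bucket
        out[i] = table[h]
    return out

rng = np.random.default_rng(0)
table = rng.standard_normal((20480, 32))
emb = trigram_embed([5, 9, 12, 5, 9, 12], table)
# identical trigrams map to the same bucket, so emb[2] == emb[5]
```

Hash collisions are the usual trade-off here: 20480 buckets keeps the table small (consistent with the ~15.64 MB artifact budget) at the cost of occasional trigram aliasing.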