Record: 10L Int6 QAT + Zstd MLP2.6x Muon0.99 Sliding Window (val_bpb 1.1598) #63
Merged
cocohearts merged 4 commits into openai:main on Mar 20, 2026
Conversation
South-33 added a commit to South-33/parameter-golf that referenced this pull request on Mar 19, 2026
- add a PR-audit research log entry covering the clean takeaways from pull requests openai#36 through openai#70
- promote long-context training plus matching long-context eval as a first-class clean branch based on PR openai#61 and PR openai#63
- refine mixed-precision export notes to emphasize using int6/int8 byte savings to fund wider MLP capacity, based on PR openai#65
- update the current snapshot and research thesis so future agents do not over-focus on exporter-only ideas after the broader PR sweep
Author
Updated submission to val_bpb 1.1598 (3-seed mean, sliding window, stride=64). Key techniques: 10 layers, STE int6 QAT (zero quant gap), full int6 + zstd-22 export, MLP width 1344, fp16 tied embedding, Muon momentum 0.99, seq len 2048, grad clip 0.3. All constraints met (15.56 MB artifact, 600 s training, 370 s eval). Ready for review.
3-seed validation: mean 1.2067 BPB (std 0.00044), improvement 0.0353 nats over baseline, t=-70.69 (p << 0.01). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
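The significance claim above comes from a small-sample t-test over per-seed validation losses. A minimal sketch of that computation, using hypothetical per-seed values (the PR reports only the mean and std, not the individual seeds, and the baseline value below is a placeholder):

```python
import math
import statistics

def seed_t_stat(seed_losses, baseline_loss):
    """One-sample t statistic: is the mean seed loss below the baseline?

    Negative t means the runs beat the baseline. With n seeds there are
    n - 1 degrees of freedom (n = 3 here, so df = 2).
    """
    n = len(seed_losses)
    mean = statistics.fmean(seed_losses)
    std = statistics.stdev(seed_losses)  # sample std (ddof=1)
    return (mean - baseline_loss) / (std / math.sqrt(n))

# Hypothetical per-seed losses and baseline, chosen to roughly match the
# reported mean 1.2067 and std 0.00044; not the PR's actual numbers.
losses = [1.2063, 1.2067, 1.2071]
t = seed_t_stat(losses, baseline_loss=1.2420)
```

With only 3 seeds the t statistic is large in magnitude whenever the seed-to-seed std is tiny relative to the gap from the baseline, which is why the reported |t| values are in the double or triple digits.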
V3: Added 10th layer with mixed int8/int6 quantization (middle layers), plus sliding window evaluation (stride=64). 3-seed mean 1.1793 BPB, improvement 0.0815 nats over baseline, t=-137 (p << 0.01). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
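Sliding-window evaluation advances a fixed-size context window by a small stride and counts loss only on the newest tokens of each window, so every token after the first window is scored with near-full left context. A minimal index-bookkeeping sketch (window and stride sizes assumed from the PR text; not the repo's actual eval code):

```python
def sliding_window_spans(n_tokens, window=2048, stride=64):
    """Yield (start, end, score_from) triples for sliding-window eval.

    Each window covers tokens [start, end); loss is counted only on
    [score_from, end), so every token is scored exactly once but sees up
    to `window - stride` tokens of extra left context.
    """
    spans = []
    start = 0
    while start + window <= n_tokens:
        end = start + window
        # The first window must score all its tokens; later windows
        # score only the `stride` tokens that are new.
        score_from = start if start == 0 else end - stride
        spans.append((start, end, score_from))
        start += stride
    return spans

spans = sliding_window_spans(4096)
```

A small stride costs proportionally more forward passes (one per 64 scored tokens instead of one per 2048), which is why the eval-time budget (370 s) matters for this technique.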
V4b: Full int6 quantization [-31,31] + zstd-22 compression enables MLP expansion to 1344 (2.6x). Muon momentum 0.99, LR 0.02, grad clip 0.3. 3-seed mean 1.1632 BPB (sliding window), 0.1087 nats over baseline. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
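The export path described here quantizes weights to the symmetric int6 range [-31, 31] and then byte-compresses the result, spending the saved bytes on a wider MLP. An illustrative sketch (per-tensor scaling and one-byte storage assumed; a real exporter would bit-pack 6-bit values). It uses zstd level 22 via the `zstandard` package when available, falling back to stdlib `zlib` purely so the sketch runs anywhere:

```python
import zlib

try:
    import zstandard  # third-party; `pip install zstandard`

    def compress(raw: bytes) -> bytes:
        return zstandard.ZstdCompressor(level=22).compress(raw)
except ImportError:
    def compress(raw: bytes) -> bytes:  # illustration-only fallback
        return zlib.compress(raw, level=9)

def quantize_int6(weights, qmax=31):
    """Symmetric per-tensor int6 quantization to the range [-31, 31]."""
    scale = max(abs(w) for w in weights) / qmax
    q = [max(-qmax, min(qmax, round(w / scale))) for w in weights]
    return q, scale

def export(weights):
    q, scale = quantize_int6(weights)
    # One unsigned byte per value (offset by 32) keeps the sketch simple.
    raw = bytes((v + 32) & 0xFF for v in q)
    return compress(raw), scale
```

Entropy coding is what makes the trade work: quantized weight distributions are heavily peaked around zero, so the compressed artifact is much smaller than 6 bits per weight, funding the 2.6x MLP expansion within the 15.56 MB budget.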
V6b: Added straight-through estimator fake int6 quantization during training. Completely eliminates quantization gap (pre-quant = post-quant). 3-seed mean 1.1598 BPB (sliding window), beating previous leader (1.1605). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
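Straight-through-estimator (STE) fake quantization rounds weights to the int6 grid in the forward pass while the backward pass treats the rounding as identity, so gradients flow to the full-precision weights. Because training always sees exactly the values the export will produce, the post-quantization gap goes to zero. A dependency-free sketch of the forward math (scale handling assumed; in an autograd framework the STE is typically written as `w + (fq - w).detach()`):

```python
def fake_quant_int6(w, scale, qmax=31):
    """Forward pass of STE fake quantization: quantize then dequantize.

    Rounds w to the nearest point on the int6 grid {-31..31} * scale.
    The backward pass in an autograd framework would ignore the rounding
    and pass the gradient straight through to w.
    """
    q = max(-qmax, min(qmax, round(w / scale)))
    return q * scale

w = 0.1234
fq = fake_quant_int6(w, scale=0.01)   # snaps to the 0.01-spaced grid
fq2 = fake_quant_int6(fq, scale=0.01) # grid points are fixed points
```

Idempotence is the property behind "pre-quant = post-quant": once training has converged onto the grid, the export-time quantization changes nothing.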
d360e10 to 510e3f6
Collaborator
Do you mind creating a new PR when you edit things? See the chronological credit note in the FAQ. Otherwise, keep it as [WIP] or a draft PR until it's fully ready and locked.
Author
Yes! Would you like me to remove the additional commits past the ready-for-review mark? This is ready now, so I won't make any more changes here.
leonardcser pushed a commit to leonardcser/parameter-golf that referenced this pull request on Mar 21, 2026
Record: 10L Int6 QAT + Zstd MLP2.6x Muon0.99 Sliding Window (val_bpb 1.1598)
Summary
11 techniques stacked on the Naive Baseline, achieving mean val_bpb 1.1598 (3 seeds).
Results
Mean val_bpb (sliding): 1.1598 (std: 0.00120)
Quant gap: 0.0000 — STE QAT completely eliminated the quantization loss.
Statistical significance vs baseline (2.0727 val_loss).
Hardware: 8xH100 80GB HBM3, PyTorch 2.8.0+cu128, ~72ms/step.
Requires: `pip install zstandard`
Test plan