
Record Submission: 1.1078 BPB — XSA6 + BigramHash4K on Hedge Mixer Stack#720

Closed
agalimova wants to merge 1 commit into openai:main from agalimova:submission/xsa6-bigram4k-hedgemixer

Conversation

@agalimova

Summary

Changes from PR #700

| Parameter | Default | Ours |
| --- | --- | --- |
| `XSA_LAST_N` | 4 | 6 |
| `BIGRAM_VOCAB_SIZE` | 2048 | 4096 |
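The two changed settings can be sketched as a config fragment; the constant names mirror the table above, but how the actual repo wires them in is an assumption:

```python
# Hypothetical config fragment; real repo keys may differ.
XSA_LAST_N = 6            # default: 4 -- apply XSA to the last 6 layers
BIGRAM_VOCAB_SIZE = 4096  # default: 2048 -- size of the hashed bigram table
```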

Test plan

  • 3 seeds run on 8xH100 SXM (torch 2.9+cu126, FA3)
  • Mean improvement over merged SOTA (1.1194): -0.0116 BPB
  • All runs under 16MB artifact limit (15.3MB)
  • All runs under 600s training wallclock
  • Full training logs available (summaries included, full logs on request)

🤖 Generated with Claude Code

Built on PR openai#700 with hyperparameter improvements found via
autoresearch-multi combinatorial search:
- XSA_LAST_N=6 (extended from 4 to 6 layers)
- BIGRAM_VOCAB_SIZE=4096 (doubled from 2048)

3-seed mean: 1.1078 (std 0.0045)
Seeds: 42=1.1045, 1337=1.1061, 2025=1.1129

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
dexhunter added a commit to dexhunter/parameter-golf that referenced this pull request Mar 27, 2026
Built on PR openai#720 by @agalimova. Novel TTT recipe:
- Per-layer LR groups (3x proj, 0.5x fc)
- Cosine LR schedule within TTT
- 4 epochs (vs 3), freeze 1 block (vs 2)
- Skip sliding eval to reclaim time for extra epoch

3-seed results:
  Seed 1337: 1.0726 BPB (537s eval)
  Seed   42: 1.0635 BPB (546s eval)
  Seed 2025: 1.0806 BPB (531s eval)
  Mean:      1.0722 ± 0.009

All seeds: train <600s, eval <600s, artifact <16MB.
Beats merged SOTA (1.1194) by 0.047.
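The per-layer LR groups and cosine schedule from the recipe above can be sketched in plain Python; the multipliers (3x proj, 0.5x fc) and base LR come from the commit message, while the function and key names are illustrative assumptions, not the repo's actual API:

```python
import math

def cosine_lr(step, total_steps, base_lr):
    """Cosine decay from base_lr down to 0 over the TTT run."""
    return 0.5 * base_lr * (1 + math.cos(math.pi * step / total_steps))

# Per-layer multipliers from the recipe: 3x for projection layers,
# 0.5x for fully-connected layers, 1x otherwise. Names are hypothetical.
LAYER_LR_MULT = {"proj": 3.0, "fc": 0.5}

def lr_for(param_name, step, total_steps, base_lr=0.002):
    mult = next((m for key, m in LAYER_LR_MULT.items() if key in param_name), 1.0)
    return mult * cosine_lr(step, total_steps, base_lr)
```

In a PyTorch-style setup this would correspond to one optimizer param group per multiplier, with the scheduler scaling each group's LR by the same cosine factor.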
dexhunter added a commit to dexhunter/parameter-golf that referenced this pull request Mar 27, 2026
Built on PR openai#720 by @agalimova. Key change: SGD TTT (lr=0.002,
momentum=0.9) replaces AdamW, producing -0.041 BPB improvement.

3-seed results:
  Seed 1337: 1.0312 BPB (540s eval)
  Seed   42: 1.0503 BPB (533s eval)
  Seed 2025: 1.0535 BPB (544s eval)
  Mean:      1.0450 ± 0.012

All seeds: train <600s, eval <600s, artifact <16MB.
Score-first legal TTT + backward-looking HedgeMixer.
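The optimizer swap above (AdamW replaced by SGD, lr=0.002, momentum=0.9) reduces to the classic momentum update; this is a minimal sketch of that rule, not the repo's training loop:

```python
def sgd_momentum_step(weights, grads, velocity, lr=0.002, momentum=0.9):
    """One SGD-with-momentum update: v <- m*v + g, w <- w - lr*v."""
    v_new = [momentum * v + g for v, g in zip(velocity, grads)]
    w_new = [w - lr * v for w, v in zip(weights, v_new)]
    return w_new, v_new
```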
dexhunter added a commit to dexhunter/parameter-golf that referenced this pull request Mar 27, 2026
Built on PR openai#720 by @agalimova. Key improvement: momentum 0.95 (vs 0.9)
reduces variance and improves mean by 0.009 BPB.

3-seed results:
  Seed 1337: 1.0302 BPB (513s eval)
  Seed   42: 1.0365 BPB (533s eval)
  Seed 2025: 1.0419 BPB (539s eval)
  Mean:      1.0362 ± 0.006

Validated via comprehensive hyperparameter sweep:
  LR: 0.001/0.002/0.003 → 0.002 optimal
  Freeze: 0/1/2 → 0 optimal
  Epochs: 3/4/5 → 4 optimal
  Per-layer LR: 2x/3x/4x proj → 3x optimal
  Momentum: 0.9/0.95 → 0.95 optimal
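The sweep above is a full cross-product over the five axes listed; a minimal enumeration sketch (the grid values are from the commit message, the helper name is an assumption):

```python
from itertools import product

GRID = {
    "lr": [0.001, 0.002, 0.003],
    "freeze": [0, 1, 2],
    "epochs": [3, 4, 5],
    "proj_lr_mult": [2, 3, 4],
    "momentum": [0.9, 0.95],
}

def configs(grid):
    """Yield every combination in the grid as a dict."""
    keys = list(grid)
    for values in product(*(grid[k] for k in keys)):
        yield dict(zip(keys, values))

# 3 * 3 * 3 * 3 * 2 = 162 candidate configs in total.
```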
@valerio-oai
Contributor

Thanks for your submission! Unfortunately, it is disallowed because it uses hashed n-gram caches. These do not correctly renormalize or reweight the LM's token distribution, and they look ahead at the target token when mixing probabilities, which leaks eval tokens. Please see the long discussion about this under the Issues tab for more details, and please submit more runs in the future!
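The leak described above can be illustrated with a minimal sketch: a cache of this shape can only be queried by including the target token in the lookup key, and adding the resulting bonus outside the softmax breaks normalization. All names here are hypothetical, not the submission's actual code:

```python
def bigram_bonus(cache, prev_tok, target_tok, vocab_size=4096):
    # The lookup key includes the target token itself -- the scorer
    # must peek at the answer to form the key. That is the leak.
    key = (hash(prev_tok) % vocab_size, target_tok)
    return cache.get(key, 0.0)

def mixed_scores(p_lm, cache, prev_tok):
    # Bonus added per-target outside the softmax: the mixed "distribution"
    # no longer sums to 1, so BPB computed from it is not a valid bound.
    return {t: p + bigram_bonus(cache, prev_tok, t) for t, p in p_lm.items()}
```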

