Record: Loqui Auris — 10L + SWA + Standard TTT (val_bpb=1.1100) by LoquiAuris · Pull Request #595 · openai/parameter-golf

LoquiAuris · 2026-03-24T02:56:33Z

Summary

val_bpb: 1.1100 (seed 1337)
10L d=512, 8 heads, 4 KV heads (GQA), MLP 3x, ReLU²
SWA (29 checkpoints), SmearGate, BigramHash(4096), U-Net skips
Standard AdamW TTT: 10 epochs, lr=0.0005
Artifact: 15.69 MB (250KB headroom)
Platform: 8xH100 SXM, ~5992 steps in 600s
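
The SWA entry in the summary refers to combining weights from 29 checkpoints. The thread does not show the averaging code, so the following is only a minimal sketch, assuming a uniform element-wise mean and a toy checkpoint format (a dict mapping parameter names to flat lists of floats). `average_checkpoints` is a hypothetical helper, not a function from the submission.

```python
from typing import Dict, List

# Assumed toy format: parameter name -> flat list of float values.
Checkpoint = Dict[str, List[float]]


def average_checkpoints(checkpoints: List[Checkpoint]) -> Checkpoint:
    """Uniform element-wise mean of parameter values across checkpoints."""
    if not checkpoints:
        raise ValueError("need at least one checkpoint")
    n = len(checkpoints)
    averaged: Checkpoint = {}
    for name, first in checkpoints[0].items():
        # Sum each parameter element across all checkpoints, then divide by n.
        sums = [0.0] * len(first)
        for ckpt in checkpoints:
            for i, value in enumerate(ckpt[name]):
                sums[i] += value
        averaged[name] = [s / n for s in sums]
    return averaged


# Toy usage: three fake checkpoints holding a two-element parameter "w".
ckpts = [{"w": [1.0, 2.0]}, {"w": [2.0, 4.0]}, {"w": [3.0, 6.0]}]
swa = average_checkpoints(ckpts)
```

With real tensors the same mean would be taken per parameter tensor rather than per list element.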

Acknowledgments

Training stack: PR #162 "Record: Int6 MLP3x + SmearGate + BigramHash + MuonWD + SWA (mean val_bpb=1.1483)" (raahilshah); PR #180 "Record: 10L Int5-MLP + BigramHash(10240) + SWA(0.4) + WD=0.04 (val_bpb=1.1428, mean 3 seeds)" (thwu1)
TTT approach: PR #77 "[record bpb=1.195] sliding window + LoRA TTT" (samacqua); PR #442 "Record: 11L EMA + AdamW TTT 10ep (mean val_bpb=1.1027)" (sjp611)

valerio-oai · 2026-03-24T14:12:37Z

As far as I can tell here, this proposed TTT scheme trains on the validation set by reporting the score on a doc after its weights have adapted to it, rendering this unsound for the purposes of this competition. Specifically, you use #442's TTT scheme, which was ruled out.
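
The objection above comes down to ordering: a document's score is reported after the weights have already adapted to that same document. A toy sketch of why that ordering inflates the result, assuming an add-one smoothed byte unigram model as a stand-in for the real network (`UnigramModel`, `adapt_then_score`, and `score_then_adapt` are all hypothetical names; the second ordering is shown only for contrast and is not a statement of what the competition permits):

```python
import math
from collections import Counter


class UnigramModel:
    """Toy stand-in for a language model: add-one smoothed byte unigram."""

    def __init__(self) -> None:
        self.counts = Counter()
        self.total = 0

    def bits_per_byte(self, doc: bytes) -> float:
        # Average negative log2 probability per byte under the current counts.
        bits = 0.0
        for b in doc:
            p = (self.counts[b] + 1) / (self.total + 256)
            bits -= math.log2(p)
        return bits / len(doc)

    def adapt(self, doc: bytes) -> None:
        # "Training" step: fold the document's bytes into the counts.
        self.counts.update(doc)
        self.total += len(doc)


def adapt_then_score(docs):
    """Each doc is scored after the model has already adapted to it."""
    model, scores = UnigramModel(), []
    for doc in docs:
        model.adapt(doc)
        scores.append(model.bits_per_byte(doc))
    return sum(scores) / len(scores)


def score_then_adapt(docs):
    """Each doc is scored before the model sees it, then used for adaptation."""
    model, scores = UnigramModel(), []
    for doc in docs:
        scores.append(model.bits_per_byte(doc))
        model.adapt(doc)
    return sum(scores) / len(scores)


val_docs = [b"aaaa bbbb", b"cccc dddd", b"eeee ffff"]
leaky = adapt_then_score(val_docs)
score_first = score_then_adapt(val_docs)
```

On this toy data `leaky` comes out lower than `score_first`, because in the first ordering every scored byte has already been counted by the model.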

LoquiAuris · 2026-03-24T14:50:13Z

Hi Valerio,

Thank you for the clarification. Understood, I'll focus on architectural improvements going forward.

Separate question: are compute grants still available? I submitted a request through the form and wanted to confirm it's still being processed. My current work is focused on non-TTT architecture optimizations (11L, improved quantization, activation functions) and additional compute would help with validation runs.

Best regards,
Eli Pancamo


valerio-oai · 2026-03-24T15:46:23Z

Hi Eli, thanks for understanding, and don't take the closing of this PR as me trying to clip your wings! Compute grants are still available, they just take some time to come through :)

LoquiAuris · 2026-03-24T15:55:46Z

Valerio,

Thanks for the info on the grants. And no worries: wings not clipped, wings are stronger. I'm glad the challenge is back to being based on the core architecture, where it really matters.

Kindest regards,
Eli


PR openai#595 achieves 1.1100 BPB with AdamW TTT (10ep, lr=5e-4). Add TTT_OPTIMIZER env var to switch between SGD (default) and AdamW.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
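
The commit message above describes a `TTT_OPTIMIZER` environment variable that selects between SGD (the default) and AdamW. The referencing commit's code is not shown here, so this is only a sketch of how such a switch might be read. The function name `read_ttt_optimizer`, the lowercase normalization, and the error on unknown values are assumptions. In a PyTorch training script the returned name would presumably be mapped to `torch.optim.SGD` or `torch.optim.AdamW`.

```python
import os
from typing import Mapping, Optional

# Only the two optimizers named in the commit message are accepted.
ALLOWED_TTT_OPTIMIZERS = ("sgd", "adamw")


def read_ttt_optimizer(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the TTT optimizer name from the environment, defaulting to SGD.

    Case-insensitive matching is an assumption made for this sketch.
    """
    env = os.environ if env is None else env
    name = env.get("TTT_OPTIMIZER", "sgd").strip().lower()
    if name not in ALLOWED_TTT_OPTIMIZERS:
        raise ValueError(
            f"TTT_OPTIMIZER must be one of {ALLOWED_TTT_OPTIMIZERS}, got {name!r}"
        )
    return name


# Usage with explicit mappings, so the example does not depend on the real environment.
default_choice = read_ttt_optimizer({})
adamw_choice = read_ttt_optimizer({"TTT_OPTIMIZER": "AdamW"})
```

Passing the mapping explicitly keeps the selection logic testable without touching the process environment.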

Loqui Auris Inc added 2 commits March 23, 2026 22:55

Record: Loqui Auris — 10L + SWA + Standard TTT (val_bpb=1.1100)

e26ad36

Update README with detailed description for Standard TTT submission

7868f12

valerio-oai closed this Mar 24, 2026

0hq mentioned this pull request Mar 25, 2026

Illegal submissions megathread #677

Open

notapplica mentioned this pull request Mar 25, 2026

⛳ Parameter Golf Live AI Commentary ⛳ + Analysis / Ideas | every 10 minutes #140

Open

LoquiAuris wants to merge 2 commits into openai:main from LoquiAuris:loqui-10L-swa-ttt
