Record: 11L + order-adaptive 9-gram backoff (mean val_bpb=0.9059) #788

Open
hypery11 wants to merge 1 commit into openai:main from hypery11:submission/2026-03-25_champion

@hypery11

Results

| Seed | val_bpb |
|------|---------|
| 42   | 0.9067  |
| 1337 | 0.9059  |
| 2024 | 0.9050  |
| Mean | 0.9059  |
| Std  | 0.0009  |

  • Artifact: 13.99 MB
  • Train: 600s on 8xH100 SXM
  • Eval: ~150s
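
As a quick sanity check on the summary rows, the mean and standard deviation can be recomputed from the three per-seed scores listed above. The sketch below uses only Python's `statistics` module. It suggests the reported 0.0009 corresponds to the sample standard deviation (n-1 denominator); that reading is an inference from the arithmetic, not something the PR states.

```python
import statistics

# Per-seed val_bpb values copied from the Results table.
seed_scores = {42: 0.9067, 1337: 0.9059, 2024: 0.9050}
values = list(seed_scores.values())

mean_bpb = statistics.mean(values)
sample_std = statistics.stdev(values)       # n-1 denominator
population_std = statistics.pstdev(values)  # n denominator

print(f"mean           = {mean_bpb:.4f}")
print(f"sample std     = {sample_std:.4f}")
print(f"population std = {population_std:.4f}")
```

With these inputs the sample standard deviation rounds to 0.0009, while the population form rounds to 0.0007.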

Method

11-layer transformer with XSA-all, LeakyReLU(0.5)^2, Value Residual, Gated Attention. GPTQ-lite int6 + zstd-22.
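
For readers unfamiliar with the notation, one plausible reading of `LeakyReLU(0.5)^2` is a leaky ReLU with negative slope 0.5 whose output is then squared. The scalar sketch below illustrates that reading only; the function name is hypothetical and the PR does not show the actual implementation.

```python
def leaky_relu_squared(x: float, negative_slope: float = 0.5) -> float:
    """Assumed reading of LeakyReLU(0.5)^2: leaky ReLU, then square."""
    y = x if x >= 0.0 else negative_slope * x
    return y * y


# Negative inputs are scaled by 0.5 before squaring, so the output is
# non-negative everywhere but smaller on the negative side.
print(leaky_relu_squared(2.0))   # 4.0
print(leaky_relu_squared(-2.0))  # 1.0
```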

Order-adaptive entropy-gated n-gram backoff cache (orders 2-9). Higher-order matches use a lower entropy threshold for mixing. Score-first, deterministic, no TTT.
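
The cache described above could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the submission's code: the class and function names, the threshold schedule (`base`, `step`), the mixing weight `alpha`, and the choice to gate on the base model's predictive entropy are all hypothetical. It backs off from order 9 down to order 2, mixes only when the model entropy exceeds the threshold for the matched order, and scores each token before adding it to the cache.

```python
import math
from collections import Counter, defaultdict

MIN_ORDER, MAX_ORDER = 2, 9


def entropy_threshold(order, base=1.5, step=0.1):
    # Hypothetical schedule: longer matches get a lower threshold,
    # so they are mixed in even when the model is fairly confident.
    return base - step * (order - MIN_ORDER)


class NgramBackoffCache:
    def __init__(self):
        # One table per order: context tuple -> counts of next tokens.
        self.tables = {n: defaultdict(Counter)
                       for n in range(MIN_ORDER, MAX_ORDER + 1)}

    def lookup(self, history):
        # Back off from the longest context to the shortest.
        for n in range(MAX_ORDER, MIN_ORDER - 1, -1):
            if len(history) < n - 1:
                continue
            counts = self.tables[n].get(tuple(history[-(n - 1):]))
            if counts:
                total = sum(counts.values())
                return n, {tok: c / total for tok, c in counts.items()}
        return None, None

    def update(self, history, token):
        for n in range(MIN_ORDER, MAX_ORDER + 1):
            if len(history) >= n - 1:
                self.tables[n][tuple(history[-(n - 1):])][token] += 1


def mixed_prob(model_probs, token, cache, history, alpha=0.5):
    # Entropy (bits) of the base model's next-token distribution.
    h = -sum(p * math.log2(p) for p in model_probs.values() if p > 0)
    order, ngram_probs = cache.lookup(history)
    p_model = model_probs.get(token, 0.0)
    if order is None or h <= entropy_threshold(order):
        return p_model  # gate closed: use the base model alone
    return (1 - alpha) * p_model + alpha * ngram_probs.get(token, 0.0)


def score_stream(tokens, model_fn):
    # Score-first: each token is scored before it enters the cache.
    cache, history, total_bits = NgramBackoffCache(), [], 0.0
    for tok in tokens:
        p = mixed_prob(model_fn(history), tok, cache, history)
        total_bits += -math.log2(p)
        cache.update(history, tok)
        history.append(tok)
    return total_bits


# Toy usage: a uniform 4-token "model" on a periodic stream.
uniform = lambda history: {t: 0.25 for t in range(4)}
bits = score_stream([0, 1, 2, 3] * 4, uniform)
print(f"{bits:.2f} bits with cache vs 32.00 bits without")
```

In the toy run the uniform model alone costs 2 bits per token; once the cache has seen a context, the mixed probability rises and the total cost falls. Because the cache is only updated after scoring and uses plain counts, the procedure is deterministic and never sees a token before predicting it.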

  • 8xH100 SXM, train <=600s
  • Eval <=600s (~150s)
  • Artifact <=16MB (13.99MB)
  • 3-seed validation (std 0.0009)

Seeds: 0.9067 / 0.9059 / 0.9050 (std 0.0009).
Order-adaptive entropy gating on 2-9 gram backoff.
13.99MB artifact. Train 600s, eval ~150s.

abaybektursun added a commit to abaybektursun/parameter-golf that referenced this pull request on Mar 26, 2026
- Base model is ValCalib GPTQ (1.1142 BPB), not PR openai#549 (1.1194)
- Remove stale "not yet deployed" / "we estimate" for EXP-11
- Note α=0.80 (939s) exceeds 600s budget
- Fix PR openai#727 score to 0.9674, PR openai#788 to 0.9059
- Fix PR openai#596 BPB to 0.6430
- "Approved" → "Technique deemed legal" for closed PRs
- Add bucket sweep and per-token overhead proposal
- Replace "neural" with "base LM" throughout

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>