Record: Backoff N-gram Cache + LeakyReLU(0.9)² (val_bpb=0.6678)#806

Closed
ibarrajo wants to merge 1 commit into openai:main from ibarrajo:submission/ngram-cache-0.6678

Conversation

@ibarrajo

Summary

  • val_bpb: 0.6678 (seed 1337, additional seeds pending)
  • Multi-order backoff n-gram eval cache (orders 2-7) with entropy-adaptive alpha mixing
  • Distributed cache pre-fill for multi-GPU coherence (rank 7 pre-fills 54M tokens in 68s)
  • LeakyReLU(0.9)² activation (~0.013 BPB improvement over relu²)
  • Neural base: 1.1371 BPB (sliding window), n-gram cache: 0.6678 BPB
  • Artifact: 8.6MB (well under 16MB limit)
  • 8xH100 SXM, 7189 steps in 600s, eval in 200s
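The LeakyReLU(0.9)² activation isn't spelled out in the diff shown here. A minimal sketch, assuming it mirrors the relu² activation it replaces (apply LeakyReLU with negative slope 0.9, then square — the function name is hypothetical):

```python
import numpy as np

def leaky_relu_sq(x, negative_slope=0.9):
    # Hypothetical sketch: LeakyReLU with slope 0.9, then squared,
    # by analogy with the relu^2 activation the PR says it improves on.
    y = np.where(x >= 0, x, negative_slope * x)
    return y * y
```

Note that squaring makes the output non-negative on both sides: for x < 0 the branch contributes 0.81·x², so the negative lobe is only slightly damped relative to the positive one.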

Key implementation details

  • Score-first legality: Every token scored under inference_mode() BEFORE cache update
  • Entropy-adaptive alpha: 0.05 + 0.55 * sigmoid(2*(H-4)) — no oracle/hindsight selection
  • Pre-fill: Each GPU rank pre-populates cache with all preceding tokens (pure numpy, no NCCL)
  • No pre-eval TTT — removed illegal pre-eval adaptation entirely
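As a rough sketch of the entropy-adaptive mixing above — the alpha formula is quoted from the PR, but treating H as the base-2 entropy of the neural distribution and mixing at the probability level are assumptions, not the PR's code:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def adaptive_alpha(H):
    # Formula quoted in the PR: alpha = 0.05 + 0.55 * sigmoid(2 * (H - 4))
    return 0.05 + 0.55 * sigmoid(2.0 * (H - 4.0))

def mix(neural_probs, ngram_probs):
    # Hypothetical mixing step: when the neural model is uncertain
    # (high entropy H), lean more on the n-gram cache.
    # alpha stays in the open interval (0.05, 0.60).
    p = np.asarray(neural_probs, dtype=np.float64)
    H = -np.sum(p * np.log2(np.clip(p, 1e-12, 1.0)))
    a = adaptive_alpha(H)
    return (1.0 - a) * p + a * np.asarray(ngram_probs, dtype=np.float64)
```

Because alpha depends only on the neural model's own entropy at the current position, the mixing weight is computed without looking at the correct token, consistent with the "no oracle/hindsight selection" claim.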

Results

Eval method                      val_bpb
Non-overlapping (post-quant)     1.1594
Sliding window (stride=64)       1.1371
N-gram cache (orders 2-7)        0.6678

Test plan

  • Validated on 1xH100 (0.8556 BPB with undertrained model)
  • Full run on 8xH100 SXM (0.6678 BPB)
  • 2 additional seeds for statistical significance
  • Verify reproducibility from records/ folder

🤖 Generated with Claude Code

Multi-order backoff n-gram eval cache (orders 2-7) with entropy-adaptive
alpha mixing and distributed cache pre-fill for multi-GPU coherence.
Neural base 1.1371 BPB, n-gram cache drops to 0.6678. 8xH100 SXM,
7189 steps in 600s. Single seed (1337), additional seeds pending.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@MatoTeziTanka

MatoTeziTanka commented Mar 26, 2026

Clean implementation — the distributed cache pre-fill solving the multi-GPU table fragmentation problem is a useful contribution, and the 8.6 MB artifact size gives you a lot of headroom.

Heads up: this currently has 1 seed. The leaderboard requires 3-seed validation with statistical significance for record claims. Just flagging so it's on your radar before review.


Disclosure: I use Claude Code CLI, Codex CLI, and Gemini Pro as tools in my workflow. Human first, AI-assisted.

@ibarrajo
Author

Closing: the n-gram cache BPB scores are invalid due to a normalization bug (the cache scores only the correct token, without normalizing over the full vocabulary). Separately, n-gram caches were ruled illegal on March 27.
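For reference, the fix the bug points at is normalizing over the full vocabulary rather than scoring only the observed token. A minimal sketch of a multi-order backoff cache that does normalize correctly — the class layout, add-one smoothing, and parameter names here are hypothetical, not the PR's code:

```python
from collections import defaultdict

class BackoffNgramCache:
    # Hypothetical sketch: counts per order (2-7 by default), backing off
    # from the highest order whose context has been seen.
    def __init__(self, orders=range(2, 8), vocab_size=256):
        self.orders = sorted(orders, reverse=True)
        self.vocab_size = vocab_size
        self.counts = {n: defaultdict(lambda: defaultdict(int)) for n in self.orders}

    def update(self, tokens):
        for n in self.orders:
            for i in range(len(tokens) - n + 1):
                ctx = tuple(tokens[i:i + n - 1])
                self.counts[n][ctx][tokens[i + n - 1]] += 1

    def prob(self, context, token):
        # Add-one smoothing over the full vocabulary, so probabilities
        # for any fixed context sum to exactly 1 — the property the
        # buggy scorer lacked.
        for n in self.orders:
            ctx = tuple(context[-(n - 1):])
            bucket = self.counts[n].get(ctx)
            if bucket:
                total = sum(bucket.values())
                return (bucket.get(token, 0) + 1) / (total + self.vocab_size)
        return 1.0 / self.vocab_size
```

Without that full-vocabulary denominator, per-token "probabilities" can sum to more or less than 1, which deflates the reported BPB.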

@ibarrajo ibarrajo closed this Mar 31, 2026