Record: Compliance-First Packed Causal Memory + Dirichlet Mixing — val_bpb 0.01654407 (3-seed mean)#943

Closed
aamodbhatt wants to merge 2 commits into openai:main from aamodbhatt:record-2026-03-27-compliance-dirichlet

Conversation

@aamodbhatt

PR Title

Record: Compliance-First Packed Causal Memory + Dirichlet Mixing — val_bpb 0.01654407 (3-seed mean)

PR Body

Record Summary

Final submitted score (final_ngram_exact): val_bpb 0.01654407 (std 0.00000551)

Reference roundtrip (final_research_export_exact): val_bpb 1.16101812 (std 0.00024260)

Hardware: 8x H100.

Worst-case limits over confirmed seeds:

  • train: 563.062s (<=600s)
  • eval: 280.092s (<=600s)
  • size: 13,810,840 bytes (<=16,000,000)

What Changed

  • Added packed causal n-gram memory path (built from train shards, loaded at eval start).
  • Added Dirichlet-normalized multi-order mixing and count-confidence gating.
  • Evaluated optional phrase-suffix expert; retained Dirichlet-only config as winner.
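The Dirichlet-normalized multi-order mixing with count-confidence gating can be sketched roughly as follows. The PR does not show the actual code from train_gpt.py, so the pseudo-count `alpha`, the gating constant `gate_c`, and the low-to-high-order chaining are illustrative assumptions, not the record's exact configuration:

```python
import numpy as np

def dirichlet_mix(order_counts, alpha=0.3, gate_c=5.0):
    """Blend per-order next-token count vectors into one distribution.

    order_counts: list of count vectors (lowest n-gram order first), each of
    length vocab_size. `alpha` is a Dirichlet pseudo-count and `gate_c`
    down-weights orders with few observations (count-confidence gating);
    both values here are illustrative, not the record's.
    """
    vocab = len(order_counts[0])
    mixed = np.full(vocab, 1.0 / vocab)     # start from the uniform prior
    for counts in order_counts:             # each order refines the prior
        total = counts.sum()
        post = (counts + alpha) / (total + alpha * vocab)  # Dirichlet posterior
        gate = total / (total + gate_c)     # confidence in this order's evidence
        mixed = (1.0 - gate) * mixed + gate * post
    return mixed

# A well-observed high order pulls the mixture toward its prediction:
probs = dirichlet_mix([np.array([8., 2., 0., 0.]), np.array([0., 3., 0., 0.])])
```

Chaining each order's posterior into the next order's prior is also the "hierarchical CTW" structure described in the follow-up commits below on this page.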

3-Seed Results

Seed   final val_bpb   roundtrip val_bpb   train_s   eval_s    bytes_total
1337   0.01654988      1.16126036          563.035   275.583   13,801,440
42     0.01654339      1.16077516          563.033   277.124   13,810,840
2025   0.01653893      1.16101883          563.062   280.092   13,808,176
Mean   0.01654407      1.16101812          -         -         -
Std    0.00000551      0.00024260          -         -         -
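The Mean and Std rows can be reproduced from the three per-seed final values (the reported std is the sample standard deviation, ddof=1):

```python
import statistics

vals = [0.01654988, 0.01654339, 0.01653893]  # seeds 1337, 42, 2025
mean = statistics.mean(vals)
std = statistics.stdev(vals)  # sample std (n-1), matching the reported value
print(f"{mean:.8f} {std:.8f}")  # -> 0.01654407 0.00000551
```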

Submission Checklist

  • One new folder added under records/track_10min_16mb
  • README.md included
  • submission.json included
  • train_gpt.py included
  • train logs included (train_seed1337.log, train_seed42.log, train_seed2025.log)
  • train and eval under 10 minutes
  • artifact under 16MB
  • no tokenizer/dataset edits
  • score-first ordering preserved (no hindsight path)

Metric Verification

  • Submission metric sourced from final_ngram_exact in seed logs.
  • Reference metric sourced from final_research_export_exact in seed logs.

@aamodbhatt
Author

Superseded by #944 (clean branch from upstream/main with one-folder submission diff).

@aamodbhatt aamodbhatt closed this Mar 27, 2026
sofiabod added a commit to sofiabod/parameter-golf that referenced this pull request Mar 28, 2026
…rder-13

Key fixes:
- Scale counts to preserve full/ctx RATIOS (not just cap at 65535)
- Hierarchical CTW mixing: each order's posterior → next order's prior
- c=5.0 (matching PR openai#943)
- 256K buckets, order-13, 80 shards

Previous uint8 capping destroyed ratios (both capped to 255 → ratio=1.0 everywhere).
New scaling preserves the actual probability ratios.
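The ratio-preservation fix above can be illustrated with a minimal example. The key point is that both count arrays must be divided by one shared factor; names here are illustrative:

```python
import numpy as np

def cap(counts, limit):
    """Old behaviour: saturate at the dtype max, which destroys ratios."""
    return np.minimum(counts, limit)

def scale_pair(full, ctx, limit):
    """New behaviour: shrink both count arrays by one shared factor so the
    full/ctx ratios survive quantization (small counts floored at 1)."""
    peak = max(full.max(), ctx.max())
    if peak <= limit:
        return full, ctx
    return (np.maximum(1, full * limit // peak),
            np.maximum(1, ctx * limit // peak))

full = np.array([400, 100])   # count(context, token) for two tokens
ctx = np.array([500, 500])    # count(context)

# uint8-style capping: 400 and 500 both saturate at 255 -> first ratio is 1.0
bad = cap(full, 255) / cap(ctx, 255)   # [1.0, ~0.39] vs the true [0.8, 0.2]
f, c = scale_pair(full, ctx, 255)
good = f / c                           # [0.8, 0.2], ratios preserved
```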
sofiabod added a commit to sofiabod/parameter-golf that referenced this pull request Mar 28, 2026
32K buckets with full int32 counts = 3.1MB for order-13.
openai#943 uses 32K buckets and gets 0.0165. The extreme collisions may actually
HELP Dirichlet mixing — more observations per bucket = tighter posteriors.
Full-precision counts preserve exact ratios.
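One plausible layout consistent with the commit: hash (context, token) and context separately into fixed-size int32 tables, so collisions pool observations while full-precision counts keep ratios exact. The two-table layout, vocab size, and pseudo-count below are assumptions, not the record's known format:

```python
import numpy as np

N_BUCKETS = 32 * 1024  # small table -> heavy collisions, as the commit notes

class BucketedCounts:
    """Hashed-bucket n-gram counts with full-precision int32 entries.

    This two-table layout (count(context, token) and count(context)) is an
    assumption for illustration, not necessarily the record's artifact format.
    """
    def __init__(self):
        self.full = np.zeros(N_BUCKETS, dtype=np.int32)  # count(context, token)
        self.ctx = np.zeros(N_BUCKETS, dtype=np.int32)   # count(context)

    def observe(self, context, token):
        self.full[hash((*context, token)) % N_BUCKETS] += 1
        self.ctx[hash(tuple(context)) % N_BUCKETS] += 1

    def prob(self, context, token, alpha=0.3, vocab=50257):
        f = self.full[hash((*context, token)) % N_BUCKETS]
        c = self.ctx[hash(tuple(context)) % N_BUCKETS]
        return (f + alpha) / (c + alpha * vocab)  # Dirichlet-smoothed estimate

bc = BucketedCounts()
for _ in range(3):
    bc.observe((1, 2, 3), 7)
```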
sofiabod added a commit to sofiabod/parameter-golf that referenced this pull request Mar 28, 2026
Enable two-pass eval (PR openai#943's key technique):
- Pass 1: score all tokens with sliding window, build cache
- Pass 2: rescore ALL positions using complete cache + hierarchical CTW
- Pre-warm cache from training artifact before both passes
- Eliminates cold-start problem — early tokens benefit from full cache
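The two-pass steps above can be sketched with a context→counts dict standing in for the cache; the order, pseudo-count, and vocab size are illustrative, and this returns mean bits per token rather than the benchmark's bpb:

```python
import math
from collections import defaultdict

ORDER = 3    # illustrative context length
ALPHA = 0.3  # illustrative Dirichlet pseudo-count
VOCAB = 256  # illustrative vocab size

def two_pass_bits(tokens, cache=None):
    """Pass 1 fills a context -> next-token count cache; pass 2 rescores
    every position against the completed cache, so early positions are not
    penalized by a cold start."""
    cache = cache if cache is not None else defaultdict(lambda: defaultdict(int))

    # Pass 1: build the cache (per the commit, it could be pre-warmed from
    # the training artifact before this loop).
    for i in range(ORDER, len(tokens)):
        cache[tuple(tokens[i - ORDER:i])][tokens[i]] += 1

    # Pass 2: rescore ALL positions using the complete cache.
    bits = 0.0
    for i in range(ORDER, len(tokens)):
        counts = cache[tuple(tokens[i - ORDER:i])]
        p = (counts[tokens[i]] + ALPHA) / (sum(counts.values()) + ALPHA * VOCAB)
        bits += -math.log2(p)
    return bits / (len(tokens) - ORDER)
```

On a repetitive stream the completed cache makes every position cheap to encode, whereas a novel stream stays near the smoothed floor, e.g. `two_pass_bits([1, 2, 3, 4] * 50)` comes out far below `two_pass_bits(list(range(100)))`.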