Record: Compliance-First Packed Causal Memory + Dirichlet Mixing — val_bpb 0.01654407 (3-seed mean) #943
Closed
aamodbhatt wants to merge 2 commits into openai:main from
Superseded by #944 (clean branch from upstream/main with one-folder submission diff).
sofiabod added a commit to sofiabod/parameter-golf that referenced this pull request on Mar 28, 2026:
…rder-13
Key fixes:
- Scale counts to preserve full/ctx ratios (not just cap at 65535)
- Hierarchical CTW mixing: each order's posterior becomes the next order's prior
- c=5.0 (matching PR openai#943)
- 256K buckets, order-13, 80 shards
Previous uint8 capping destroyed ratios (both counts capped to 255, so ratio = 1.0 everywhere). New scaling preserves the actual probability ratios.
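The two fixes in this commit message can be illustrated with a small sketch. It is not the code from the commit: the function names `cap_counts`, `scale_counts`, and `hierarchical_mix` are hypothetical, and the update rule `(full + c * prior) / (ctx + c)` is the standard Dirichlet posterior predictive, assumed here to be what "each order's posterior becomes the next order's prior" refers to.

```python
import numpy as np

def cap_counts(ctx, full, limit=255):
    """Independent capping: large counts all collapse to `limit`."""
    return np.minimum(ctx, limit), np.minimum(full, limit)

def scale_counts(ctx, full, limit=65535):
    """Scale each (ctx, full) pair by one shared factor so the larger
    count fits in `limit`; the full/ctx ratio is unchanged.
    Results stay as floats; rounding to integers would add a small error."""
    ctx = np.asarray(ctx, dtype=np.float64)
    full = np.asarray(full, dtype=np.float64)
    peak = np.maximum(ctx, full)
    factor = np.where(peak > limit, limit / np.maximum(peak, 1.0), 1.0)
    return ctx * factor, full * factor

def hierarchical_mix(base_prob, counts_by_order, c=5.0):
    """Walk from the lowest to the highest order. At each order the
    running estimate acts as the prior mean with concentration `c`.
    `counts_by_order` holds (full_count, ctx_count) pairs (assumed layout)."""
    p = base_prob
    for full, ctx in counts_by_order:
        p = (full + c * p) / (ctx + c)
    return p

# Capping destroys the ratio: 400/1000 becomes 255/255.
capped_ctx, capped_full = cap_counts(1000, 400)
# Joint scaling keeps 50000/200000 = 0.25 intact.
scaled_ctx, scaled_full = scale_counts(200000, 50000)
# Two orders of evidence refine a base probability of 0.01.
mixed = hierarchical_mix(0.01, [(3, 10), (2, 2)], c=5.0)
```

With `c=5.0`, an order that has seen only a handful of contexts moves the estimate modestly, while an order with many observations dominates the inherited prior.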
sofiabod added a commit to sofiabod/parameter-golf that referenced this pull request on Mar 28, 2026:
32K buckets with full int32 counts = 3.1MB for order-13. openai#943 uses 32K buckets and gets 0.0165. The extreme collisions may actually help Dirichlet mixing: more observations per bucket means tighter posteriors. Full-precision counts preserve exact ratios.
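The 3.1MB figure can be reproduced with simple arithmetic under one possible table layout. The layout is an assumption, not something the commit message states: orders 2 through 13, with two count tables per order (one for context counts, one for full n-gram counts).

```python
BUCKETS = 32 * 1024          # 32K hash buckets per table (from the commit message)
BYTES_PER_COUNT = 4          # int32 counts (from the commit message)
MIN_ORDER, MAX_ORDER = 2, 13 # assumed range of n-gram orders
TABLES_PER_ORDER = 2         # assumed: one ctx table and one full table

num_orders = MAX_ORDER - MIN_ORDER + 1
total_bytes = num_orders * TABLES_PER_ORDER * BUCKETS * BYTES_PER_COUNT
total_mb = total_bytes / 1_000_000  # decimal megabytes

print(f"{num_orders} orders -> {total_bytes:,} bytes ({total_mb:.1f} MB)")
```

Under these assumptions the total is 3,145,728 bytes, which rounds to the quoted 3.1MB. A different order range or table count would give a different total.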
sofiabod added a commit to sofiabod/parameter-golf that referenced this pull request on Mar 28, 2026:
Enable two-pass eval (PR openai#943's key technique):
- Pass 1: score all tokens with sliding window, build cache
- Pass 2: rescore ALL positions using complete cache + hierarchical CTW
- Pre-warm cache from training artifact before both passes
- Eliminates cold-start problem: early tokens benefit from full cache
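The control flow of the two passes can be sketched with a toy bigram cache. This is an illustration only: `BigramCache` and `two_pass_bits` are hypothetical names, the cache is a plain dictionary instead of hashed buckets, and a single bigram order stands in for the sliding-window model and hierarchical mixing described in the commit message.

```python
import math
from collections import defaultdict

class BigramCache:
    """Toy count cache with a Dirichlet-smoothed bigram estimate."""
    def __init__(self, vocab_size, c=5.0):
        self.vocab_size = vocab_size
        self.c = c
        self.full = defaultdict(int)  # counts of (prev, tok)
        self.ctx = defaultdict(int)   # counts of prev

    def update(self, prev, tok):
        self.full[(prev, tok)] += 1
        self.ctx[prev] += 1

    def prob(self, prev, tok):
        prior = 1.0 / self.vocab_size  # uniform prior mean
        return (self.full[(prev, tok)] + self.c * prior) / (self.ctx[prev] + self.c)

def two_pass_bits(tokens, vocab_size, warm_tokens=()):
    cache = BigramCache(vocab_size)
    # Optional pre-warm from tokens available before evaluation starts.
    for prev, tok in zip(warm_tokens, warm_tokens[1:]):
        cache.update(prev, tok)
    pairs = list(zip(tokens, tokens[1:]))
    # Pass 1: score each position with the cache built so far, then update.
    pass1 = 0.0
    for prev, tok in pairs:
        pass1 -= math.log2(cache.prob(prev, tok))
        cache.update(prev, tok)
    # Pass 2: rescore every position against the complete cache, which now
    # includes counts gathered from later positions.
    pass2 = sum(-math.log2(cache.prob(prev, tok)) for prev, tok in pairs)
    return pass1, pass2

bits_pass1, bits_pass2 = two_pass_bits([0, 1, 0, 1, 0, 1], vocab_size=2)
```

On this repetitive toy sequence the second pass costs fewer bits than the first, because the earliest positions are no longer scored against an empty cache.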
This was referenced Mar 28, 2026
PR Title
Record: Compliance-First Packed Causal Memory + Dirichlet Mixing — val_bpb 0.01654407 (3-seed mean)
PR Body
Record Summary
Final submitted score (final_ngram_exact): val_bpb 0.01654407 (std 0.00000551)
Reference roundtrip (final_research_export_exact): val_bpb 1.16101812 (std 0.00024260)
Hardware: 8x H100.
Worst-case limits over confirmed seeds:
- 563.062s (<=600s)
- 280.092s (<=600s)
- 13,810,840 bytes (<=16,000,000)
What Changed
3-Seed Results
Submission Checklist
- records/track_10min_16mb
- README.md included
- submission.json included
- train_gpt.py included
- Seed logs included (train_seed1337.log, train_seed42.log, train_seed2025.log)
Metric Verification
- final_ngram_exact in seed logs.
- final_research_export_exact in seed logs.
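The worst-case figures listed under Record Summary can be checked against their stated limits with a few lines. The dictionary keys below are placeholders, since the labels for the two timing values are not present in the text; only the numbers and limits come from the PR body.

```python
# (reported worst case, stated limit); key names are hypothetical placeholders.
WORST_CASE = {
    "timing_1_seconds": (563.062, 600.0),
    "timing_2_seconds": (280.092, 600.0),
    "size_bytes": (13_810_840, 16_000_000),
}

def check_limits(entries):
    """Return the remaining headroom for each entry; raise if any limit is exceeded."""
    headroom = {}
    for name, (value, limit) in entries.items():
        if value > limit:
            raise ValueError(f"{name}: {value} exceeds limit {limit}")
        headroom[name] = limit - value
    return headroom

headroom = check_limits(WORST_CASE)
for name, slack in headroom.items():
    print(f"{name}: within limit, headroom {slack:,.3f}")
```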