Exploratory: PR315-derived candidate and looped-depth gate #453

Open
Divyesh-Thirukonda wants to merge 2 commits into openai:main from Divyesh-Thirukonda:codex/pr315-frontier-records

Divyesh-Thirukonda commented Mar 22, 2026

Status

This PR is exploratory and is not claiming a new leaderboard record.

The attached 8xH100 run for the main candidate fits under the size cap (15,733,011 bytes total), but it does not beat the existing PR315 frontier reference.

What Is In This PR

Two record folders:

  • records/track_10min_16mb/2026-03-22_11L_XSA4_EMA_PartialRoPE_LNScale_Entropy_LongDocTTT
  • records/track_non_record_16mb/2026-03-22_PR315_LoopedDepth_Gate

The first folder captures a PR315-derived candidate with experimental codec and TTT branches behind flags. The second folder keeps looped-depth gate work separate from the primary path.

Official 8xH100 Result For The Main Candidate

From train_seed42.log:

  • stopped at step 4625 on the 600.037s wallclock cap
  • peak memory 26152 MiB allocated / 26526 MiB reserved
  • total submission size 15,733,011 bytes
  • final_quant_roundtrip_exact val_bpb = 1.16892776
  • final_quant_sliding_window_exact val_bpb = 1.14586586

At 1.14586586, the sliding-window val_bpb is about 0.021 higher (worse) than the checked-in PR315 reference (1.1248 sliding-window val_bpb), so this PR should be treated as implementation and investigation work only.
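For reviewers who want to sanity-check the comparison, here is a minimal sketch that recomputes the gap and the size headroom from the numbers quoted above. The `summarize` helper is hypothetical and not part of this PR, and the cap value of 16,000,000 bytes is an assumption inferred from the `16mb` track name; adjust it if the track defines the cap differently.

```python
# Values copied from the PR description (summary of train_seed42.log).
SUBMISSION_BYTES = 15_733_011
SLIDING_WINDOW_BPB = 1.14586586
PR315_REFERENCE_BPB = 1.1248

# Assumption: "16mb" in the track name means 16,000,000 bytes.
ASSUMED_SIZE_CAP_BYTES = 16_000_000


def summarize(candidate_bpb, reference_bpb, size_bytes, cap_bytes):
    """Hypothetical helper: size headroom and gap to the reference.

    Lower val_bpb is better, so a positive gap means the candidate is worse.
    """
    return {
        "under_cap": size_bytes <= cap_bytes,
        "headroom_bytes": cap_bytes - size_bytes,
        "bpb_gap": round(candidate_bpb - reference_bpb, 8),
        "beats_reference": candidate_bpb < reference_bpb,
    }


summary = summarize(
    SLIDING_WINDOW_BPB,
    PR315_REFERENCE_BPB,
    SUBMISSION_BYTES,
    ASSUMED_SIZE_CAP_BYTES,
)
print(summary)
```

Under the assumed cap, the run has room to spare on size while still trailing the reference on score, which matches the status stated at the top of this PR.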

Next Step Outside This PR

The actual leaderboard path is now a separate exact-reproduction effort: recover PR315 throughput and score parity on the official image first, then make one change at a time.
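As a rough illustration of the "one change at a time" plan, the sketch below enumerates configs that each differ from a baseline in exactly one setting. The flag names and the `one_change_variants` helper are hypothetical and chosen only for illustration; they are not the actual flags used in the record folders.

```python
def one_change_variants(baseline, candidates):
    """Yield (changed_key, config) pairs that differ from baseline in one key."""
    for key, value in candidates.items():
        if baseline.get(key) == value:
            continue  # same as baseline, so not a change worth running
        variant = dict(baseline)  # shallow copy keeps the baseline untouched
        variant[key] = value
        yield key, variant


# Hypothetical flag names, for illustration only.
baseline = {
    "use_entropy_codec": False,
    "use_long_doc_ttt": False,
    "use_looped_depth_gate": False,
}
candidates = {
    "use_entropy_codec": True,
    "use_long_doc_ttt": True,
    "use_looped_depth_gate": True,
}

variants = list(one_change_variants(baseline, candidates))
for changed_key, config in variants:
    print(changed_key, config)
```

Each variant would only be worth running after the baseline itself reproduces PR315 throughput and score, so that any difference can be attributed to the single changed setting.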

Divyesh-Thirukonda changed the title from "Add PR315-derived candidate record and looped-depth gate" to "Exploratory: PR315-derived candidate and looped-depth gate" on Mar 22, 2026