[Non-record] Eval-time Adaptation: Stride-OGD + Two-Pass + NTK-RoPE by kellyvv · Pull Request #241 · openai/parameter-golf

kellyvv · 2026-03-20T17:55:38Z

Three eval-time adaptation techniques

Stride-OGD: Online gradient descent on 1024-dim vocab bias, updated every stride (64 tokens). Exact gradient (no backprop), zero artifact cost, 16× faster feedback than TTT LoRA.
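A minimal sketch of the stride-wise update, not the code from this PR. It assumes the bias is added to the model's output logits and trained with plain SGD on the cross-entropy loss, whose gradient with respect to the bias is `softmax(logits + bias) - onehot(target)` in closed form. The function name, learning rate, and synthetic data are hypothetical.

```python
import numpy as np

def log_softmax(z):
    """Numerically stable log-softmax over the last axis."""
    z = z - z.max(axis=-1, keepdims=True)
    return z - np.log(np.exp(z).sum(axis=-1, keepdims=True))

def stride_ogd_nll(logits, targets, stride=64, lr=0.1):
    """Score tokens stride by stride, updating a vocab bias after each stride.

    Every token is scored with the bias learned from earlier strides only,
    so the bias update never sees a token before that token is scored.
    """
    n, vocab = logits.shape
    bias = np.zeros(vocab)          # starts at zero, so nothing is stored in the artifact
    total_nll = 0.0
    for start in range(0, n, stride):
        t = targets[start:start + stride]
        rows = np.arange(len(t))
        logp = log_softmax(logits[start:start + stride] + bias)
        total_nll -= logp[rows, t].sum()
        # Closed-form gradient of cross-entropy w.r.t. the bias: probs - onehot.
        grad = np.exp(logp)
        grad[rows, t] -= 1.0
        bias -= lr * grad.mean(axis=0)
    return total_nll / n, bias

# Synthetic demo (assumed setup): a model with uniform logits and a skewed token stream.
rng = np.random.default_rng(0)
vocab, n = 1024, 4096
p = np.full(vocab, 0.5 / (vocab - 1))
p[3] = 0.5                          # token 3 makes up half the stream
targets = rng.choice(vocab, size=n, p=p)
logits = np.zeros((n, vocab))

base_nll, _ = stride_ogd_nll(logits, targets, lr=0.0)   # lr=0 disables adaptation
ogd_nll, bias = stride_ogd_nll(logits, targets, lr=1.0)
print(f"baseline NLL {base_nll:.3f}, stride-OGD NLL {ogd_nll:.3f}")
```

With `lr=0.0` the bias stays at zero and the function returns the unadapted loss, which makes the two calls directly comparable.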

Two-Pass Eval: Pass 1 collects per-token gradients → Pass 2 re-scores with accumulated bias correction. Fits in 600s eval budget.
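The two passes can be sketched as below. This is an illustration under assumptions, not the PR's implementation: it takes the "accumulated bias correction" to be a single step along the mean pass-1 gradient with respect to a vocab bias, and the function name and step size are hypothetical.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over the last axis."""
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def two_pass_nll(logits, targets, lr=1.0):
    """Return (pass-1 NLL, pass-2 NLL, bias) for one sequence of logits."""
    n, _ = logits.shape
    rows = np.arange(n)

    # Pass 1: score with no correction and collect per-token bias gradients.
    probs = softmax(logits)
    nll_pass1 = -np.log(probs[rows, targets]).mean()
    grads = probs.copy()
    grads[rows, targets] -= 1.0             # probs - onehot, one row per token

    # Accumulate the gradients into one bias correction (assumed: one mean step).
    bias = -lr * grads.mean(axis=0)

    # Pass 2: re-score every token with the accumulated correction.
    probs2 = softmax(logits + bias)
    nll_pass2 = -np.log(probs2[rows, targets]).mean()
    return nll_pass1, nll_pass2, bias

# Synthetic demo (assumed setup): uniform logits, one dominant token.
rng = np.random.default_rng(0)
vocab, n = 1024, 4096
p = np.full(vocab, 0.5 / (vocab - 1))
p[3] = 0.5
targets = rng.choice(vocab, size=n, p=p)
logits = np.zeros((n, vocab))

nll1, nll2, bias = two_pass_nll(logits, targets)
print(f"pass 1 NLL {nll1:.3f}, pass 2 NLL {nll2:.3f}")
```

Both passes here are single vectorized sweeps over the logits; in a real evaluation each pass would be a full model forward over the data, which is where the 600s budget matters.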

NTK-RoPE 4096: Eval at 4× context length without retraining via RoPE base rescaling.
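For reference, a sketch of the commonly used NTK-aware rescaling rule, `base' = base * scale^(d / (d - 2))` for head dimension `d`. Whether this PR uses exactly this exponent is an assumption, as are the base of 10000, the head dimension of 64, and the training length of 1024 (inferred from "4×" and "4096").

```python
import numpy as np

def ntk_scaled_base(base, scale, head_dim):
    """NTK-aware RoPE base for evaluating at `scale` times the training context."""
    return base * scale ** (head_dim / (head_dim - 2))

def rope_inv_freq(head_dim, base):
    """Per-pair rotation frequencies base^(-2i/d) for i = 0 .. d/2 - 1."""
    return base ** (-np.arange(0, head_dim, 2) / head_dim)

# Assumed values for illustration only.
train_len, eval_len = 1024, 4096
head_dim, base = 64, 10000.0
scale = eval_len / train_len                 # 4x context

new_base = ntk_scaled_base(base, scale, head_dim)
inv_old = rope_inv_freq(head_dim, base)
inv_new = rope_inv_freq(head_dim, new_base)

# The fastest-rotating pair is unchanged; the slowest is slowed by exactly `scale`,
# so its total rotation over eval_len matches what it saw over train_len.
print(inv_new[0] / inv_old[0], inv_old[-1] / inv_new[-1])
```

Intermediate frequencies are stretched by factors between 1 and `scale`, which is why local position resolution is largely preserved while the longest wavelengths cover the extended context.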

All techniques include working code and a synthetic demo. Run `python eval_stride_ogd.py` to verify.

Non-record: Eval-time Adaptation — Stride-OGD, Two-Pass, NTK-RoPE

2cb39d9

notapplica mentioned this pull request Mar 20, 2026

⛳ Parameter Golf Live AI Commentary ⛳ + Analysis / Ideas | every 10 minutes #140

Open

SkywardSyntax mentioned this pull request Mar 21, 2026

Non-record: 12L Low-Rank Q + QAT (1xH100, pre-quant 1.2035) #316

Open

6 tasks

kellyvv wants to merge 1 commit into openai:main from kellyvv:submission/eval-time-adaptation
