Fix: score final partial window in sliding window eval #124

Merged
0hq merged 1 commit into openai:main from mattqlf:fix/sliding-window-tail-v2
Mar 19, 2026

Conversation

mattqlf (Contributor) commented Mar 19, 2026

Fixes a bug in the sliding window evaluation from #50 where the final partial window (shorter than stride) was silently dropped, leaving up to stride-1 tokens unscored.

Two changes:

  • >= stride → >= 1 in the window filter (include all windows with at least 1 token)
  • wlen - stride → max(wlen - stride, 0) to handle short final windows without negative indexing
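
For illustration only, here is a minimal sketch of the two expressions in isolation. It is not the repository's code: the helper names (window_lengths, kept_starts, score_start) and the loss values are hypothetical, and it assumes windows start every stride tokens, are clipped at the end of the data, and that stride equals the window length to keep the example small.

```python
def window_lengths(total_tokens, seq_len, stride):
    """Return (start, length) for windows placed every `stride` tokens,
    clipped at the end of the data. Hypothetical helper."""
    return [(ws, min(ws + seq_len, total_tokens) - ws)
            for ws in range(0, total_tokens, stride)]


def kept_starts(total_tokens, seq_len, stride, min_len):
    """Keep only the windows that contain at least `min_len` tokens."""
    return [ws for ws, wlen in window_lengths(total_tokens, seq_len, stride)
            if wlen >= min_len]


def score_start(wlen, stride, clamp=True):
    """Offset inside a window where scoring begins."""
    s = wlen - stride
    return max(s, 0) if clamp else s


# Assumed toy sizes: 11 tokens, windows of 4, stride of 4.
total_tokens, seq_len, stride = 11, 4, 4

# Old filter (>= stride) drops the 3-token tail window starting at 8.
old_starts = kept_starts(total_tokens, seq_len, stride, min_len=stride)
# New filter (>= 1) keeps it.
new_starts = kept_starts(total_tokens, seq_len, stride, min_len=1)

# Made-up per-token losses for the 3-token tail window.
tail_losses = [0.5, 0.7, 0.9]
wlen = len(tail_losses)

# Unclamped start is 3 - 4 = -1, so the slice counts from the end
# and silently keeps only the last token.
unclamped = tail_losses[score_start(wlen, stride, clamp=False):wlen]
# Clamped start is 0, so every token in the short window is scored.
clamped = tail_losses[score_start(wlen, stride):wlen]
```

Under these assumptions the old filter leaves tokens 8 to 10 unscored (stride-1 of them), and clamping is what keeps the short window's slice from wrapping around.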

The window_starts filter dropped windows shorter than stride,
silently skipping up to (stride-1) tokens at the end of the
validation set. Now includes all windows with >= 1 scoreable
token, and clamps the score start for short final windows.
@0hq 0hq merged commit 3a6fec7 into openai:main Mar 19, 2026
0hq (Collaborator) commented Mar 19, 2026

Great, I'll add to the leaderboard.

scottspace pushed a commit to scottspace/parameter-golf that referenced this pull request Mar 21, 2026
leonardcser pushed a commit to leonardcser/parameter-golf that referenced this pull request Mar 21, 2026
nedcut pushed a commit to nedcut/parameter-golf that referenced this pull request Mar 26, 2026