
MLX Timing Mismatch with Main Script#18

Merged
0hq merged 1 commit into openai:main from berniwal:main on Mar 18, 2026

Conversation

@berniwal
Contributor

What

  • In the main `train_gpt.py`, evaluation time does not count toward the 10-minute limit.
  • In `train_gpt_mlx.py`, it does. This PR aligns the timing logic of both scripts.
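
For readers who want to see the intended behavior in isolation, here is a minimal sketch of a training loop whose time budget excludes evaluation. It is not the code from either script: `TrainingTimer`, `train_with_budget`, and all parameter names are hypothetical, and the only assumption taken from the description above is that evaluation should not count toward the limit.

```python
import time
from typing import Callable, Optional


class TrainingTimer:
    """Accumulates wall-clock time only while the timer is running (hypothetical helper)."""

    def __init__(self, clock: Callable[[], float] = time.perf_counter) -> None:
        self._clock = clock
        self._elapsed = 0.0
        self._started_at: Optional[float] = None

    def start(self) -> None:
        # Begin (or resume) counting; a no-op if already running.
        if self._started_at is None:
            self._started_at = self._clock()

    def pause(self) -> None:
        # Stop counting and bank the time spent since the last start().
        if self._started_at is not None:
            self._elapsed += self._clock() - self._started_at
            self._started_at = None

    @property
    def elapsed(self) -> float:
        # Banked time plus the current running segment, if any.
        if self._started_at is None:
            return self._elapsed
        return self._elapsed + (self._clock() - self._started_at)


def train_with_budget(
    max_train_seconds: float,
    train_step: Callable[[], None],
    evaluate: Callable[[], None],
    eval_every: int,
    timer: TrainingTimer,
) -> int:
    """Run train_step until the training-only budget is spent; return the step count."""
    step = 0
    timer.start()
    while timer.elapsed < max_train_seconds:
        train_step()
        step += 1
        if step % eval_every == 0:
            timer.pause()   # evaluation time is excluded from the budget
            evaluate()
            timer.start()
    timer.pause()
    return step
```

With the default clock, a call such as `train_with_budget(600.0, train_step, evaluate, 100, TrainingTimer())` would stop after roughly 600 seconds of training time, regardless of how long the periodic evaluations take. The clock is injectable so the behavior can be checked deterministically without waiting.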

@0hq 0hq merged commit 5472f29 into openai:main Mar 18, 2026
@0hq
Collaborator

0hq commented Mar 18, 2026

Thanks!

kxddry pushed a commit to kxddry/parameter-golf that referenced this pull request Mar 19, 2026
MLX Timing Mismatch with Main Script
mrdavtan added a commit to mrdavtan/parameter-golf that referenced this pull request Mar 21, 2026
- Restored from qat-sliding-window branch (was never merged forward)
- Updated SWA: v2 result was +0.0004 (no effect), now superseded by EMA
- Updated Moonshot: added v2 flat-loops result (5.58), scale argument
- Added Finding openai#15: Int5 catastrophic (gap 15x worse than int6)
- Added Finding openai#16: optimizer bug (SmearGate + BigramHash frozen in all prior runs)
- Added Finding openai#17: 11L step-count trap (83ms/step = 40% fewer steps)
- Added Finding openai#18: FA2 positive for step time, no quality effect
- Added Findings openai#19-22: XSA, EMA, TTT, NTK-RoPE (implemented, results pending)
- Updated 'tested by others' section with our implementation status
- Added meta-lessons: optimizer coverage, layer cost, merge window strategy
gb250e referenced this pull request in gb250e/parameter-golf Mar 21, 2026
2 participants