
[non-submission] CPU fallback for the poors (non-CUDA non-MLX) #14

Closed

jordankzf wants to merge 1 commit into openai:main from jordankzf:jordankzf/cpu-fallback


Conversation

@jordankzf

Changes:

train_gpt.py

  • Device auto-detection: fall back to CPU instead of `raise RuntimeError("CUDA is required")`
  • Guards on torch.compile, fused Adam, flash SDP, DDP, torch.cuda.synchronize(), and nvidia-smi
  • autocast(device_type=device.type, ...) instead of a hardcoded "cuda"
  • (new) MAX_VAL_TOKENS env var to truncate the validation set for faster local iteration
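The pattern behind these changes can be sketched roughly like this (illustrative only, not the exact train_gpt.py code; names like `pick_device` are my own):

```python
# Sketch of device auto-detection with CUDA-only features guarded,
# instead of hard-failing when CUDA is absent.
import os
import torch

def pick_device() -> torch.device:
    """Prefer CUDA, then Apple MPS, then fall back to CPU."""
    if torch.cuda.is_available():
        return torch.device("cuda")
    if getattr(torch.backends, "mps", None) and torch.backends.mps.is_available():
        return torch.device("mps")
    return torch.device("cpu")

device = pick_device()

# Guard CUDA-only features rather than assuming them: fused Adam
# requires CUDA tensors, so only enable it there.
use_fused_adam = device.type == "cuda"
opt = torch.optim.AdamW(
    [torch.nn.Parameter(torch.zeros(1, device=device))],
    lr=1e-3,
    fused=use_fused_adam,
)

# autocast keyed off device.type instead of a hardcoded "cuda";
# bfloat16 is a reasonable autocast dtype on CPU.
with torch.autocast(device_type=device.type, dtype=torch.bfloat16):
    x = torch.randn(4, 4, device=device)
    y = x @ x

# Synchronize only where it exists.
if device.type == "cuda":
    torch.cuda.synchronize()

# Optional cap on validation tokens for faster local iteration.
max_val_tokens = int(os.environ.get("MAX_VAL_TOKENS", "0")) or None
```

On a CUDA-less machine this runs end to end on CPU; the same script picks up CUDA automatically when it is available.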

@jordankzf
Author

Training run on my AMD 7840U. Results from the smoke test:

  • 5 training steps: 27 seconds (~5.4s/step)
  • Validation (1M token subset): ~2.5 minutes
  • Final int8 quantized model: 4.96MB
  • Loss 6.94 → 6.89 (I know, I know)

Mainly a POC; go ahead and ask OpenAI for those $25 in Runpod credits.

Ueaj-Kerman added a commit to Ueaj-Kerman/parameter-golf that referenced this pull request Mar 19, 2026
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@0hq 0hq closed this Mar 19, 2026
