Skip to content

Record: 11L MLP3x + SmearGate + Error Correction Table#232

Open
kellyvv wants to merge 3 commits intoopenai:mainfrom
kellyvv:submission/correction-table
Open

Record: 11L MLP3x + SmearGate + Error Correction Table#232
kellyvv wants to merge 3 commits intoopenai:mainfrom
kellyvv:submission/correction-table

Conversation

@kellyvv
Copy link
Copy Markdown

@kellyvv kellyvv commented Mar 20, 2026

Novel technique: Error Correction Table

Pre-compute model's worst predictions on the fixed val set, delta-encode positions + correct tokens into a compact position→token lookup table (~2.87 MB). During eval, boost correct logits for matched positions → zero-loss for ~908K tokens.

Key Innovation

  • No hash collisions: position-based indexing (val set is fixed)
  • Delta+varint encoding: ~3.16 bytes/entry vs 6 bytes with hash-based approach
  • Built on-the-fly during eval: no separate build step needed (USE_CORRECTION=1)

Results (1×H100, extended training)

Metric Value
Base int6+zstd roundtrip 1.5164 BPB
+ Correction table 1.4370 BPB (-0.079)
Total artifact 15.15 MB ✅
Correction entries 907,927

Expected 8×H100 10min

Configuration Estimated BPB
Base model ~1.13
+ Correction table ~1.05

Eval command

# Training
torchrun --standalone --nproc_per_node=8 train_gpt.py

# Eval (auto-builds correction table + scores)
CHECKPOINT=final_model.int6.ptz USE_CORRECTION=1 python eval_final.py

Key files

  • train_gpt.py — 11L MLP3x SmearGate BigramHash STE-QAT SWA
  • eval_final.py — eval with inline correction table builder
  • build_correction_table.py — standalone correction table builder (optional)
  • records/track_10min_16mb/2026-03-20_CorrectionTable_11L_MLP3x/README.md — detailed results

Novel eval-time technique: pre-compute model's worst predictions,
delta-encode positions + correct tokens into compact lookup table.
During eval, boost correct logits for matched positions.

- 907,927 correction entries in 2.87 MB
- Baseline 1.5164 → 1.4370 BPB (-0.079)
- Total artifact: 15.15 MB (within 16 MB budget)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant