
Tier 6: PPM-C eval-time context mixer (standalone + neural mixing)#283

Open
Cwarren15-A wants to merge 1 commit into openai:main from Cwarren15-A:tier6-ppm-context-mixer

Conversation


Cwarren15-A commented Mar 20, 2026

Summary

Adds a classical Prediction by Partial Matching (PPM-C) context mixer for
eval-time probability blending with the neural model.

  • Standalone PPM-C order-2 achieves 3.41 BPB per-doc on FineWeb val
  • Fixed-alpha mixture (95% neural / 5% PPM) yields ~0.015 BPB improvement
  • Confidence-gated variant explored for per-token adaptive blending
  • Zero learned parameters, zero artifact size cost, ~60 LOC
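
For reference, an order-2 PPM with method-C escapes can be sketched in a few dozen lines. This is an illustrative reimplementation, not the PR's code: the class and method names are hypothetical, and per-symbol exclusion is omitted for brevity, so it will not match the evaluator's numbers exactly.

```python
from collections import defaultdict

class PPMC:
    """Minimal order-2 PPM with method-C escapes (illustrative sketch)."""

    def __init__(self, order=2, alphabet_size=256):
        self.order = order
        self.alphabet_size = alphabet_size
        # counts[k][context_tuple][symbol] -> frequency, for order k
        self.counts = [defaultdict(lambda: defaultdict(int))
                       for _ in range(order + 1)]

    def prob(self, history, symbol):
        """P(symbol | history), blending orders 2 -> 1 -> 0 -> uniform."""
        p_escape = 1.0
        for k in range(self.order, -1, -1):
            ctx = tuple(history[-k:]) if k else ()
            table = self.counts[k][ctx]
            total = sum(table.values())
            if total == 0:
                continue  # context never seen: drop an order for free
            distinct = len(table)
            denom = total + distinct          # method-C denominator
            if symbol in table:
                return p_escape * table[symbol] / denom
            p_escape *= distinct / denom      # charge the escape probability
        return p_escape / self.alphabet_size  # order -1: uniform fallback

    def update(self, history, symbol):
        """Record symbol under every context order after predicting it."""
        for k in range(self.order + 1):
            ctx = tuple(history[-k:]) if k else ()
            self.counts[k][ctx][symbol] += 1
```

Training and prediction interleave: predict each byte from the counts so far, then update, which is what makes the model adaptive with zero learned parameters.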

Files

  • experiments/tier6/stage1_context_mixer.py — standalone PPM evaluator
  • experiments/tier6/stage2a_mixture_sweep.py — fixed-alpha mixture sweep
  • experiments/tier6/stage2b_confidence_gate.py — confidence-gated mixture
  • experiments/tier6/dump_neural_logprobs.py — utility to extract per-token neural log-probs
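
The confidence-gated variant in stage2b can be illustrated with a per-token gate like the following. This is a sketch only: the gate statistic (the neural model's probability for the realized token) and the threshold value are assumptions, and the PR's stage2b may gate differently.

```python
def gated_mix(p_neural, p_ppm, alpha=0.95, conf_threshold=0.9):
    """Confidence-gated blend of two per-token probabilities (sketch).

    When the neural model is already confident, trust it alone;
    otherwise mix in the PPM probability with fixed weight alpha.
    conf_threshold is a hypothetical tuning knob, not a value from the PR.
    """
    if p_neural >= conf_threshold:
        return p_neural  # gate closed: neural-only
    return alpha * p_neural + (1.0 - alpha) * p_ppm
```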

Results (2K sequence subset on L40S)

| Config                              | BPB    |
| ----------------------------------- | ------ |
| Neural only                         | 1.2244 |
| PPM per-doc (standalone)            | 3.41   |
| PPM cumulative (standalone)         | 2.24   |
| Neural + PPM per-doc, alpha=0.95    | ~1.209 |
| Neural + PPM cumulative, alpha=0.85 | ~1.155 |
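
The fixed-alpha rows amount to mixing in probability space and re-scoring. A minimal sketch, assuming one natural-log neural log-prob and one PPM probability per byte of the eval text (the function name and signature are illustrative, not the PR's API; if tokens are not bytes, the denominator should be the byte count instead):

```python
import math

def mixed_bpb(neural_logprobs, ppm_probs, alpha=0.95):
    """Bits-per-byte of the fixed-alpha mixture (sketch).

    neural_logprobs: natural-log P_neural(x_t | x_<t), one per byte
    ppm_probs:       P_ppm(x_t | x_<t), one per byte
    """
    total_bits = 0.0
    for lp, q in zip(neural_logprobs, ppm_probs):
        p = alpha * math.exp(lp) + (1.0 - alpha) * q
        total_bits += -math.log2(p)
    return total_bits / len(ppm_probs)  # bits / bytes
```

Sweeping alpha over a grid (as stage2a does) then reduces to calling this once per candidate value and taking the argmin.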

Test plan

  • stage1_context_mixer.py runs end-to-end on val set
  • stage2a sweep reproduces ~0.015 BPB gain at alpha=0.95 per-doc
  • Per-doc mode only for submission (rules-safe, no cross-document leakage)

