Non-record: Prefix-Conditioned Suffix Diffusion — True Discrete Diffusion (diffusion_pll_bpb=1.8587)#905

Open
anthony-maio wants to merge 3 commits into openai:main from anthony-maio:submission/prefix-conditioned-suffix-diffusion

@anthony-maio

Non-Record Submission: True Discrete Diffusion Model

Answers OpenAI's explicit request for text diffusion submissions. This is a genuine discrete diffusion model, not an autoregressive (AR) model with a diffusion-inspired loss.

Approach

  • Clean prefix conditions a diffused suffix via absorbing-mask corruption
  • Timestep embeddings and prefix/suffix role embeddings inform the model
  • Denoising loss computed only on corrupted suffix tokens
  • Evaluation via approximate prefix-conditioned diffusion pseudo-log-likelihood (PLL)
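
The corruption and loss-masking steps listed above can be sketched in plain Python. This is a minimal illustration and not the submission's code. `MASK_ID`, `corrupt_suffix`, and `masked_suffix_loss` are hypothetical names. The per-token masking probability `t` and the unweighted mean over corrupted positions are assumptions (MDLM-style objectives typically reweight by the noise level, which is omitted here).

```python
import math
import random

MASK_ID = -1  # hypothetical id for the absorbing [MASK] state (assumption)


def corrupt_suffix(tokens, prefix_len, t, rng):
    """Mask each suffix token independently with probability t.

    The first prefix_len tokens are never touched, so the prefix stays clean.
    Returns the noisy sequence and a per-position flag marking corrupted tokens.
    """
    noisy = list(tokens)
    is_corrupted = [False] * len(tokens)
    for i in range(prefix_len, len(tokens)):
        if rng.random() < t:
            noisy[i] = MASK_ID
            is_corrupted[i] = True
    return noisy, is_corrupted


def masked_suffix_loss(token_log_probs, is_corrupted):
    """Mean negative log-likelihood over corrupted positions only.

    token_log_probs[i] is the model's log-probability (in nats) of the true
    token at position i. Clean positions contribute nothing to the loss.
    """
    picked = [-lp for lp, flag in zip(token_log_probs, is_corrupted) if flag]
    return sum(picked) / len(picked) if picked else 0.0


rng = random.Random(0)
tokens = [11, 12, 13, 14, 15, 16, 17, 18]
noisy, is_corrupted = corrupt_suffix(tokens, prefix_len=3, t=0.5, rng=rng)
```

In a real model, `token_log_probs` would come from the denoiser's output at each position, conditioned on the noisy sequence, the timestep embedding, and the prefix/suffix role embeddings.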

Results (8xH100 SXM, 600s)

  • diffusion_pll_bpb: 1.8587 (approximate, not directly comparable to AR BPB)
  • Steps: 2,398 at 250ms/step
  • Artifact: 13.7MB (under 16MB)
  • Training loss: 41.2 → 4.84 (converging but not converged)
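
As a rough check on these numbers, the sketch below shows two pieces of arithmetic. The PR does not state how `diffusion_pll_bpb` is computed, so `bits_per_byte` assumes the usual convention: total negative log-likelihood in nats over the scored tokens, converted to bits and divided by the number of UTF-8 bytes those tokens cover. Both function names are hypothetical.

```python
import math


def bits_per_byte(total_nll_nats, total_bytes):
    """Convert a summed negative log-likelihood (nats) to bits per byte."""
    return total_nll_nats / (math.log(2) * total_bytes)


def steps_in_budget(budget_s, step_ms):
    """Upper bound on optimizer steps that fit in a wall-clock budget."""
    return int(budget_s * 1000 // step_ms)


max_steps = steps_in_budget(600, 250)  # 600 s budget at 250 ms/step
```

At 250 ms/step the 600 s budget allows at most 2,400 steps, which is consistent with the 2,398 reported.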

Why This Matters

This is (to our knowledge) the first true discrete diffusion submission in the competition. The PLL evaluation is approximate and not directly comparable to AR BPB, but it demonstrates that discrete diffusion can be trained within the competition's constraints.

Key Limitations

  • Approximate PLL, not exact AR BPB — scores are not directly comparable
  • 250 ms/step is slow (no torch.compile, no FlashAttention-3)
  • Model would benefit significantly from longer training (unlimited compute track)

Credits

  • Discrete diffusion: MDLM (Sahoo et al.), SEDD (Lou et al.)
  • Baseline architecture: OpenAI Parameter Golf starter code

anthony-maio and others added 2 commits March 26, 2026 16:56
…ion)

True discrete diffusion with absorbing masks. Clean prefix conditions
the diffused suffix via timestep and role embeddings. Denoising loss
only on corrupted suffix tokens. Approximate prefix-conditioned
diffusion PLL evaluator. Answers OpenAI's "text diffusion" request.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings March 26, 2026 22:28

Copilot AI left a comment


Copilot reviewed 3 out of 5 changed files in this pull request and generated no comments.



2 participants