Skip to content

docs(gdn): document -1 padding index semantics for pool+indices path#3019

Merged
kahyunnam merged 1 commit into
flashinfer-ai:mainfrom
kaixih:doc/gdn-padding-index-semantics
Apr 17, 2026
Merged

docs(gdn): document -1 padding index semantics for pool+indices path#3019
kahyunnam merged 1 commit into
flashinfer-ai:mainfrom
kaixih:doc/gdn-padding-index-semantics

Conversation

@kaixih

@kaixih kaixih commented Apr 8, 2026

Copy link
Copy Markdown
Collaborator

Document the -1 padding index semantics for gated_delta_rule_decode_pretranspose's pool+indices path which is a subtle but important behavioral difference between the bf16 fast path (redirects to a sacrificial slot 0 null buffer, output undefined) and the float32 legacy path (skips entirely, output zeroed) that framework integrators must be aware of when handling inactive sequences.

cc. @kahyunnam @hlu1

Summary by CodeRabbit

  • Documentation
    • Clarified documentation regarding edge case handling and behavior across different computational backends, including how padding sequences are processed and what output to expect in specific scenarios.

Clarify per-backend behavior for negative indices in
gated_delta_rule_decode_pretranspose:
- bf16/MTP path: redirects -1 to slot 0 (null buffer), output undefined
- float32 legacy path: skips -1 entries entirely, output written as zero

Also note that the bf16 path requires slot 0 to be reserved as a
sacrificial null buffer (pool_size = num_real_slots + 1).

AI-assisted

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@coderabbitai

coderabbitai Bot commented Apr 9, 2026

Copy link
Copy Markdown
Contributor

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: e5136fd6-6872-4662-aa45-2695374a7287

📥 Commits

Reviewing files that changed from the base of the PR and between c2b4db2 and 1ec335f.

📒 Files selected for processing (1)
  • flashinfer/gdn_decode.py

📝 Walkthrough

Walkthrough

This PR updates documentation for gated_delta_rule_decode_pretranspose to clarify per-backend semantics for handling negative padding indices (-1). The BF16 fast path redirects -1 to slot 0 as a sacrificial null buffer, while the float32 legacy path skips -1 entries entirely with zero output. A Note section was revised to reflect that both paths support negative padding indices with documented semantics.

Changes

Cohort / File(s) Summary
Documentation Updates
flashinfer/gdn_decode.py
Expanded documentation for gated_delta_rule_decode_pretranspose clarifying per-backend handling of negative padding indices (-1): BF16 redirects to slot 0 (sacrificial buffer with undefined output), float32 skips entries (no state touched, zero output). Corrected prior claim that only float32 path supports negative indices.

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~5 minutes

Possibly related issues

  • [RFC] Unified GDN Decode/Prefill API #2687: Directly addresses the per-backend semantics and padding (-1) handling in gated_delta_rule_decode_pretranspose (BF16 vs FP32) for negative-padding indices that this PR documents.

Possibly related PRs

Suggested reviewers

  • bkryu
  • yzh119
  • yongwww
  • kahyunnam

Poem

🐰 A note was penned with care so fine,
Of negative slots in a queued design,
BF16 borrows, float32 skips with grace,
Both paths now dance in their rightful place! ✨

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 inconclusive)

Check name Status Explanation Resolution
Description check ❓ Inconclusive The description explains the purpose of the documentation changes and the behavioral differences between backends, but does not follow the repository's pull request template structure. Follow the template format with sections for Description, Related Issues, Pre-commit Checks, and Tests to maintain consistency with repository standards.
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately describes the main documentation change: adding documentation for -1 padding index semantics in the pool+indices path of gated_delta_rule_decode_pretranspose.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request updates the documentation for the gated_delta_rule_decode_pretranspose function to clarify how padding indices are handled across different backends. Specifically, it adds details on the sacrificial null buffer used in the bf16 fast path and the zeroing behavior in the float32 legacy path. Feedback was provided to refine the description of the legacy path to ensure technical accuracy regarding how the output slot is modified.

Comment thread flashinfer/gdn_decode.py
Comment on lines +181 to +182
neither the state pool nor the output are touched for that batch entry;
the output slot is written as **zero**.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The phrase "neither the state pool nor the output are touched" is slightly misleading because the output slot is indeed written to (it is explicitly zeroed by the kernel). It would be clearer to state that the state pool is not modified and the output is zeroed.

Suggested change
neither the state pool nor the output are touched for that batch entry;
the output slot is written as **zero**.
the state pool is not modified for that batch entry, and the output slot
is written as **zero**.

@hlu1 hlu1 left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@kahyunnam kahyunnam enabled auto-merge (squash) April 16, 2026 23:30
@vadiklyutiy

Copy link
Copy Markdown
Contributor

vLLM uses 0 as padded index

@kahyunnam kahyunnam merged commit 0e18a1c into flashinfer-ai:main Apr 17, 2026
56 of 64 checks passed
ziang-and pushed a commit to zianglih/flashinfer that referenced this pull request Apr 17, 2026
…lashinfer-ai#3019)

Document the `-1` padding index semantics for
`gated_delta_rule_decode_pretranspose`'s pool+indices path which is a
subtle but important behavioral difference between the bf16 fast path
(redirects to a sacrificial slot 0 null buffer, output undefined) and
the float32 legacy path (skips entirely, output zeroed) that framework
integrators must be aware of when handling inactive sequences.


cc. @kahyunnam @hlu1

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

* **Documentation**
* Clarified documentation regarding edge case handling and behavior
across different computational backends, including how padding sequences
are processed and what output to expect in specific scenarios.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants