feat: optional caller-supplied mm_hashes on GenerateReqInput by krishung5 · Pull Request #25300 · sgl-project/sglang

krishung5 · 2026-05-14T17:21:12Z

Motivation

External KV routers compute per-image hashes upstream and need sglang's
MultimodalDataItem.hash to align byte-for-byte so that:

pad_value = MM_PAD_SHIFT_VALUE + (item.hash % 2^30)

is deterministic from the caller's hash. With this, two requests
carrying the same image get the same image-token block in sglang's
RadixAttention cache, and the upstream router can land both on the
cache-warm worker.
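
A minimal sketch of the formula above, to show why the same caller hash always yields the same pad_value. The numeric value of MM_PAD_SHIFT_VALUE here is a placeholder assumption, and pad_value_from_hash is a hypothetical helper name, not a function from sglang.

```python
# Placeholder offset; the real constant is defined inside sglang and
# its value is assumed here purely for illustration.
MM_PAD_SHIFT_VALUE = 1_000_000


def pad_value_from_hash(item_hash: int) -> int:
    """Fold a (possibly 64-bit) hash into 30 bits and shift it by the offset."""
    return MM_PAD_SHIFT_VALUE + (item_hash % (1 << 30))


# Two requests carrying the same image hash map to the same pad_value.
same_image = int("a1b2c3d4e5f60718", 16)
assert pad_value_from_hash(same_image) == pad_value_from_hash(same_image)
```

Because only the low 30 bits of the hash survive the modulo, two hashes that differ only in their higher bits would collide on pad_value.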

Today sglang always recomputes hash = hash_feature(feature) inside
set_pad_value(), so the caller's hash and sglang's derived
pad_value are decoupled. Routing-side prefix-cache hits become a
coincidence rather than a contract.

What this PR does

Adds an optional mm_hashes: List[str] | None field on
GenerateReqInput (and matching kwarg on Engine.generate /
Engine.async_generate). When supplied:

- tokenizer_manager parses each hex string into a u64 (first 16 chars) and seeds the corresponding MultimodalDataItem.hash.
- set_pad_value() skips the internal hash_feature recompute when hash is already set.
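
The two steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from the PR: Item is a hypothetical stand-in for MultimodalDataItem, seed_hashes and parse_mm_hash are made-up helper names, hash_feature is passed in as a callable, and the MM_PAD_SHIFT_VALUE value is a placeholder.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

MM_PAD_SHIFT_VALUE = 1_000_000  # placeholder value, assumed for illustration


@dataclass
class Item:
    """Hypothetical stand-in for MultimodalDataItem."""
    feature: bytes
    hash: Optional[int] = None
    pad_value: Optional[int] = None


def parse_mm_hash(hex_str: str) -> int:
    """Interpret the first 16 hex characters as an unsigned 64-bit integer."""
    return int(hex_str[:16], 16)


def seed_hashes(items: List[Item], mm_hashes: List[str]) -> None:
    """Seed each item's hash from the caller-supplied hex strings."""
    for item, hex_str in zip(items, mm_hashes):
        item.hash = parse_mm_hash(hex_str)


def set_pad_value(item: Item, hash_feature: Callable[[bytes], int]) -> None:
    """Recompute the hash only when the caller did not seed one."""
    if item.hash is None:
        item.hash = hash_feature(item.feature)
    item.pad_value = MM_PAD_SHIFT_VALUE + (item.hash % (1 << 30))
```

With a seeded hash, set_pad_value never touches hash_feature, so the pad_value depends only on the string the caller sent.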

Backward compatibility

Default is None — no behavior change for any existing caller.
Length mismatch or per-item parse error falls back to the existing
hash_feature path so a malformed mm_hashes never blocks a request.
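
One way the fallback could look, as a sketch only. resolve_preset_hashes is a hypothetical helper, and two details are assumptions rather than facts from the PR: that a length mismatch discards the whole list, and that a parse error affects only the offending item. A None entry means "let hash_feature compute this one".

```python
import string
from typing import List, Optional

_HEX_DIGITS = set(string.hexdigits)


def resolve_preset_hashes(
    mm_hashes: Optional[List[str]], num_items: int
) -> List[Optional[int]]:
    """Return one preset hash per item, or None where the caller's value is unusable."""
    # Missing list or length mismatch: ignore the caller's hashes entirely.
    if mm_hashes is None or len(mm_hashes) != num_items:
        return [None] * num_items

    resolved: List[Optional[int]] = []
    for raw in mm_hashes:
        head = raw[:16] if isinstance(raw, str) else ""
        if head and all(ch in _HEX_DIGITS for ch in head):
            resolved.append(int(head, 16))
        else:
            # Malformed entry: fall back for this item instead of failing the request.
            resolved.append(None)
    return resolved
```

The function never raises on bad input, which matches the stated goal that a malformed mm_hashes must not block a request.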

Tests

test/registered/unit/managers/test_mm_hashes.py pins:

- GenerateReqInput.mm_hashes field shape (optional list of hex strings) and that it defaults to None.
- set_pad_value() honors a pre-set hash without calling hash_feature (patched to raise if invoked).
- pad_value is deterministic across items with identical preset hashes; and distinct preset hashes produce distinct pad_values.
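
The shape of these checks can be sketched without sglang installed. The set_pad_value below is a simplified stand-in written for this example, the MM_PAD_SHIFT_VALUE value is a placeholder, and run_checks is a hypothetical name; the actual test file patches the real hash_feature instead of passing a mock in.

```python
from types import SimpleNamespace
from unittest.mock import Mock

MM_PAD_SHIFT_VALUE = 1_000_000  # placeholder value, assumed for illustration


def set_pad_value(item, hash_feature):
    """Simplified stand-in: hash only when no preset hash exists."""
    if item.hash is None:
        item.hash = hash_feature(item.feature)
    item.pad_value = MM_PAD_SHIFT_VALUE + (item.hash % (1 << 30))


def run_checks():
    # A mock that raises if it is ever invoked.
    hash_feature = Mock(side_effect=AssertionError("hash_feature must not run"))

    a = SimpleNamespace(feature=b"img", hash=0xABCDEF, pad_value=None)
    b = SimpleNamespace(feature=b"img", hash=0xABCDEF, pad_value=None)
    c = SimpleNamespace(feature=b"img", hash=0x123456, pad_value=None)
    for item in (a, b, c):
        set_pad_value(item, hash_feature)

    hash_feature.assert_not_called()    # preset hash is honored
    assert a.pad_value == b.pad_value   # identical hashes, identical pad_value
    assert a.pad_value != c.pad_value   # distinct hashes, distinct pad_value
    return a.pad_value, c.pad_value
```

The distinctness check holds here because the two sample hashes differ in their low 30 bits; hashes that agree modulo 2^30 would share a pad_value.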

Why hex strings, not ints?

Wire formats for upstream routers tend to use JSON-friendly hex
strings (this matches the vLLM-compatible mm_hash encoding). Strings
also stay forward-compatible with hashes wider than u64 if sglang's
pad_value width grows.

CI States

Latest PR Test (Base): ❌ Run #26652245227
Latest PR Test (Extra): ❌ Run #26652244980

gemini-code-assist · 2026-05-14T17:21:16Z

Warning

You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again!


ishandhanani · 2026-05-19T03:48:49Z

/tag-and-rerun-ci

External KV routers (dynamo, custom orchestrators) sometimes compute their own per-image hash for routing decisions and need sglang's prefix-cache key to align. Today sglang always recomputes MultimodalDataItem.hash via hash_feature() inside set_pad_value, so the caller's hash and sglang's derived pad_value are decoupled.

This change adds an optional `mm_hashes: List[str] | None` field on GenerateReqInput (and matching kwargs on Engine.generate/async_generate). When supplied, each MultimodalDataItem.hash is initialised from the list and set_pad_value() skips the internal recompute, so pad_value is deterministic from the caller's hash.

Length mismatch or per-item parse error falls back to the existing hash_feature() path so a bad mm_hashes never blocks a request. Defaults to None; behavior is unchanged for current callers.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Re-sort imports so `Modality` precedes `MultimodalDataItem` per isort alphabetical convention. Fixes the CI lint failure that fast-failed the rest of the test stages on PR sgl-project#25300. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

ishandhanani · 2026-06-01T18:04:31Z

All relevant CI has passed

…ject#25300) Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Co-authored-by: ishandhanani <82981111+ishandhanani@users.noreply.github.com>

Adds the concrete curl + filter + patch -p2 recipe to apply sgl-project/sglang#25300 (the mm_hashes interop hook) to a stock upstream sglang install. The dynamo sglang container ships upstream sglang without this patch, so MM-aware routing silently degrades to text-prefix fallback unless the patch is applied. For pytest, mirror the same recipe in pytest_collection_modifyitems gated on sglang MM-routing test collection. Idempotent — the grep short-circuits when sglang already exposes mm_hashes. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Co-authored-by: ishandhanani <82981111+ishandhanani@users.noreply.github.com>

github-actions Bot added the sgl-kernel label May 14, 2026

krishung5 mentioned this pull request May 14, 2026

feat(sglang): MM-aware KV routing via pad_value substitution ai-dynamo/dynamo#9561

Merged

3 tasks

krishung5 force-pushed the feat/mm-hash-interop-v0.5.11 branch 2 times, most recently from 407285b to d2be361 Compare May 14, 2026 17:30

krishung5 marked this pull request as ready for review May 14, 2026 17:32

krishung5 requested review from CatherineSue, JustinTong0323, Ying1123, hnyls2002, ispobock, merrymercy, slin1237 and xiezhq-hermann as code owners May 14, 2026 17:32

krishung5 force-pushed the feat/mm-hash-interop-v0.5.11 branch 2 times, most recently from 3552cb7 to 0db5652 Compare May 15, 2026 01:42

github-actions Bot added the run-ci label May 19, 2026

krishung5 force-pushed the feat/mm-hash-interop-v0.5.11 branch from 0db5652 to 1c01579 Compare May 27, 2026 21:39

krishung5 and others added 3 commits May 27, 2026 14:46

test: fix CUDA suite registration for test_mm_hashes

3b4e96e

stage-b-test-1-gpu-large isn't a valid CUDA suite name; CUDA suites use the base-a/b/c prefix. Switch to the stage="base-b" / runner_config="1-gpu-small" pattern other unit tests in this directory use.

fix: black formatting for tokenizer_manager + test_mm_hashes

bd54658

Caught by sglang CI's psf/black 26.1.0 lint hook on PR sgl-project#25300. Pure whitespace; no behavior change.

mattteochen mentioned this pull request May 28, 2026

[GLM5] Improve NSA pcg perf #25642

Closed

5 tasks

ishandhanani and others added 2 commits May 29, 2026 11:50

Merge branch 'main' into feat/mm-hash-interop-v0.5.11

4b06669

Merge remote-tracking branch 'origin/main' into feat/mm-hash-interop-v0.5.11

6e368ae

ishandhanani merged commit 86afa21 into sgl-project:main Jun 1, 2026
132 of 151 checks passed

amd-bot mentioned this pull request Jun 2, 2026

[CI Monitor] Daily Report - 2026-06-02 bingxche/sglang-ci-bot#91

Open

ishandhanani merged 6 commits into sgl-project:main from krishung5:feat/mm-hash-interop-v0.5.11


2 participants
