[Score API] Implement EngineScoreMixin for scoring functionality and refactor Tok… by sundar24295s · Pull Request #21342 · sgl-project/sglang

sundar24295s · 2026-03-24T22:50:46Z

Refactor: Extract Scoring API into Dedicated Mixin Files + CODEOWNERS

Motivation

The Scoring API (/v1/score, Engine.score(), Engine.async_score()) is a self-contained feature spanning entrypoints, managers, and tests. This PR extracts scoring-specific logic into dedicated files and adds CODEOWNERS entries so that scoring contributors are automatically requested for review on scoring-related changes.

What Changed

1. New file: `python/sglang/srt/entrypoints/engine_score_mixin.py`

Extracted score() and async_score() from the Engine class into an EngineScoreMixin. The Engine class now inherits from this mixin — no behavior change, just cleaner separation.

2. Renamed: `tokenizer_manager_multiitem_mixin.py` → `tokenizer_manager_score_mixin.py`

Renamed the file and class (TokenizerManagerMultiItemMixin → TokenizerManagerScoreMixin) to better reflect its purpose. This file contains the core scoring logic: score_request(), score_prompts(), multi-item scoring helpers, and the ScoreResult dataclass.

3. Updated: `python/sglang/srt/entrypoints/engine.py`

Engine now inherits from EngineScoreMixin (MRO: Engine → EngineScoreMixin → EngineBase → ABC)
Removed the inline score() / async_score() methods (now provided by the mixin)
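
The inheritance arrangement above can be illustrated with a minimal, self-contained sketch. The classes below are stand-ins that only share names with the real ones. The method bodies and parameter names (borrowed from the `/v1/score` request fields) are assumptions for illustration, not the actual sglang implementation.

```python
from abc import ABC, abstractmethod


class EngineBase(ABC):
    """Stand-in for the abstract engine interface (hypothetical body)."""

    @abstractmethod
    def generate(self, prompt: str) -> str: ...


class EngineScoreMixin:
    """Stand-in mixin that holds only the scoring entry points."""

    def score(self, query, items, label_token_ids):
        # Placeholder result: one row per item, one column per label token.
        return [[0.0 for _ in label_token_ids] for _ in items]

    async def async_score(self, query, items, label_token_ids):
        # The async variant simply delegates in this sketch.
        return self.score(query, items, label_token_ids)


class Engine(EngineScoreMixin, EngineBase):
    """The mixin is listed first, so it precedes EngineBase in the MRO."""

    def generate(self, prompt: str) -> str:
        return prompt


# Collect the class names in method resolution order.
mro_names = [cls.__name__ for cls in Engine.__mro__]
print(" -> ".join(mro_names))
```

Because the mixin defines no state of its own, callers keep using `Engine.score()` and `Engine.async_score()` exactly as before; only the file that hosts the methods changes.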

4. Updated: `.github/CODEOWNERS`

Added per-file ownership for all scoring-specific files:

/python/sglang/srt/entrypoints/engine_score_mixin.py @sundar24295s @chanh @fortunecookiee
/python/sglang/srt/entrypoints/openai/serving_score.py @sundar24295s @chanh @fortunecookiee
/python/sglang/srt/managers/tokenizer_manager_score_mixin.py @sundar24295s @chanh @fortunecookiee
/test/registered/core/test_score_api.py @sundar24295s @chanh @fortunecookiee
/benchmark/prefill_only/bench_score.py @sundar24295s @chanh @fortunecookiee
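
To make the effect of these entries concrete, here is a rough sketch of how a changed path maps to requested reviewers. It handles only exact per-file entries like the five above; it does not implement GitHub's full CODEOWNERS matching (glob patterns, directory rules, last-match precedence). The helper names `parse_entries` and `owners_for` are hypothetical.

```python
CODEOWNERS_ENTRIES = """\
/python/sglang/srt/entrypoints/engine_score_mixin.py @sundar24295s @chanh @fortunecookiee
/python/sglang/srt/entrypoints/openai/serving_score.py @sundar24295s @chanh @fortunecookiee
/python/sglang/srt/managers/tokenizer_manager_score_mixin.py @sundar24295s @chanh @fortunecookiee
/test/registered/core/test_score_api.py @sundar24295s @chanh @fortunecookiee
/benchmark/prefill_only/bench_score.py @sundar24295s @chanh @fortunecookiee
"""


def parse_entries(text):
    """Map each exact file path (without the leading slash) to its owners."""
    owners = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        pattern, *handles = line.split()
        owners[pattern.lstrip("/")] = handles
    return owners


def owners_for(path, owners):
    """Return the owners of a repo-relative path, or an empty list."""
    return owners.get(path.lstrip("/"), [])


SCORING_OWNERS = parse_entries(CODEOWNERS_ENTRIES)
print(owners_for("test/registered/core/test_score_api.py", SCORING_OWNERS))
```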

5. Updated: `test/registered/openai_server/basic/test_serving_rerank.py`

Updated import path to reflect the renamed module.

Files NOT Touched (and why)

These files contain scoring-related infrastructure (GPU kernels, scheduler logprob processing, attention masking) that is shared with other features. They were intentionally left in place:

layers/logits_processor.py — compute_logprobs_for_multi_item_scoring() operates on GPU tensors using logits-processor-internal APIs
layers/attention/flashinfer_backend.py — MultiItemScoringParams and _process_multi_item_scoring() are flashinfer attention-level code
managers/scheduler_output_processor_mixin.py — multi-item scoring conditionals are interleaved with regular logprob processing
server_args.py — multi_item_scoring_delimiter is a server config flag

Validation

Server startup

python -m sglang.launch_server \
  --model-path /shared/public/sharing/job-rank/kbehdin/f389cde308efd4dbb8d9-2025-06-06-18-31-30/best_model/epoch=0-step=498-HF \
  --port 30000 --host 0.0.0.0 \
  --chunked-prefill-size -1 \
  --enable-torch-compile \
  --dtype float16 \
  --max-prefill-tokens 100000 \
  --mem-fraction-static 0.3 \
  --enable-tokenizer-batch-encode \
  --disable-radix-cache \
  --disable-cuda-graph \
  --multi-item-scoring-delimiter 128255

Server starts successfully with no import errors.

Single-item scoring

curl -X POST "http://localhost:30000/v1/score" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "What is the capital of California? Answer Yes or No for each of the following options:",
    "items": ["Scaramento"],
    "label_token_ids": [9454, 2753],
    "model": "..."
  }'

Response:

{
    "scores": [[6.398421076647679e-06, 3.3389641633503636e-06]],
    "model": "...",
    "usage": {"prompt_tokens": 23, "total_tokens": 23},
    "object": "scoring"
}
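
For reference, the same request can be assembled from Python with only the standard library. This is a sketch that mirrors the curl call above field for field; the helper names `build_score_request` and `send` are hypothetical, and the `"..."` model value is the placeholder from the original command.

```python
import json
import urllib.request


def build_score_request(base_url, query, items, label_token_ids, model,
                        apply_softmax=None):
    """Build (but do not send) a POST request for the /v1/score endpoint."""
    payload = {
        "query": query,
        "items": items,
        "label_token_ids": label_token_ids,
        "model": model,
    }
    if apply_softmax is not None:
        # Only include the flag when the caller sets it explicitly.
        payload["apply_softmax"] = apply_softmax
    return urllib.request.Request(
        url=f"{base_url}/v1/score",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request):
    """Send the request and decode the JSON response (needs a running server)."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


request = build_score_request(
    "http://localhost:30000",
    query="What is the capital of California? Answer Yes or No for each of the following options:",
    items=["Scaramento"],
    label_token_ids=[9454, 2753],
    model="...",
)
```

Calling `send(request)` against the server started above should return a JSON object with the same shape as the response shown here.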

Multi-item scoring (3 items, softmax)

curl -X POST "http://localhost:30000/v1/score" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "What is the capital of California? Answer Yes or No for each of the following options:",
    "items": ["Sacramento", "San Jose", "San Francisco"],
    "label_token_ids": [9454, 2753],
    "apply_softmax": true,
    "model": "..."
  }'

Response:

{
    "scores": [
        [0.4857216775417328, 0.5142783522605896],
        [0.6157812476158142, 0.3842187821865082],
        [0.5241511464118958, 0.475848913192749]
    ],
    "model": "...",
    "usage": {"prompt_tokens": 28, "total_tokens": 28},
    "object": "scoring"
}
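
With `apply_softmax` set, each row of `scores` is normalized across the label tokens, so it should sum to 1 up to floating-point error. The single-item response, which omits the flag, returns values that do not. The sketch below checks the rows from the response above and shows a generic softmax for comparison. How the server derives its values is not described in this PR, so the `softmax` helper is purely illustrative.

```python
import math


def softmax(values):
    """Numerically stable softmax over a short list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


# Rows copied from the multi-item response above (one row per item).
scores = [
    [0.4857216775417328, 0.5142783522605896],
    [0.6157812476158142, 0.3842187821865082],
    [0.5241511464118958, 0.475848913192749],
]

# Each row should sum to 1 within float32-level tolerance.
row_sums = [sum(row) for row in scores]
rows_normalized = all(math.isclose(s, 1.0, abs_tol=1e-6) for s in row_sums)
print(rows_normalized)

# Two equal inputs split the probability mass evenly.
print(softmax([0.0, 0.0]))
```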

Import verification

from sglang.srt.entrypoints.engine import Engine
assert hasattr(Engine, 'score')
assert hasattr(Engine, 'async_score')
# MRO: Engine → EngineScoreMixin → EngineBase → ABC → object

Test Plan

Server starts without import errors
/v1/score endpoint returns correct results for single-item scoring
/v1/score endpoint returns correct results for multi-item scoring with softmax
Engine.score() and Engine.async_score() are accessible via mixin inheritance
All Python imports resolve correctly (no circular imports)
No stale references to old file/class names remain in the codebase
Existing CI test test/registered/core/test_score_api.py passes
Existing CI test test/registered/openai_server/basic/test_serving_rerank.py passes
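
The stale-reference item in the list above can be checked mechanically. The following is a hedged sketch rather than the procedure actually used for this PR: it walks a directory tree and reports every line that still mentions the old module or class name. The function name `find_stale_references` and its defaults are assumptions.

```python
from pathlib import Path

# Old names that the rename in this PR is meant to eliminate.
STALE_NAMES = (
    "tokenizer_manager_multiitem_mixin",
    "TokenizerManagerMultiItemMixin",
)


def find_stale_references(root, names=STALE_NAMES, suffixes=(".py",)):
    """Return (path, line number, name) for each leftover mention under root."""
    hits = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        text = path.read_text(encoding="utf-8", errors="ignore")
        for lineno, line in enumerate(text.splitlines(), start=1):
            for name in names:
                if name in line:
                    hits.append((str(path), lineno, name))
    return hits
```

Running `find_stale_references("python/sglang")` from the repository root should return an empty list once the rename is complete.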

sundar24295s · 2026-03-27T22:48:10Z

/rerun-failed-ci

sundar24295s · 2026-04-01T06:01:19Z

/tag-and-rerun-ci here

hnyls2002 · 2026-04-03T22:15:28Z

All CI passed: https://github.com/sgl-project/sglang/actions/runs/23873600502/job/69759894670

Implement EngineScoreMixin for scoring functionality and refactor Tok…

d205bfd

…enizerManager to use TokenizerManagerScoreMixin. Add new engine_score_mixin and tokenizer_manager_score_mixin files, and update CODEOWNERS to reflect new ownership. Include unit tests for scoring methods.

sundar24295s requested review from CatherineSue, Fridge003, JustinTong0323, Kangyan-Zhou, Ying1123, bingxche, hnyls2002, ispobock, merrymercy, slin1237 and xiezhq-hermann as code owners March 24, 2026 22:50

sundar24295s added the run-ci label Mar 24, 2026

sundar24295s and others added 6 commits March 24, 2026 16:01

Merge branch 'main' into suramach/refactor

99e4669

Merge branch 'main' into suramach/refactor

453d716

Merge branch 'main' into suramach/refactor

f5a67c9

Merge branch 'main' into suramach/refactor

5f5043d

Merge branch 'main' into suramach/refactor

21b2eaf

Merge branch 'main' into suramach/refactor

d6bd356

sundar24295s added 4 commits March 27, 2026 15:48

Merge branch 'main' into suramach/refactor

665f195

Merge branch 'main' into suramach/refactor

6e40bf6

Merge branch 'main' into suramach/refactor

bf3c2e3

Merge branch 'main' into suramach/refactor

f9b07f6

sundar24295s and others added 3 commits April 1, 2026 21:28

Fix master merge

4a4ff43

Fix master merge

2f0d5e2

tiny fix lint

1526d9e

hnyls2002 merged commit 90e8680 into sgl-project:main Apr 3, 2026
28 of 44 checks passed

sundar24295s mentioned this pull request Apr 4, 2026

[Roadmap] SGLang Prefill-Only 2026 CY26H1 Roadmap #15344

Open

23 tasks

JustinTong0323 pushed a commit to JustinTong0323/sglang that referenced this pull request Apr 7, 2026

[Score API] Implement EngineScoreMixin for scoring functionality and …

44a95f7

…refactor Tok… (sgl-project#21342)

Fridge003 pushed a commit that referenced this pull request Apr 7, 2026

[Score API] Implement EngineScoreMixin for scoring functionality and …

61b02d2

…refactor Tok… (#21342)

xiezhq-hermann pushed a commit to antgroup/sglang that referenced this pull request Apr 7, 2026

[Score API] Implement EngineScoreMixin for scoring functionality and …

4fa3b08

…refactor Tok… (sgl-project#21342)

yhyang201 pushed a commit to yhyang201/sglang that referenced this pull request Apr 22, 2026

[Score API] Implement EngineScoreMixin for scoring functionality and …

975fe6d

…refactor Tok… (sgl-project#21342)

hnyls2002 merged 14 commits into sgl-project:main from sundar24295s:suramach/refactor
