Hicache sparse coordinator#16984

Open
xiezhq-hermann wants to merge 29 commits into main from hicache/sparse_coordinator

Collaborator

@xiezhq-hermann xiezhq-hermann commented Jan 13, 2026

Motivation

This PR is opened to collect review comments while the work is iterated on.

# Startup
# prefill
nohup python3 -m sglang.launch_server --model-path /sgl-workspace/mnt/Qwen-8B --tp-size 1 --host 0.0.0.0 --port 10000 --page-size 64 --mem-fraction-static 0.8 --attention-backend fa3 --base-gpu-id 0  --disaggregation-mode prefill > sglang.out &

# decode
nohup python3 -m sglang.launch_server --model-path /sgl-workspace/mnt/Qwen-8B --tp-size 1 --host 0.0.0.0 --port 10010 --page-size 64 --mem-fraction-static 0.8 --attention-backend fa3 --base-gpu-id 2 --enable-hierarchical-sparse-attention --disable-overlap-schedule --disable-cuda-graph  --hierarchical-sparse-attention-extra-config='{"algorithm": "quest", "backend": "fa3", "fixed_topk_page_cnt":16}' --disaggregation-mode decode > sglang1.out &

# minilb
cd /sgl-workspace/sglang && export PYTHONPATH="/sgl-workspace/sglang/sgl-router/py_src:$PYTHONPATH"
nohup python -m sglang_router.launch_router --pd-disaggregation --mini-lb --prefill http://0.0.0.0:10000 --decode http://0.0.0.0:10010 --host 0.0.0.0 --port 20000 > lb.out &

# GSM8K Accuracy
nohup python benchmark/gsm8k/bench_sglang.py --port 20000 --num-questions 10 --num-shots 12 --parallel 1 > test.out &
100%|██████████| 100/100 [16:25<00:00,  9.86s/it]
Accuracy: 0.933
Invalid: 0.000
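
For readers unfamiliar with the `"algorithm": "quest"` and `"fixed_topk_page_cnt": 16` settings in the decode command above, here is a minimal sketch of Quest-style page selection as it is generally described: each KV page keeps per-dimension min and max key values, a query scores each page by an upper bound on its dot product with any key in the page, and only the top-k pages are attended to. This is an illustration under that assumption, not the code in this PR; `page_bounds` and `select_topk_pages` are hypothetical names, and plain Python lists stand in for GPU tensors.

```python
import random


def page_bounds(keys, page_size):
    """Group key vectors into pages and return (mins, maxs) per page.

    keys is a list of equal-length vectors (lists of floats).
    The last page may be shorter than page_size.
    """
    bounds = []
    for start in range(0, len(keys), page_size):
        page = keys[start:start + page_size]
        mins = [min(col) for col in zip(*page)]  # per-dimension minimum
        maxs = [max(col) for col in zip(*page)]  # per-dimension maximum
        bounds.append((mins, maxs))
    return bounds


def select_topk_pages(query, bounds, topk):
    """Return the indices of the topk highest-scoring pages, in ascending order.

    The score is an upper bound on q . k over all keys k in the page:
    for each dimension, take the larger of q_i * min_i and q_i * max_i.
    """
    scores = []
    for mins, maxs in bounds:
        score = sum(max(q * lo, q * hi) for q, lo, hi in zip(query, mins, maxs))
        scores.append(score)
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:topk])


if __name__ == "__main__":
    # Hypothetical sizes chosen to mirror --page-size 64 and fixed_topk_page_cnt 16.
    random.seed(0)
    head_dim, page_size, num_pages, topk = 8, 64, 40, 16
    keys = [[random.gauss(0, 1) for _ in range(head_dim)]
            for _ in range(page_size * num_pages)]
    query = [random.gauss(0, 1) for _ in range(head_dim)]
    selected = select_topk_pages(query, page_bounds(keys, page_size), topk)
    print(len(selected), "pages selected:", selected)
```

Because the score is an upper bound rather than an exact attention weight, a page containing one strongly matching key is never ranked below its true best score, which is the property this family of methods relies on.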

Modifications

Accuracy Tests

Benchmarking and Profiling

Checklist

Review Process

  1. Ping Merge Oncalls to start the PR flow. See the PR Merge Process.
  2. Get approvals from CODEOWNERS and other reviewers.
  3. Trigger CI tests with comments or contact authorized users to do so.
    • /tag-run-ci-label, /rerun-failed-ci, /tag-and-rerun-ci
  4. After green CI and required approvals, ask Merge Oncalls to merge.

hzh0425 and others added 2 commits January 11, 2026 19:58
Co-authored-by: 晟海 <huangtingwei.htw@antgroup.com>
Comment thread python/sglang/srt/layers/attention/flashattention_backend.py Outdated
Comment thread python/sglang/srt/layers/attention/flashattention_backend.py Outdated
Comment thread python/sglang/srt/managers/cache_controller.py Outdated
Comment thread python/sglang/srt/managers/cache_controller.py Outdated
Comment thread python/sglang/srt/managers/scheduler_output_processor_mixin.py
)


def test_real_scenario():
Collaborator Author

Can you provide a trace for benchmarking and performance tuning? For example, a collection of top-k selections as input and a benchmark script that measures throughput.
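
One possible shape for such a benchmark, sketched under assumptions: the trace is a list of decode steps, each holding the sorted page indices chosen by top-k selection, and the script replays it through a handler while timing the run. Everything here is hypothetical (`make_synthetic_trace`, `replay`, `MissCounter`); a real script would record traces from the server and replace the handler with the actual page-loading path.

```python
import random
import time


def make_synthetic_trace(num_steps, num_pages, topk, seed=0):
    """Build a stand-in trace: one sorted list of topk page indices per step."""
    rng = random.Random(seed)
    return [sorted(rng.sample(range(num_pages), topk)) for _ in range(num_steps)]


class MissCounter:
    """Placeholder handler: counts pages not selected in the previous step.

    This approximates how many pages would need to be fetched if only the
    previous step's selection were resident on the device.
    """

    def __init__(self):
        self.resident = set()
        self.misses = 0

    def __call__(self, selection):
        current = set(selection)
        self.misses += len(current - self.resident)
        self.resident = current


def replay(trace, handle_selection):
    """Feed every step of the trace to the handler and report throughput."""
    pages = 0
    start = time.perf_counter()
    for selection in trace:
        handle_selection(selection)
        pages += len(selection)
    elapsed = time.perf_counter() - start
    return {
        "steps": len(trace),
        "pages": pages,
        "seconds": elapsed,
        "steps_per_s": len(trace) / elapsed if elapsed > 0 else float("inf"),
    }


if __name__ == "__main__":
    trace = make_synthetic_trace(num_steps=1000, num_pages=512, topk=16)
    counter = MissCounter()
    stats = replay(trace, counter)
    print(stats, "misses:", counter.misses)
```

A uniformly random trace is a pessimistic input, since real selections tend to overlap heavily between adjacent decode steps; that is one reason a recorded trace is more useful for tuning than a synthetic one.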

Comment thread python/sglang/srt/disaggregation/decode.py
…sparse_coordinator2

# Conflicts:
#	python/sglang/srt/mem_cache/common.py
#	python/sglang/srt/model_executor/model_runner.py
@hzh0425 hzh0425 force-pushed the hicache/sparse_coordinator branch from 98ce637 to 94617da Compare January 26, 2026 18:06
@hzh0425 hzh0425 force-pushed the hicache/sparse_coordinator branch from 260f900 to 4248a1f Compare January 27, 2026 16:11
@hzh0425 hzh0425 force-pushed the hicache/sparse_coordinator branch from 71e9725 to 7c5b7aa Compare January 29, 2026 06:23
@hzh0425 hzh0425 force-pushed the hicache/sparse_coordinator branch from d1caeb7 to bdbb2f5 Compare February 2, 2026 11:44
4 participants