extract the get_block_size method #11820

Merged
hebiao064 merged 2 commits into sgl-project:bhe/1_stage_triton_kernel from zminglei:triton-extract-method
Oct 19, 2025

@zminglei
Collaborator

Motivation

Extract the get_block_size method into a shared helper to reduce code duplication.
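
As a rough illustration of this kind of refactor (not the actual diff), the sketch below moves a block-size choice that two callers would otherwise each inline into a single helper. Only the name get_block_size comes from this PR; the signature, the head-dimension threshold, the returned sizes, and both caller functions are hypothetical.

```python
def get_block_size(head_dim: int) -> tuple[int, int]:
    """Hypothetical helper: choose (BLOCK_M, BLOCK_N) tile sizes in one place.

    The threshold and sizes are made up for illustration only.
    """
    if head_dim <= 128:
        return 64, 64
    return 32, 32


def launch_kernel_a(head_dim: int) -> dict:
    # Hypothetical caller 1: uses the shared helper instead of its own branch.
    block_m, block_n = get_block_size(head_dim)
    return {"kernel": "a", "BLOCK_M": block_m, "BLOCK_N": block_n}


def launch_kernel_b(head_dim: int) -> dict:
    # Hypothetical caller 2: same helper, so both callers always agree.
    block_m, block_n = get_block_size(head_dim)
    return {"kernel": "b", "BLOCK_M": block_m, "BLOCK_N": block_n}
```

With the choice made in one function, a later change to the tile sizes applies to every caller at once, which is the duplication this PR sets out to remove.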

Modifications

Accuracy Tests

python3 -m sglang.launch_server --model-path /shared/public/elr-models/Qwen/Qwen3-8B/2069b3fae1114555f3c020c81410e51fa0f656f2 --attention-backend triton --enable-deterministic-inference

python benchmark/gsm8k/bench_sglang.py --data-path /shared/public/data/gsm8k/test.jsonl
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 200/200 [00:20<00:00,  9.59it/s]
Accuracy: 0.950
Invalid: 0.000
Latency: 20.927 s
Output throughput: 1138.317 token/s

python3 -m sglang.test.test_deterministic --test-mode prefix --n-trials 50
Prompt 0 with prefix length 1: total samples: 291, Unique samples: 1
Prompt 1 with prefix length 511: total samples: 331, Unique samples: 1
Prompt 2 with prefix length 2048: total samples: 360, Unique samples: 1
Prompt 3 with prefix length 4097: total samples: 293, Unique samples: 1
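
"Unique samples: 1" on each line above indicates that every sample collected for that prompt was identical. A minimal sketch of that check follows; it is not the code of sglang.test.test_deterministic, and the function names and sample strings are hypothetical.

```python
def summarize_samples(outputs: list[str]) -> tuple[int, int]:
    """Return (total samples, unique samples) for one prompt's outputs."""
    return len(outputs), len(set(outputs))


def is_deterministic(outputs: list[str]) -> bool:
    """True when at least one sample exists and all samples are identical."""
    total, unique = summarize_samples(outputs)
    return total > 0 and unique == 1


# Hypothetical sample strings, for illustration only.
same = ["the answer is 4"] * 5
mixed = ["the answer is 4", "the answer is four"]
print(summarize_samples(same), is_deterministic(same))
print(summarize_samples(mixed), is_deterministic(mixed))
```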

Benchmarking and Profiling

Checklist

@zminglei zminglei marked this pull request as ready for review October 19, 2025 06:03
@hebiao064 hebiao064 merged commit 2589f84 into sgl-project:bhe/1_stage_triton_kernel Oct 19, 2025
2 checks passed
@hebiao064 hebiao064 self-assigned this Oct 19, 2025
@hebiao064 hebiao064 added the deterministic Issues on deterministic inference/kernels label Oct 19, 2025
