Introduce CUDA graph debug mode with breakable CUDA graph by cctry · Pull Request #19102 · sgl-project/sglang

cctry · 2026-02-21T03:18:39Z

Introduce Breakable CUDA Graph, a lightweight mechanism for inserting graph breaks into CUDA graph capture. Marked operations run eagerly between captured graph segments, while everything else stays graph-captured.
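To make the segment/break idea concrete, here is a minimal pure-Python sketch. It does not touch CUDA and is not the PR's implementation: `ToyBreakableGraph`, `make_op`, and the op names are all hypothetical, chosen only to show how an eager op could split one capture into two segments with the eager op in between.

```python
from typing import Callable, List, Union

Op = Callable[[List[str]], None]


class ToyBreakableGraph:
    """Pure-Python stand-in for a breakable graph. No CUDA is involved."""

    def __init__(self) -> None:
        # Each plan entry is either a list of ops (one captured segment)
        # or a single op that runs eagerly (a graph break).
        self.plan: List[Union[List[Op], Op]] = []
        self._open_segment: List[Op] = []

    def add(self, op: Op, eager: bool = False) -> None:
        if eager:
            # A break closes the current segment, then records the eager op.
            self._close_segment()
            self.plan.append(op)
        else:
            self._open_segment.append(op)

    def _close_segment(self) -> None:
        if self._open_segment:
            self.plan.append(self._open_segment)
            self._open_segment = []

    def end_capture(self) -> None:
        self._close_segment()

    def replay(self, log: List[str]) -> None:
        for entry in self.plan:
            if isinstance(entry, list):
                for op in entry:  # stands in for launching one graph segment
                    op(log)
            else:
                entry(log)  # eager op running between segments


def make_op(name: str) -> Op:
    # Hypothetical op: it only records its name when executed.
    return lambda log: log.append(name)


graph = ToyBreakableGraph()
graph.add(make_op("matmul"))
graph.add(make_op("norm"))
graph.add(make_op("dynamic_op"), eager=True)  # the graph break
graph.add(make_op("sample"))
graph.end_capture()

log: List[str] = []
graph.replay(log)
```

With one break, the plan holds three entries (segment, eager op, segment), and replay preserves the original op order. With no break, the plan would hold a single segment.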

CUDA graph debug mode

# Debug mode: all ops run eagerly through graph capture/replay path
python -m sglang.launch_server --model meta-llama/Llama-3-8B --debug-cuda-graph

# Selective graph breaks in model code
from sglang.srt.model_executor.breakable_cuda_graph.breakable_cuda_graph import non_graph

@non_graph(enable=True)
def my_dynamic_op(x):
    return some_incompatible_op(x)
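As a rough illustration of how a decorator with an `enable` switch can behave, the sketch below uses a hypothetical `toy_non_graph` factory. It is not the `non_graph` decorator from this PR: the registry, the `is_graph_break` attribute, and the choice to return the function unchanged when `enable=False` are all assumptions made for the example.

```python
import functools
from typing import Any, Callable, List

# Hypothetical registry of eagerly executed calls, for illustration only.
eager_calls: List[str] = []


def toy_non_graph(enable: bool = True) -> Callable[[Callable[..., Any]], Callable[..., Any]]:
    """Hypothetical decorator factory; not sglang's implementation."""

    def decorator(fn: Callable[..., Any]) -> Callable[..., Any]:
        if not enable:
            # Assumption: a disabled decorator leaves the function untouched.
            return fn

        @functools.wraps(fn)
        def wrapper(*args: Any, **kwargs: Any) -> Any:
            # A real implementation would end the current graph segment here,
            # run fn eagerly, then start a new segment.
            eager_calls.append(fn.__name__)
            return fn(*args, **kwargs)

        wrapper.is_graph_break = True  # hypothetical marker attribute
        return wrapper

    return decorator


@toy_non_graph(enable=True)
def double(x: int) -> int:
    return 2 * x


def triple(x: int) -> int:
    return 3 * x


triple_disabled = toy_non_graph(enable=False)(triple)
result = double(21)
```

The point of the `enable` flag in this sketch is that call sites stay identical whether or not the break is active, so a break can be toggled without editing model code.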

Breakable CUDA graph

When SGLANG_USE_BREAKABLE_CUDA_GRAPH is enabled, the decode graph is breakable. The overhead is minimal if no graph break is inserted.
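The PR description does not say which values turn the variable on. As a sketch only, a boolean environment flag is often read like this. The `env_flag` helper and its set of accepted truthy strings are assumptions, not sglang's actual parsing.

```python
import os
from typing import Mapping, Optional


def env_flag(name: str, default: bool = False,
             environ: Optional[Mapping[str, str]] = None) -> bool:
    """Hypothetical helper: interpret an environment variable as a boolean."""
    env = os.environ if environ is None else environ
    raw = env.get(name)
    if raw is None:
        return default
    # Assumption: these strings count as "enabled"; anything else is "disabled".
    return raw.strip().lower() in {"1", "true", "yes", "on"}


# Example with an explicit mapping, so the process environment is not modified.
enabled = env_flag("SGLANG_USE_BREAKABLE_CUDA_GRAPH",
                   environ={"SGLANG_USE_BREAKABLE_CUDA_GRAPH": "1"})
disabled = env_flag("SGLANG_USE_BREAKABLE_CUDA_GRAPH", environ={})
```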

cctry · 2026-02-21T03:19:11Z

/tag-and-rerun-ci

ch-wan

Left some minor comments. Can we also apply --debug-cuda-graph to the piecewise CUDA graph?

BBuf · 2026-03-20T10:07:24Z

Can #20910 solve the debug issue you encountered?

- Fix shared mutable ContextVar default ([] -> None) to prevent cross-context leaks
- Fix structured output writeback with _copy_output for dataclasses/dicts/tensors
- Fix replay no-break path using destroyed graph handle (last_graph -> last_graph_exec)
- Add thread-safe wait_stream hook with lock + refcount
- Add graph exec cleanup in __del__ to prevent GPU resource leaks
- Add HIP/ROCm guards in server_args and cuda_graph_runner
- Add clear error messages for missing cuda-python and incompatible modes
- Rename non_graph -> eager_on_graph, BreakableCUDAGraphContext -> BreakableCUDAGraphCapture
- Add __init__.py for breakable_cuda_graph package
- Add unit tests (11 tests covering capture/replay, breaks, _copy_output, break_graph)
- Add documentation (docs/advanced_features/breakable_cuda_graph.md)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
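The first fix in the list above addresses a general Python pitfall: a mutable `default` passed to `ContextVar` is one object shared by every context that has not set its own value. The sketch below reproduces that class of bug and the `None`-default pattern. The variable and function names are hypothetical and are not taken from the PR's code.

```python
import contextvars
from typing import List, Optional

# Buggy pattern: the default list is a single object shared by all contexts.
shared_var: contextvars.ContextVar[List[str]] = contextvars.ContextVar(
    "shared_var", default=[]
)

# Safer pattern: default to None and create a fresh list per context.
safe_var: contextvars.ContextVar[Optional[List[str]]] = contextvars.ContextVar(
    "safe_var", default=None
)


def record_buggy(item: str) -> List[str]:
    # Mutates the shared default, so the change is visible in other contexts.
    shared_var.get().append(item)
    return shared_var.get()


def record_safe(item: str) -> List[str]:
    current = safe_var.get()
    items = [] if current is None else list(current)
    items.append(item)
    safe_var.set(items)  # the new list is visible only in the current context
    return items


ctx_a = contextvars.copy_context()
ctx_b = contextvars.copy_context()

buggy_a = ctx_a.run(record_buggy, "a")
buggy_b = ctx_b.run(record_buggy, "b")  # sees "a" leaked from ctx_a

safe_a = ctx_a.run(record_safe, "a")
safe_b = ctx_b.run(record_safe, "b")  # independent of ctx_a
```

In the buggy variant both contexts end up holding the same list object. In the safe variant each context sets its own list, so nothing crosses between them.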

…t#19102) Co-authored-by: Cheng Wan <54331508+ch-wan@users.noreply.github.com> Co-authored-by: Cheng Wan <chwan@rice.edu> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

…t#19102) Co-authored-by: Cheng Wan <54331508+ch-wan@users.noreply.github.com> Co-authored-by: Cheng Wan <chwan@rice.edu>

init

5710356

cctry requested review from Fridge003, Ying1123, hnyls2002, ispobock and merrymercy as code owners February 21, 2026 03:18

github-actions Bot added the run-ci label Feb 21, 2026

cctry assigned ch-wan and cctry Feb 21, 2026

ch-wan reviewed Feb 21, 2026

Comment thread python/sglang/srt/model_executor/breakable_cuda_graph/breakable_cuda_graph.py Outdated

Comment thread python/sglang/srt/model_executor/breakable_cuda_graph/cuda_utils.py Outdated

Comment thread python/sglang/srt/model_executor/breakable_cuda_graph/breakable_cuda_graph.py Outdated

ch-wan and others added 3 commits February 21, 2026 04:25

cleanup

4857f5e

fix lint

c08107d

Merge branch 'main' into shiyang/breakable_cg

1d81e96

BBuf reviewed Mar 20, 2026

Comment thread python/sglang/srt/server_args.py

BBuf reviewed Mar 20, 2026

Comment thread python/sglang/srt/model_executor/breakable_cuda_graph/breakable_cuda_graph.py Outdated

BBuf reviewed Mar 20, 2026

Comment thread python/sglang/srt/model_executor/breakable_cuda_graph/breakable_cuda_graph.py Outdated

Oasis-Git mentioned this pull request Apr 7, 2026

[Experimental] Breakable Piecewise Cuda Graph #22218

Merged

5 tasks

ch-wan and others added 4 commits April 9, 2026 00:50

Merge branch 'main' into shiyang/breakable_cg

253475c

Merge branch 'main' into shiyang/breakable_cg

22ed7cd

Merge branch 'main' into shiyang/breakable_cg

6ccca77

github-actions Bot added the documentation Improvements or additions to documentation label Apr 10, 2026

ch-wan merged commit f855a0b into main Apr 11, 2026
159 of 176 checks passed

ch-wan deleted the shiyang/breakable_cg branch April 11, 2026 07:36

ch-wan merged 8 commits into main from shiyang/breakable_cg

3 participants
