Add SGLang CUDA crash API logging inspired by FlashInfer#20910
Merged
Conversation
Collaborator (Author)
/tag-and-rerun-ci
merrymercy requested changes on Mar 20, 2026
@@ -0,0 +1,657 @@
---
name: debug-cuda-crash
description: Tutorial for debugging CUDA crashes in SGLang using kernel API logging
Contributor
The description should tell readers when to invoke the skill, e.g.:
description: Call this skill when you need to debug CUDA crashes using kernel API logging.
):
    return norm_infer_native(x, weight, bias, eps, is_rms_norm, out)

@maybe_wrap_jit_kernel_debug(op_name="jit_kernel.diffusion.triton.rms_norm_fn")
Contributor
Why do we need to give it a name manually? It should be able to auto-infer the op name.
@maybe_wrap_jit_kernel_debug(
    op_name="jit_kernel.diffusion.triton.fuse_residual_layernorm_scale_shift_gate_select01_kernel"
Contributor
This is too tedious; the decorator should auto-infer the name.
    tl.store(out2_row + offsets2, out2, mask=mask2)

@debug_kernel_api(op_name="MiniMaxM2.rms_sumsq_serial")
Contributor
It should auto-infer; the decorator can look at the file name and function name.
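A decorator along the lines the reviewer suggests could derive the op name from the wrapped function's module path and qualified name, falling back to an explicit `op_name` only when given. This is a hypothetical sketch (the PR's actual `maybe_wrap_jit_kernel_debug` takes `op_name` explicitly, and its internals are not shown here):

```python
import functools
import logging
import os

logger = logging.getLogger("sglang.kernel_api")


def maybe_wrap_jit_kernel_debug(fn=None, *, op_name=None):
    """Sketch of an auto-naming debug decorator.

    If op_name is omitted, it is derived from the wrapped function's
    module and qualified name, as the review suggests. Logging only
    activates when SGLANG_KERNEL_API_LOGLEVEL is set, so the disabled
    path adds no per-call overhead.
    """

    def decorate(f):
        # Auto-infer "package.module.function" when no name is given.
        name = op_name or f"{f.__module__}.{f.__qualname__}"
        if int(os.environ.get("SGLANG_KERNEL_API_LOGLEVEL", "0")) == 0:
            return f  # logging disabled: return the function unchanged

        @functools.wraps(f)
        def wrapper(*args, **kwargs):
            logger.info("enter %s", name)
            try:
                return f(*args, **kwargs)
            finally:
                logger.info("exit %s", name)

        wrapper._op_name = name  # exposed for introspection in this sketch
        return wrapper

    # Support both bare @decorator and @decorator(op_name=...) usage.
    return decorate if fn is None else decorate(fn)
```

With this shape, the explicit `op_name=` strings in the diffs above would only be needed when the inferred module path is not the desired logging key.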
Collaborator (Author)
/tag-and-rerun-ci
merrymercy approved these changes on Mar 22, 2026
OrangeRedeng pushed a commit to OrangeRedeng/sglang that referenced this pull request on Mar 22, 2026
0-693 pushed a commit to 0-693/sglang that referenced this pull request on Mar 25, 2026
dutsc pushed a commit to dutsc/sglang that referenced this pull request on Mar 30, 2026
JustinTong0323 pushed a commit to JustinTong0323/sglang that referenced this pull request on Apr 7, 2026
yhyang201 pushed a commit to yhyang201/sglang that referenced this pull request on Apr 22, 2026
empty-quiver added a commit to empty-quiver/sglang-turboquant that referenced this pull request on Apr 28, 2026:
The patched __init__.py imports maybe_wrap_debug_kernel from sgl_kernel.debug_utils for the SGLANG_KERNEL_API_LOGLEVEL machinery. This file exists in upstream sgl-kernel feature branches (PR sgl-project#20910) but never landed on the kvcache-ai fork's main. The old PyPI-installed sglang-kt 0.6.1 image happened to bundle it; our source build does not. Drop a verbatim copy from PR sgl-project#20910 (BBuf/Xiaoyu Zhang) into the in-tree sgl-kernel source so the wheel we build packages it. Without this file the source-built sgl-kernel raises ModuleNotFoundError at import time.
Motivation
This PR adds SGLang-native API-level CUDA crash logging for LLM and diffusion kernel call boundaries.
The implementation is inspired by FlashInfer's API logging utility:
https://github.com/flashinfer-ai/flashinfer/blob/main/flashinfer/api_logging.py
This version keeps the scope focused on crash debugging and level-10 dump capture. Replay-related code was intentionally not included so the implementation stays smaller and aligned with the actual SGLang debugging workflow.
It also adds a skill for debugging CUDA crashes, similar to FlashInfer's.
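As a rough illustration of what API-level crash logging buys (hypothetical names, not the PR's actual implementation): each kernel API boundary records its op name and a cheap argument summary into a bounded ring buffer, and because CUDA errors surface asynchronously, dumping the tail of that buffer after a crash points at the last kernel launches before the failure.

```python
import collections

# Bounded ring buffer of recent kernel API boundaries. CUDA errors are
# reported asynchronously, so a crash often surfaces well after the
# offending launch; the tail of this buffer narrows down which kernel
# call actually triggered it.
_recent_calls = collections.deque(maxlen=64)


def record_call(op_name, **arg_summary):
    """Log one kernel API boundary (cheap: no tensor data is copied)."""
    _recent_calls.append((op_name, arg_summary))


def dump_recent_calls():
    """Return the recent boundaries, oldest first, for a crash report."""
    return list(_recent_calls)
```

The level-10 dump capture mentioned above would additionally persist the argument summaries to disk at the highest log level; the sketch keeps only the in-memory boundary trace.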
Modifications
Accuracy Tests
Benchmarking and Profiling
Checklist
Review Process
/tag-run-ci-label,/rerun-failed-ci,/tag-and-rerun-ci