Set csgmv as default lora backend. by lifuhuang · Pull Request #11488 · sgl-project/sglang

lifuhuang · 2025-10-12T04:00:43Z

Motivation

Modifications

Accuracy Tests

Benchmarking and Profiling

Checklist

Format your code according to the Format code with pre-commit.
Add unit tests according to the Run and add unit tests.
Update documentation according to Write documentations.
Provide accuracy and speed benchmark results according to Test the accuracy and Benchmark the speed.

Summary by CodeRabbit

New Features
- Added support for the “csgmv” LoRA backend and made it the default for server launches.
- Expanded selectable LoRA backends to include “triton” and “csgmv”.
Refactor
- Made the LoRA backend parameter optional across runners and sessions, allowing omission to use defaults.
Tests
- Updated LoRA test utilities and scenarios to work with optional backend selection and the new default.
- Simplified test invocations by removing hard-coded backend arguments.
Chores
- Aligned configuration defaults to the new backend behavior.

gemini-code-assist · 2025-10-12T04:01:02Z

Summary of Changes

Hello @lifuhuang, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request updates the system's default LoRA backend to 'csgmv', moving away from 'triton'. This change is reflected in the server's configuration and command-line arguments. Additionally, the test infrastructure has been adjusted to accommodate this new default, ensuring that tests either implicitly use 'csgmv' or allow for flexible backend selection.

Highlights

Default LoRA Backend Change: The default LoRA backend has been switched from 'triton' to 'csgmv' in both the server launch arguments and the core server configuration.
Test Suite Refactoring: Numerous LoRA-related test files have been updated to remove explicit 'triton' backend specifications, allowing them to utilize the newly set default 'csgmv' backend or to accept an optional backend parameter.
Backend Option Expansion: The list of available LoRA backends for testing purposes has been expanded to include 'csgmv' alongside 'triton'.

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature	Command	Description
Code Review	`/gemini review`	Performs a code review for the current pull request in its current state.
Pull Request Summary	`/gemini summary`	Provides a summary of the current pull request in its current state.
Comment	@gemini-code-assist	Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help	`/gemini help`	Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution. ↩

coderabbitai · 2025-10-12T04:01:04Z

Walkthrough

Default LoRA backend changed from "triton" to "csgmv" in server defaults and CLI. Runner and test utilities now accept an optional lora_backend (None). Tests stop passing explicit backends and adjust helpers; backend list expanded to include "csgmv". One test retains a print referencing a removed backend variable.

Changes

Cohort / File(s)	Summary
LoRA backend default update (server/CLI) `benchmark/lora/launch_server.py`, `python/sglang/srt/server_args.py`	Set default lora backend from "triton" to "csgmv" in CLI arg and ServerArgs dataclass; no control-flow changes.
Runner and test session signatures `python/sglang/test/runners.py`, `test/srt/lora/test_lora_update.py`	Make lora_backend Optional[str] with default None in SRTRunner and LoRAUpdateTestSessionBase constructors; types updated, behavior delegates to defaults.
Test utility updates `test/srt/lora/utils.py`	Add Optional import; expand BACKENDS to ["triton", "csgmv"]; make backend parameter Optional with default None in run_lora_test_one_by_one.
Tests: remove explicit backend arguments `test/srt/lora/test_lora.py`, `test/srt/lora/test_lora_cuda_graph.py`, `test/srt/lora/test_lora_eviction.py`, `test/srt/lora/test_lora_radix_cache.py`, `test/srt/lora/test_lora_tp.py`	Drop local backend="triton" variables and lora_backend/backend arguments in test invocations; rely on defaults; adjust log/strings accordingly.
Test: leftover reference `test/srt/lora/test_lora_qwen3.py`	Removed backend variable and argument; print statement still references backend (now undefined).

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Poem

A rabbit taps the keys with glee,
“From triton tides to csgmv!”
Tests hop light, no flags in tow,
Defaults guide the streams that flow.
One stray print—oops, where’d you be?
Carrot commits: consistency. 🥕✨

Pre-merge checks and finishing touches

❌ Failed checks (2 warnings)

Check name	Status	Explanation	Resolution
Description Check	⚠️ Warning	The pull request description only includes the template headings without any substantive content under Motivation, Modifications, Accuracy Tests, or Benchmarking and Profiling, and none of the checklist items have been completed.	Please provide detailed descriptions for the Motivation and Modifications sections, include any accuracy test results or benchmarks if applicable, and check off the completed items in the checklist.
Docstring Coverage	⚠️ Warning	Docstring coverage is 25.00% which is insufficient. The required threshold is 80.00%.	You can run `@coderabbitai generate docstrings` to improve docstring coverage.

✅ Passed checks (1 passed)

Check name	Status	Explanation
Title Check	✅ Passed	The title succinctly describes the primary change of updating the default LoRA backend to "csgmv" and directly reflects the modifications made in the pull request.

✨ Finishing touches

📝 Generate docstrings

🧪 Generate unit tests (beta)

Create PR with unit tests
Post copyable unit tests in a comment
Commit unit tests in branch lifu/default-lora-backend

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4ac8e09 and 353e5534e5e8dc6e5b12bb01f91c6d41d880490e.

📒 Files selected for processing (11)

benchmark/lora/launch_server.py (1 hunks)
python/sglang/srt/server_args.py (1 hunks)
python/sglang/test/runners.py (1 hunks)
test/srt/lora/test_lora.py (2 hunks)
test/srt/lora/test_lora_cuda_graph.py (0 hunks)
test/srt/lora/test_lora_eviction.py (0 hunks)
test/srt/lora/test_lora_qwen3.py (0 hunks)
test/srt/lora/test_lora_radix_cache.py (0 hunks)
test/srt/lora/test_lora_tp.py (0 hunks)
test/srt/lora/test_lora_update.py (1 hunks)
test/srt/lora/utils.py (3 hunks)

💤 Files with no reviewable changes (5)

test/srt/lora/test_lora_cuda_graph.py
test/srt/lora/test_lora_radix_cache.py
test/srt/lora/test_lora_tp.py
test/srt/lora/test_lora_eviction.py
test/srt/lora/test_lora_qwen3.py

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (21)

GitHub Check: unit-test-backend-1-gpu-amd (linux-mi325-gpu-1, 9)
GitHub Check: unit-test-backend-1-gpu-amd (linux-mi325-gpu-1, 8)
GitHub Check: unit-test-backend-1-gpu-amd (linux-mi325-gpu-1, 11)
GitHub Check: unit-test-backend-1-gpu-amd (linux-mi325-gpu-1, 4)
GitHub Check: unit-test-backend-1-gpu-amd (linux-mi325-gpu-1, 10)
GitHub Check: unit-test-backend-1-gpu-amd (linux-mi325-gpu-1, 7)
GitHub Check: unit-test-backend-1-gpu-amd (linux-mi325-gpu-1, 6)
GitHub Check: unit-test-backend-1-gpu-amd (linux-mi325-gpu-1, 5)
GitHub Check: unit-test-backend-1-gpu-amd (linux-mi325-gpu-1, 3)
GitHub Check: unit-test-backend-1-gpu-amd (linux-mi325-gpu-1, 2)
GitHub Check: unit-test-backend-1-gpu-amd (linux-mi325-gpu-1, 0)
GitHub Check: unit-test-backend-1-gpu-amd (linux-mi325-gpu-1, 1)
GitHub Check: bench-test-2-gpu-amd (linux-mi325-gpu-2)
GitHub Check: unit-test-backend-2-gpu-amd (linux-mi325-gpu-2)
GitHub Check: performance-test-1-gpu-part-2-amd (linux-mi325-gpu-1)
GitHub Check: performance-test-1-gpu-part-1-amd (linux-mi325-gpu-1)
GitHub Check: unit-test-sgl-kernel-amd (linux-mi325-gpu-1)
GitHub Check: accuracy-test-2-gpu-amd (linux-mi325-gpu-2)
GitHub Check: accuracy-test-1-gpu-amd (linux-mi325-gpu-1)
GitHub Check: mla-test-1-gpu-amd (linux-mi325-gpu-1)
GitHub Check: lint

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

gemini-code-assist

Code Review

This pull request updates the default LoRA backend from 'triton' to 'csgmv'. The changes are applied to server arguments, benchmarks, and test runners. Most tests are updated to remove the hardcoded 'triton' backend and rely on the new default.

I've found a critical issue in test/srt/lora/test_lora_qwen3.py where removing a variable without removing its usages will lead to a NameError, breaking the test.

Additionally, I've pointed out that with these changes, the 'triton' backend is no longer being tested. I've asked for clarification on whether this is intended and suggested either removing unused code or updating tests to cover both backends if needed.

Overall, the changes are aligned with the PR's goal, but the identified issues should be addressed.

gemini-code-assist · 2025-10-12T04:02:42Z

@@ -50,7 +50,7 @@ def __post_init__(self):


 TORCH_DTYPES = [torch.float16]
-BACKENDS = ["triton"]
+BACKENDS = ["triton", "csgmv"]


This BACKENDS constant seems to be unused across the test files. With the removal of the hardcoded backend variable in the tests, they will now only run with the new default LoRA backend (csgmv). This means testing for the 'triton' backend is effectively removed.

If this is intended, this constant could be removed to avoid confusion. If both backends should be tested, the tests should be updated to iterate over this BACKENDS list.

Fridge003

LGTM

zhyncs · 2025-10-16T20:19:07Z

@lifuhuang @Fridge003 this pr breaks lora unit tests on AMD
https://github.com/sgl-project/sglang/actions/runs/18546867193/job/52866524811

May you fix this in the follow up? Thanks!

This reverts commit b0d20cd.

Set csgmv as default lora backend.

83b5826

sglang-bot added the run-ci label Oct 12, 2025

gemini-code-assist Bot reviewed Oct 12, 2025

View reviewed changes

Fix.

269174b

lifuhuang force-pushed the lifu/default-lora-backend branch from 353e553 to 269174b Compare October 12, 2025 07:11

Minor

91fb881

lifuhuang force-pushed the lifu/default-lora-backend branch from b0c6c74 to 91fb881 Compare October 12, 2025 08:42

lifuhuang added 2 commits October 12, 2025 22:46

Fix

040a7ad

Test.

5bf20db

lifuhuang requested review from Fridge003 and Ying1123 as code owners October 12, 2025 23:51

Fridge003 self-assigned this Oct 13, 2025

Fix

f3f4986

lifuhuang mentioned this pull request Oct 14, 2025

[Fix] Fix accuracy bug in CSGMV kernel caching key. #11579

Merged

4 tasks

Merge branch 'main' into lifu/default-lora-backend

16cf7f0

lifuhuang changed the title ~~[WIP] Set csgmv as default lora backend.~~ Set csgmv as default lora backend. Oct 16, 2025

Merge branch 'main' into lifu/default-lora-backend

4c621f0

Fridge003 approved these changes Oct 16, 2025

View reviewed changes

Fridge003 merged commit b0d20cd into main Oct 16, 2025
122 of 134 checks passed

Fridge003 deleted the lifu/default-lora-backend branch October 16, 2025 04:53

zhyncs mentioned this pull request Oct 16, 2025

model: qwen3-omni (thinker-only) #10911

Merged

4 tasks

zhyncs added a commit that referenced this pull request Oct 16, 2025

Revert "Set csgmv as default lora backend. (#11488)"

12245b0

This reverts commit b0d20cd.

zhyncs mentioned this pull request Oct 16, 2025

Revert "Set csgmv as default lora backend. (#11488)" #11735

Merged

4 tasks

Fridge003 pushed a commit that referenced this pull request Oct 17, 2025

Revert "Set csgmv as default lora backend. (#11488)" (#11735)

da681f3

lifuhuang added a commit that referenced this pull request Oct 26, 2025

Redo "Set csgmv as default lora backend. (#11488)"

9cd914e

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Set csgmv as default lora backend.#11488

Set csgmv as default lora backend.#11488
Fridge003 merged 8 commits intomainfrom
lifu/default-lora-backend

lifuhuang commented Oct 12, 2025 •

edited by coderabbitai Bot

Loading

Uh oh!

gemini-code-assist Bot commented Oct 12, 2025

Uh oh!

coderabbitai Bot commented Oct 12, 2025 •

edited

Loading

Uh oh!

gemini-code-assist Bot left a comment

Uh oh!

Uh oh!

gemini-code-assist Bot Oct 12, 2025

Uh oh!

Fridge003 left a comment

Uh oh!

Uh oh!

zhyncs commented Oct 16, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Conversation

lifuhuang commented Oct 12, 2025 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Motivation

Modifications

Accuracy Tests

Benchmarking and Profiling

Checklist

Summary by CodeRabbit

Uh oh!

gemini-code-assist Bot commented Oct 12, 2025

Summary of Changes

Highlights

Footnotes

Uh oh!

coderabbitai Bot commented Oct 12, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Poem

Pre-merge checks and finishing touches

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

Uh oh!

gemini-code-assist Bot Oct 12, 2025

Choose a reason for hiding this comment

Uh oh!

Fridge003 left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

zhyncs commented Oct 16, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

lifuhuang commented Oct 12, 2025 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Oct 12, 2025 •

edited

Loading