
[Fix]: Refactor _build_req_from_sampling to use shallow_asdict#13782

Merged
mickqian merged 4 commits into sgl-project:main from cocoshe:fix_edits_req
Dec 19, 2025

Conversation

@cocoshe
Contributor

@cocoshe cocoshe commented Nov 23, 2025

Motivation

@dataclass
class QwenImageSamplingParams(SamplingParams):
    # Video parameters
    # height: int = 1024
    # width: int = 1024
    negative_prompt: str = " "
    num_frames: int = 1
    # Denoising stage
    guidance_scale: float = 4.0
    num_inference_steps: int = 50

When modifying params in the model config, for example setting num_inference_steps = 28, the edits endpoint ignores the change: the original _build_req_from_sampling does not forward it to the request.

[11-23 03:31:18] HF model config: {'attn_scales': [], 'base_dim': 96, 'dim_mult': [1, 2, 4, 4], 'dropout': 0.0, 'latents_mean': [-0.7571, -0.7089, -0.9113, 0.1075, -0.1745, 0.9653, -0.1517, 1.5508, 0.4134, -0.0715, 0.5517, -0.3632, -0.1922, -0.9497, 0.2503, -0.2921], 'latents_std': [2.8184, 1.4541, 2.3275, 2.6558, 1.2196, 1.7708, 2.6052, 2.0743, 3.2687, 2.1526, 2.8652, 1.5579, 1.6382, 1.1253, 2.8251, 1.916], 'num_res_blocks': 2, 'temperal_downsample': [False, True, True], 'z_dim': 16}
[11-23 03:31:18] Loaded module vae from /home/myw/.cache/huggingface/hub/models--Qwen--Qwen-Image-Edit/snapshots/ac7f9318f633fc4b5778c59367c8128225f1e3de/vae
Loading required modules: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████| 6/6 [00:58<00:00,  9.83s/it]
[11-23 03:31:18] Pipelines instantiated
[11-23 03:31:18] Worker 0: Initialized device, model, and distributed environment.
[11-23 03:31:18] Worker 0: Scheduler loop started.
[11-23 03:31:18] Rank 0 scheduler listening on tcp://*:5593
[11-23 03:31:18] Starting FastAPI server.
[11-23 03:31:18] Started server process [311208]
[11-23 03:31:18] Waiting for application startup.
[11-23 03:31:18] Scheduler client connected to backend scheduler at tcp://0.0.0.0:5593
[11-23 03:31:18] ZMQ Broker is listening for offline jobs on tcp://*:3001
[11-23 03:31:18] Application startup complete.
[11-23 03:31:18] Uvicorn running on http://0.0.0.0:3000 (Press CTRL+C to quit)
[11-23 03:31:30] Sampling params:
                      height: 1024
                       width: 1024
                  num_frames: 1
                      prompt: Change the people to a dog.
                  neg_prompt:  
                        seed: 1024
                 infer_steps: 28
      num_outputs_per_prompt: 1
              guidance_scale: 4.0
     embedded_guidance_scale: 6.0
                    n_tokens: 16384
                  flow_shift: None
                  image_path: outputs/uploads/4fe8eb16-648e-4b48-96e9-a72fe2e6214e_0001.jpg
                 save_output: True
            output_file_path: outputs/4fe8eb16-648e-4b48-96e9-a72fe2e6214e.jpg
        
[11-23 03:31:30] Creating pipeline stages...
[11-23 03:31:30] Using FlashAttention (FA3 for hopper, FA4 for blackwell) backend.
[11-23 03:31:30] Running pipeline stages: ['input_validation_stage', 'prompt_encoding_stage_primary', 'image_encoding_stage_primary', 'timestep_preparation_stage', 'latent_preparation_stage', 'conditioning_stage', 'denoising_stage', 'decoding_stage']
[11-23 03:31:30] [InputValidationStage] started...
[11-23 03:31:30] [InputValidationStage] finished in 0.0175 seconds
[11-23 03:31:30] [ImageEncodingStage] started...
[11-23 03:31:34] [ImageEncodingStage] finished in 3.7295 seconds
[11-23 03:31:34] [ImageVAEEncodingStage] started...
[11-23 03:31:35] [ImageVAEEncodingStage] finished in 0.6951 seconds
[11-23 03:31:35] [TimestepPreparationStage] started...
[11-23 03:31:35] [TimestepPreparationStage] finished in 0.0531 seconds
[11-23 03:31:35] [LatentPreparationStage] started...
[11-23 03:31:35] [LatentPreparationStage] finished in 0.0103 seconds
[11-23 03:31:35] [ConditioningStage] started...
[11-23 03:31:35] [ConditioningStage] finished in 0.0001 seconds
[11-23 03:31:35] [DenoisingStage] started...
num_inference_steps: 50
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 50/50 [01:40<00:00,  2.00s/it]
[11-23 03:33:15] [DenoisingStage] average time per step: 2.0017 seconds
[11-23 03:33:15] [DenoisingStage] finished in 100.1021 seconds
[11-23 03:33:15] [DecodingStage] started...
[11-23 03:33:17] [DecodingStage] finished in 1.7735 seconds
[11-23 03:33:17] Saved output to outputs/4fe8eb16-648e-4b48-96e9-a72fe2e6214e.jpg
[11-23 03:33:17] 127.0.0.1:35058 - "POST /v1/images/edits HTTP/1.1" 200

The printed sampling params show infer_steps: 28, but the denoising stage still runs with the default 50 steps.

Modifications

Use shallow_asdict to build Req with SamplingParams

Checklist

@cocoshe cocoshe requested a review from mickqian as a code owner November 23, 2025 04:12
@github-actions github-actions Bot added the diffusion SGLang Diffusion label Nov 23, 2025
@gemini-code-assist
Contributor

Summary of Changes

Hello @cocoshe, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses a critical bug in the image generation API where user-defined sampling parameters were not being correctly propagated to the underlying request object. The fix involves refactoring the request construction logic to dynamically pass all parameters from the SamplingParams object using shallow_asdict, ensuring that the system accurately reflects the desired configuration for image generation tasks.

Highlights

  • Bug Fix: Sampling Parameters Not Applied: Resolved an issue where custom sampling parameters, such as num_inference_steps, were not being correctly applied when using the image edits endpoint. The system was defaulting to a fixed value (e.g., 50 steps) even when a different value (e.g., 28 steps) was specified in the SamplingParams.
  • Refactor _build_req_from_sampling: The _build_req_from_sampling function has been refactored to use shallow_asdict. This change ensures that all parameters defined in the SamplingParams object are automatically passed to the Req constructor, preventing future issues with unhandled or ignored parameters.
  • Improved Parameter Handling: By leveraging shallow_asdict, the code for building requests from sampling parameters is now more robust and maintainable, as it no longer requires manual enumeration of each parameter, reducing the chance of errors when new parameters are added or existing ones are modified.

Contributor

@gemini-code-assist gemini-code-assist Bot left a comment


Code Review

This pull request refactors _build_req_from_sampling to use shallow_asdict, which correctly propagates all sampling parameters to the request object. This is a good improvement for maintainability and fixes the issue of some parameters not being passed. However, the current implementation could raise a TypeError if the SamplingParams object contains fields not present in the Req dataclass. I've suggested a more robust implementation that filters the parameters to prevent this issue.

Comment thread python/sglang/multimodal_gen/runtime/entrypoints/openai/image_api.py Outdated
@mickqian
Collaborator

You should modify SamplingParams._merge_with_user_params if you meant that. And other affected params should be considered

@cocoshe
Contributor Author

cocoshe commented Nov 27, 2025

You should modify SamplingParams._merge_with_user_params if you meant that. And other affected params should be considered

Compare to the /generations entrypoint,

batch = prepare_request(
    server_args=get_global_server_args(),
    sampling_params=sampling,
)

def prepare_request(
    server_args: ServerArgs,
    sampling_params: SamplingParams,
) -> Req:
    """
    Settle SamplingParams according to ServerArgs
    """
    # Create a copy of inference args to avoid modifying the original
    req = Req(
        **shallow_asdict(sampling_params),
        VSA_sparsity=server_args.VSA_sparsity,
    )
    req.adjust_size(server_args)
    if req.width <= 0 or req.height <= 0:
        raise ValueError(
            f"Height, width must be positive integers, got "
            f"height={req.height}, width={req.width}"
        )
    return req

I think it's better to build the Req by assigning the params with shallow_asdict, instead of setting them one by one and then going through _merge_with_user_params, which seems redundant. What do you think?

@mickqian
Collaborator

@cocoshe Regarding that, I still think a question should be addressed first:

  1. Are some fields of SamplingParams not supposed to be user-modifiable?

@mickqian
Collaborator

hi, could you retry?

@cocoshe
Contributor Author

cocoshe commented Dec 15, 2025

hi, could you retry?

Sure. It can easily be done by adding code like num_inference_steps=s.num_inference_steps, but if specific DiT models introduce more hyperparams, we would need to add this kind of code every time, one parameter at a time.
So why not use shallow_asdict to build the Req? Are there any risks?

@yhyang201
Collaborator

/tag-and-rerun-ci

@yhyang201
Collaborator

hi, could you retry?

Sure It can be easily done by adding code like: num_inference_steps=s.num_inference_steps, And I'm thinking if there are more hyperparams in some specific dit models, we need to add this kind of code everytime and one by one. So why not use the shallow_asdict to build the Req, are there any risks?

We need this. Could you please rebase? We can merge it once the CI passes.

Also, could you check if Req contains all the members of SamplingParams (for example, fps)?

@mickqian mickqian mentioned this pull request Dec 16, 2025
6 tasks
@cocoshe cocoshe requested a review from yhyang201 as a code owner December 18, 2025 12:28
@mickqian mickqian merged commit 0e869f0 into sgl-project:main Dec 19, 2025
80 of 83 checks passed
xiaobaicxy added a commit to xiaobaicxy/sglang that referenced this pull request Dec 19, 2025
* 'main' of https://github.com/sgl-project/sglang: (136 commits)
  fix: unreachable error check in retraction (sgl-project#15433)
  [sgl-kernel] chore: update deepgemm version (sgl-project#13402)
  [diffusion] multi-platform: support diffusion on amd and fix encoder loading on MI325 (sgl-project#13760)
  [amd] Add deterministic all-reduce kernel for AMD (ROCm) (sgl-project#15340)
  [diffusion] refactor: refactor _build_req_from_sampling to use shallow_asdict (sgl-project#13782)
  Add customized sampler registration (sgl-project#15423)
  Update readme (sgl-project#15425)
  Fix Mindspore model import warning (sgl-project#15287)
  [Feature] Xiaomi `MiMo-V2-Flash` day0 support (sgl-project#15207)
  [diffusion] profiling: add bench_serving.py and VBench (sgl-project#15410)
  [DLLM] Fix dLLM regression (sgl-project#15371)
  [Deepseek V3.2] Fix Deepseek MTP in V1 mode (sgl-project#15429)
  chore: update CI_PERMISSIONS (sgl-project#15431)
  [DLLM] Add CI for diffusion LLMs (sgl-project#14723)
  Support using different attention backend for draft decoding. (sgl-project#14843)
  feat(dsv32): better error handling for DeepSeek-v3.2 encoder (sgl-project#14353)
  tiny fix lint on main (sgl-project#15424)
  multimodal: precompute hash for MultimodalDataItem (sgl-project#14354)
  [AMD] Clear pre-built AITER kernels and warmup to prevent segfaults and test timeouts (sgl-project#15318)
  [Performance] optimize NSA backend metadata computation for multi-step speculative decoding (sgl-project#14781)
  ...
Prozac614 pushed a commit to Prozac614/sglang that referenced this pull request Dec 23, 2025
jiaming1130 pushed a commit to zhuyijie88/sglang that referenced this pull request Dec 25, 2025
YChange01 pushed a commit to YChange01/sglang that referenced this pull request Jan 13, 2026

Labels

diffusion SGLang Diffusion run-ci

3 participants