HotFix for #3988 using blockwise_int8 #4023

Merged: zhaochenyang20 merged 5 commits into sgl-project:main from xihuai18:hotfix_blockwise_int8_moe on Mar 4, 2025
@xihuai18 (Contributor) commented Mar 3, 2025

Motivation

After #3988 was merged, running int8 DeepSeek R1 or V3 raises:

  File "/path/to/sglang/python/sglang/srt/layers/moe/fused_moe_triton/layer.py", line 603, in forward
    final_hidden_states = self.quant_method.apply(
TypeError: apply() got an unexpected keyword argument 'inplace'

Modifications

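Add the inplace and no_combine parameters to apply() in blockwise_int8.py, so that it accepts the keyword arguments passed by the forward call in fused_moe_triton/layer.py.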
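A minimal sketch of the failure and of the shape of the fix, assuming the caller passes inplace and no_combine as keyword arguments. The class names, the helper call_like_moe_layer, the reduced argument list, and the default values are all hypothetical stand-ins for illustration; they are not the actual sglang code.

```python
# Hypothetical stand-ins: these are NOT the real sglang classes or signatures.

class OldInt8MoEMethod:
    """apply() as it might look before the fix: no inplace / no_combine."""

    def apply(self, layer, x, router_logits, top_k):
        return x


class PatchedInt8MoEMethod:
    """apply() after the fix: accepts the two extra keyword arguments.

    The default values here are an assumption made for this sketch.
    """

    def apply(self, layer, x, router_logits, top_k,
              inplace=True, no_combine=False):
        # A real implementation would forward both flags to the fused MoE
        # kernel; here they are returned so the call can be inspected.
        return x, inplace, no_combine


def call_like_moe_layer(quant_method, x):
    """Mimics a caller that always passes the new keyword arguments."""
    return quant_method.apply(
        layer=None,
        x=x,
        router_logits=None,
        top_k=2,
        inplace=True,
        no_combine=False,
    )


if __name__ == "__main__":
    try:
        call_like_moe_layer(OldInt8MoEMethod(), [1.0])
    except TypeError as exc:
        # Same class of error as in the traceback above.
        print("old signature fails:", exc)

    print("patched signature works:",
          call_like_moe_layer(PatchedInt8MoEMethod(), [1.0]))
```

Giving the new parameters default values keeps any caller that does not pass them working unchanged.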

@xihuai18 (Contributor Author) commented Mar 3, 2025

@HandH1998 @laixinn Could you help review this hotfix?

@xihuai18 changed the title from "hotfix for #3988 using blockwise_int8" to "HotFix for #3988 using blockwise_int8" on Mar 3, 2025

@xihuai18 (Contributor Author) commented Mar 3, 2025

@merrymercy Is this hotfix right?

@zhyncs requested a review from HandH1998 on March 3, 2025 17:14
@zhyncs mentioned this pull request on Mar 3, 2025
Comment thread python/sglang/srt/layers/quantization/blockwise_int8.py (outdated)
@zhaochenyang20 merged commit 12f2e6c into sgl-project:main on Mar 4, 2025
@xihuai18 deleted the hotfix_blockwise_int8_moe branch on March 11, 2025 03:30