
fix: resolve fp8 for mixtral #1290

Merged
zhyncs merged 2 commits into main from fix
Sep 1, 2024
Conversation

@zhyncs zhyncs (Collaborator) commented Sep 1, 2024

Motivation

python3 -m sglang.launch_server --model neuralmagic/Mixtral-8x22B-Instruct-v0.1-FP8 --quantization fp8 --kv-cache-dtype fp8_e5m2 --disable-radix --tp 8 --trust-remote-code
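The command above starts an HTTP inference server for the FP8 Mixtral checkpoint. As a minimal client sketch, the server could be queried like this (the port 30000 and the /generate endpoint with a text/sampling_params payload are assumptions about sglang's default server interface, not something stated in this PR):

```python
# Hypothetical client sketch for the server launched above.
# Assumes sglang's default port (30000) and its native /generate
# endpoint -- both are assumptions, not confirmed by this PR.
import json
import urllib.request


def build_request(prompt: str, max_new_tokens: int = 64) -> dict:
    # Build a request payload in the shape sglang's native API
    # is assumed to accept: prompt text plus sampling parameters.
    return {
        "text": prompt,
        "sampling_params": {
            "temperature": 0.7,
            "max_new_tokens": max_new_tokens,
        },
    }


def send(payload: dict, url: str = "http://localhost:30000/generate") -> dict:
    # POST the JSON payload to the running server and decode the reply.
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())


if __name__ == "__main__":
    payload = build_request("Explain FP8 quantization in one sentence.")
    print(json.dumps(payload, indent=2))
    # send(payload)  # uncomment once the server from the command above is running
```

The request-building step is kept separate from the network call so the payload can be inspected (or unit-tested) without a running server.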

Modifications

Checklist

  • Format your code according to the Contributor Guide.
  • Add unit tests as outlined in the Contributor Guide.
  • Update documentation as needed, including docstrings and example tutorials.

@zhyncs zhyncs merged commit 9b08052 into main Sep 1, 2024
@zhyncs zhyncs deleted the fix branch September 1, 2024 14:29
@zhyncs zhyncs (Collaborator, Author) commented Sep 1, 2024

ref

#1252
#1276

timethink pushed a commit to timethink/sglang that referenced this pull request Mar 9, 2025
