
Deepseek v2 support #693

Merged
hnyls2002 merged 6 commits into main from deepseek
Jul 27, 2024
Conversation

@hnyls2002
Collaborator

@hnyls2002 hnyls2002 commented Jul 21, 2024

To use DeepSeek V2, please specify --context-length or --max-num-reqs to avoid OOM. The context length for DeepSeek is quite large, and with the current static req_to_token layout we cannot support a large number of requests and a large context length at the same time.
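To make the trade-off concrete, here is a rough sizing sketch. It assumes the static req_to_token table holds one fixed-size integer slot per (request, token position) pair; the 4-byte entry size, the helper names, and the example numbers are illustrative assumptions, not values taken from this PR.

```python
def req_to_token_bytes(max_num_reqs: int, context_length: int,
                       bytes_per_entry: int = 4) -> int:
    """Size of a dense [max_num_reqs, context_length] index table in bytes.

    bytes_per_entry=4 assumes int32 indices (an assumption, not confirmed here).
    """
    return max_num_reqs * context_length * bytes_per_entry


def max_reqs_for_budget(budget_bytes: int, context_length: int,
                        bytes_per_entry: int = 4) -> int:
    """Largest request count whose table fits within budget_bytes."""
    return budget_bytes // (context_length * bytes_per_entry)


if __name__ == "__main__":
    ctx = 128 * 1024   # hypothetical context length
    reqs = 4096        # hypothetical request limit
    gib = req_to_token_bytes(reqs, ctx) / 1024**3
    print(f"{reqs} reqs x {ctx} tokens -> {gib:.1f} GiB")
    print("requests that fit in 256 MiB:", max_reqs_for_budget(256 * 1024**2, ctx))
```

Because the table grows with the product of the two limits, lowering either --context-length or --max-num-reqs shrinks it proportionally.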

@hnyls2002 hnyls2002 marked this pull request as draft July 21, 2024 22:44
@m0g1cian

Looking forward to seeing DeepSeek V2 supported! I was trying to do the same thing two weeks ago but ran into the exact same issue:

RuntimeError: shape mismatch: value tensor of shape [7, 16, 256] cannot be broadcast to indexing result of shape [7, 16, 40]
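For readers unfamiliar with this error, the check behind it can be sketched in plain Python. This is a simplified version of the usual broadcasting rule (compare trailing dimensions; each value dimension must match the target or be 1), and `can_broadcast_to` is a hypothetical helper written for illustration, not a PyTorch or SGLang function. It does not say which dimension was wrong in the model code, only why the assignment is rejected.

```python
def can_broadcast_to(value_shape, target_shape):
    """Return True if a tensor of value_shape can be broadcast to target_shape.

    Simplified rule: the value may not have more dimensions than the target,
    and, aligned from the right, each value dimension must equal the
    corresponding target dimension or be 1.
    """
    if len(value_shape) > len(target_shape):
        return False
    for v, t in zip(reversed(value_shape), reversed(target_shape)):
        if v != t and v != 1:
            return False
    return True


if __name__ == "__main__":
    # Shapes from the error message: the last dimension is 256 vs 40.
    print(can_broadcast_to((7, 16, 256), (7, 16, 40)))
    # A size-1 last dimension would broadcast.
    print(can_broadcast_to((7, 16, 1), (7, 16, 40)))
```

In the reported shapes the first two dimensions agree (7 and 16), so the mismatch is confined to the last dimension.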

@hnyls2002 hnyls2002 marked this pull request as ready for review July 26, 2024 23:31
@hnyls2002 hnyls2002 merged commit 679ebcb into main Jul 27, 2024
@hnyls2002 hnyls2002 deleted the deepseek branch July 27, 2024 00:10
@Xu-Chen
Contributor

Xu-Chen commented Jul 27, 2024

Thank you for your excellent work. Will you support MLA in the future? It would reduce the KV cache and make larger context lengths possible.

@Ying1123 Ying1123 mentioned this pull request Aug 2, 2024
29 tasks
timethink pushed a commit to timethink/sglang that referenced this pull request Mar 9, 2025
cen121212 pushed a commit to cen121212/sglang that referenced this pull request Nov 10, 2025
* Update test_utils.py

* Update test_utils.py

* Update test_utils.py

* Update run_suite.py

* Update run_suite.py

* Uncomment test_original_logprobs.py in test suite

* Update pr-test-npu-debug.yml

* Fix variable name from exit_code to ret_code

* Update run_suite.py

* Update run_suite.py

* Update daily-build-test-npu-innersource.yml
3 participants