
[Fix] Reduce memory usage for loading llava model & Remove EntryClassRemapping #1308

Merged
merrymercy merged 17 commits into main from fix-llava
Sep 3, 2024

Conversation

@merrymercy
Contributor

@merrymercy merrymercy commented Sep 3, 2024

  • Reduce CPU memory usage when loading the llava model
  • Rename python/sglang/srt/models/llama2.py -> python/sglang/srt/models/llama.py
  • Remove EntryClassRemapping. We can favor explicit inheritance instead.
  • Improve docs
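
A minimal sketch of what "explicit inheritance" could look like in place of a name-remapping table. Everything here is hypothetical and for illustration only: the class names, the `EntryClass` list, and `build_registry` are assumptions, not the code from this PR.

```python
# Hypothetical sketch: class names and the registry helper are illustrative
# assumptions, not the actual sglang source.


class BaseModelForCausalLM:
    """Stand-in for a model implementation that several architectures share."""

    def describe(self) -> str:
        return f"{type(self).__name__} (shared implementation)"


# Instead of a (name, class) remapping entry, the alias architecture is
# declared as an explicit subclass, so it has its own name and can be
# overridden later if it diverges from the base model.
class AliasModelForCausalLM(BaseModelForCausalLM):
    pass


# Assumed convention: a module lists the classes it exposes to the loader.
EntryClass = [BaseModelForCausalLM, AliasModelForCausalLM]


def build_registry(entry_classes):
    """Map each architecture name to its class."""
    return {cls.__name__: cls for cls in entry_classes}


registry = build_registry(EntryClass)
```

With this layout, a lookup by architecture name needs no special remapping case: `registry["AliasModelForCausalLM"]` resolves to a real class that inherits the base behavior.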

@merrymercy merrymercy changed the title from "Misc fixes: Reduce the CPU memory usage when loading llava model" to "[Fix] Reduce the CPU memory usage when loading llava model" Sep 3, 2024
@merrymercy merrymercy changed the title from "[Fix] Reduce the CPU memory usage when loading llava model" to "[Fix] Reduce memory usage for loading llava model & Remove EntryClassRemappling" Sep 3, 2024
@merrymercy merrymercy changed the title from "[Fix] Reduce memory usage for loading llava model & Remove EntryClassRemappling" to "[Fix] Reduce memory usage for loading llava model & Remove EntryClassRemapping" Sep 3, 2024
@merrymercy merrymercy merged commit f64eae3 into main Sep 3, 2024
@merrymercy merrymercy deleted the fix-llava branch September 3, 2024 04:44
timethink pushed a commit to timethink/sglang that referenced this pull request Mar 9, 2025