Fix llama + gemma accelerate tests #29380

Merged
SunMarc merged 1 commit into huggingface:main from SunMarc:fix-accelerate-test-llama-gemma
Mar 1, 2024

SunMarc commented Feb 29, 2024

What does this PR do?

This PR fixes the accelerate test `test_cpu_offload` for llama and gemma.
The problem was that the `causal_mask` buffer is large compared to the small testing model. This led to edge cases where the model wasn't split across devices: the `model_size` used to derive the memory budget was too big to start with, while `infer_auto_device_map` only allocates the parameters and not the buffers (which are supposed to be small).
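
To make the edge case concrete, here is a minimal toy sketch in plain Python. It is not accelerate's implementation: `toy_device_map`, `split_for`, the layer names, the byte sizes, and the 70% split ratio are all hypothetical and chosen only to illustrate the effect. The only assumption carried over from the description above is that the allocator counts parameters and ignores buffers.

```python
# Toy illustration only: this is NOT accelerate's infer_auto_device_map.
# Every name and size below is hypothetical.

def toy_device_map(layers, max_memory):
    """Greedily place layers on devices in order, counting parameter bytes only.

    layers: list of (name, param_bytes) pairs in execution order.
    max_memory: dict of device -> byte budget, in priority order.
    The last device acts as overflow and accepts whatever is left.
    """
    devices = list(max_memory)
    remaining = dict(max_memory)
    device_map = {}
    idx = 0
    for name, param_bytes in layers:
        # Advance to the next device while the current one cannot hold this layer.
        while idx < len(devices) - 1 and remaining[devices[idx]] < param_bytes:
            idx += 1
        device = devices[idx]
        remaining[device] -= param_bytes
        device_map[name] = device
    return device_map


# A tiny "model": 1000 bytes of parameters in total.
layers = [("embed", 400), ("layer0", 300), ("layer1", 300)]
param_bytes = sum(size for _, size in layers)
causal_mask_bytes = 4000  # hypothetical buffer, much larger than the parameters


def split_for(model_size):
    """Give the first device 70% of model_size (integer math) and map the layers."""
    budget = model_size * 7 // 10
    return toy_device_map(layers, {"gpu": budget, "cpu": model_size})


# model_size includes the buffer: the budget (3500) exceeds all parameters (1000),
# so every layer lands on the first device and nothing is offloaded.
map_with_buffer = split_for(param_bytes + causal_mask_bytes)

# model_size counts parameters only: the budget (700) forces a split.
map_params_only = split_for(param_bytes)

print(map_with_buffer)
print(map_params_only)
```

In this toy setup, a budget derived from a buffer-inflated `model_size` never forces anything off the first device, which is the kind of "model wasn't split" outcome described above.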

Related PR to fix these tests.

Tests passing:

pytest -m accelerate_tests tests/models/gemma/test_modeling_gemma.py::GemmaModelTest -s -vvvv
pytest -m accelerate_tests tests/models/llama/test_modeling_llama.py::LlamaModelTest -s -vvvv

ArthurZucker left a comment

Yessss! 🧼 Thank you

SunMarc merged commit cec7733 into huggingface:main on Mar 1, 2024