Fix llama + gemma accelerate tests by SunMarc · Pull Request #29380 · huggingface/transformers

SunMarc · 2024-02-29T21:20:36Z

What does this PR do?

This PR fixes the accelerate test (test_cpu_offload) for llama and gemma.
The problem was that the causal_mask buffer was large relative to the small testing model. This led to edge cases where the model wasn't split across devices: the model_size, inflated by the buffer, was too big to start with, while infer_auto_device_map only allocates the parameters and not the buffers (which are supposed to be small).
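To make the edge case concrete, here is a minimal pure-Python sketch. It is not the actual test code and not accelerate's placement algorithm: `toy_device_map`, the layer shapes, and the 2048 x 2048 mask size are all hypothetical, chosen only to show how a memory budget derived from a buffer-inflated size can leave nothing to offload.

```python
from math import prod


def tensor_bytes(shape, bytes_per_element):
    """Size in bytes of a dense tensor with the given shape."""
    return prod(shape) * bytes_per_element


def toy_device_map(layer_sizes, budget):
    """Toy greedy placement (an assumption, not accelerate's algorithm):
    fill device 0 up to `budget` bytes, send everything else to "cpu".
    Only parameter sizes are counted; buffers are ignored."""
    device_map, used = {}, 0
    for name, size in layer_sizes.items():
        if used + size <= budget:
            device_map[name] = 0
            used += size
        else:
            device_map[name] = "cpu"
    return device_map


# Hypothetical tiny test model: 4 layers, each one 32x32 float32 weight.
layer_sizes = {f"layer{i}": tensor_bytes((32, 32), 4) for i in range(4)}
param_bytes = sum(layer_sizes.values())

# Hypothetical boolean causal mask buffer, 1 byte per element.
mask_bytes = tensor_bytes((2048, 2048), 1)


def devices_used(total_size, fraction=0.5):
    """Devices that end up holding layers when the device-0 budget
    is `fraction` of `total_size`."""
    budget = int(fraction * total_size)
    return set(toy_device_map(layer_sizes, budget).values())


# Budget derived from parameters + buffer: every layer fits on device 0.
with_buffer = devices_used(param_bytes + mask_bytes)
# Budget derived from parameters only: the model is actually split.
without_buffer = devices_used(param_bytes)
```

In this sketch the mask dwarfs the parameters, so half of the combined size is still far more than all the layers need, and nothing is placed on the CPU. A test that expects both device 0 and the CPU to appear in the device map would fail in that situation.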

Related PR to fix these tests.

Tests passing:

pytest -m accelerate_tests tests/models/gemma/test_modeling_gemma.py::GemmaModelTest -s -vvvv
pytest -m accelerate_tests tests/models/llama/test_modeling_llama.py::LlamaModelTest -s -vvvv

HuggingFaceDocBuilderDev · 2024-02-29T21:41:12Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

ArthurZucker

Yessss! 🧼 Thank you

Fix llama + gemma accelete tests

bd1e87b

SunMarc requested review from ArthurZucker and ydshieh February 29, 2024 21:20

ArthurZucker approved these changes Mar 1, 2024

SunMarc merged commit cec7733 into huggingface:main Mar 1, 2024

SunMarc merged 1 commit into huggingface:main from SunMarc:fix-accelerate-test-llama-gemma