FIX: Fixes unexpected behaviour for Llava / Llama & AWQ Fused modules + revert #30070 at the same time #30317
Merged
LysandreJik merged 5 commits into main on Apr 18, 2024
Conversation
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
fxmarty reviewed Apr 18, 2024
ArthurZucker approved these changes Apr 18, 2024
Collaborator
ArthurZucker left a comment
Let's make sure to run at least the slow Llama tests before merging, so we can iterate fast if they fail.
Comment on lines +253 to +255
# For AWQ fused + Llama we need to set `config._attn_implementation` = "custom" to avoid unexpected behavior and pass
# `None` attention mask to the fused attention modules as now the attention mask is dropped by our models and dealt
# by the `AttentionMaskConverter` module.
Collaborator
Suggested change
- # For AWQ fused + Llama we need to set `config._attn_implementation` = "custom" to avoid unexpected behavior and pass
- # `None` attention mask to the fused attention modules as now the attention mask is dropped by our models and dealt
- # by the `AttentionMaskConverter` module.
+ # For AWQ fused + Llama we need to set `config._attn_implementation` = "custom" to avoid unexpected behaviors. We loop over the layers to make sure the vision/text config are
+ # modified only if some of their modules were fused
Collaborator
Nit, but more understandable.
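For illustration, a minimal sketch of the loop that comment describes, assuming a hypothetical `fused_attention_modules` list of fused module names and Llava-style `text_config` / `vision_config` sub-configs; this is not the actual `transformers.integrations.awq` implementation:

```python
def _set_custom_attn_implementation(model, fused_attention_modules):
    # Sketch only: flip exactly the sub-configs whose layers were fused to the
    # dummy "custom" implementation, leaving untouched towers on their original one.
    for module_name in fused_attention_modules:
        if module_name.startswith("language_model") and hasattr(model.config, "text_config"):
            model.config.text_config._attn_implementation = "custom"
        elif module_name.startswith("vision_tower") and hasattr(model.config, "vision_config"):
            model.config.vision_config._attn_implementation = "custom"
        else:
            model.config._attn_implementation = "custom"
```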
fxmarty reviewed Apr 18, 2024
Comment on lines +269 to +270
if sliding_window is None or key_value_length < sliding_window:
    ignore_causal_mask = not is_tracing
Contributor
Please apply this fix here: https://github.com/huggingface/transformers/pull/30311/files
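For context, the quoted lines sit inside the heuristic that decides whether the explicit causal mask can be dropped for SDPA. Below is a simplified sketch of that decision under assumed argument names; it is not the exact transformers code nor the fix from #30311:

```python
def can_ignore_causal_mask(attention_mask, key_value_length, sliding_window=None, is_tracing=False):
    # Sketch: when the user passed no mask and the sliding window covers the whole
    # sequence, SDPA's built-in is_causal path can handle causal masking, so the
    # explicit 4D mask may be skipped. Never skip while tracing, because the
    # decision depends on runtime shapes and would be frozen into the traced graph.
    ignore_causal_mask = False
    if attention_mask is None:
        if sliding_window is None or key_value_length < sliding_window:
            ignore_causal_mask = not is_tracing
    return ignore_causal_mask
```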
fxmarty approved these changes Apr 18, 2024
Contributor
Please apply the above fix, and LGTM.
If you can, run the following script, as well as the Llava AWQ test and the Llama slow test, to be sure:
```python
from transformers import WhisperForCausalLM, WhisperForConditionalGeneration, WhisperProcessor
import torch
from datasets import load_dataset

processor = WhisperProcessor.from_pretrained("openai/whisper-large-v2")
model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-large-v2")
assistant_model = WhisperForCausalLM.from_pretrained("distil-whisper/distil-large-v2")

ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
sample = ds[0]["audio"]
input_features = processor(
    sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt"
).input_features

predicted_ids = model.generate(input_features, assistant_model=assistant_model)

# decode token ids to text
transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)[0]
print(transcription)
```
Contributor (Author)
Thanks @fxmarty @ArthurZucker! Just tested; everything seems to pass!
What does this PR do?
Fixes a silent behaviour introduced by a recent PR: passing a `None` attention mask results in unexpected behaviour for AWQ fused modules. The fix is simply to force-set a dummy `_attn_implementation` on the config objects of the modules that contain fused modules. I can confirm the failing slow tests now pass with these changes.
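A quick way one might check the described behaviour end-to-end is sketched below; the checkpoint name and the expected printed value are illustrative assumptions, not taken from this PR:

```python
from transformers import AutoModelForCausalLM, AwqConfig

# Enable AWQ module fusing; `do_fuse` / `fuse_max_seq_len` are real AwqConfig options.
quant_config = AwqConfig(do_fuse=True, fuse_max_seq_len=512)

model = AutoModelForCausalLM.from_pretrained(
    "TheBloke/Mistral-7B-OpenOrca-AWQ",  # illustrative AWQ checkpoint
    quantization_config=quant_config,
    device_map="cuda:0",
)

# After fusing, the config that owns the fused layers should report the dummy
# attention implementation set by this fix.
print(model.config._attn_implementation)  # expected (assumption): "custom"
```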
cc @ArthurZucker @fxmarty