[ProcessingIdefics] Attention mask bug with padding #29449

amyeroberts merged 16 commits into huggingface:main

Conversation
Thanks for the PR! Reviewing asap
amyeroberts left a comment
Hi @bri25yu, thanks for working on this!
Making the padding logic consistent within the processor is a good idea, but I don't think we should change the default behaviour.
```diff
 self,
 prompts: Union[List[TextInput], List[List[TextInput]]],
-padding: Union[bool, str, PaddingStrategy] = False,
+padding: Union[bool, str, PaddingStrategy] = "longest",
```
I don't think we should change the default here for two reasons:
- It doesn't match the default behaviour for most processing classes
- It changes the default behaviour, which can be considered a breaking change
bri25yu
You may have tagged the wrong person 🙃
It doesn't match the default behaviour for most processing classes
Does the idefics model support non-padded inputs? From my understanding of the original issue, it seems they expect some padding even when the argument is not passed.
It changes the default behaviour, which can be considered a breaking change
I'm not sure if this is a bug, but the default behavior appears to have been inaccurate to begin with. Even if the user passes in padding=False, lines 347-354 seem to be forcibly padding the text input to the maximum sequence length anyway.
You may have tagged the wrong person 🙃
Woops, yes, sorry about that
Does the idefics model support non-padded inputs? From my understanding of the original issue, it seems they expect some padding even when the argument is not passed.
No, but no model supports non-padded inputs when batch_size > 1 and the input sequences are of different lengths; even so, processors and tokenizers do not pad inputs by default.
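For illustration, "longest" padding plus attention-mask construction can be sketched in plain Python. This is a minimal sketch, not the processor's actual implementation; the `pad_id` value and the left/right switch are illustrative (the Idefics default tokenizer left-pads):

```python
def pad_to_longest(sequences, pad_id=0, side="right"):
    """Pad lists of token ids to the longest length in the batch.

    Returns the padded ids and a matching attention mask
    (1 for real tokens, 0 for padding).
    """
    max_len = max(len(s) for s in sequences)
    ids, mask = [], []
    for s in sequences:
        pad = [pad_id] * (max_len - len(s))
        keep = [1] * len(s)
        if side == "right":
            ids.append(s + pad)
            mask.append(keep + [0] * len(pad))
        else:  # left padding, as the Idefics default tokenizer uses
            ids.append(pad + s)
            mask.append([0] * len(pad) + keep)
    return ids, mask
```

With padding=False a batch of different-length sequences stays ragged and cannot be turned into a single tensor, which is why a "longest" strategy (or explicit user padding) is needed for batch_size > 1.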
I'm not sure if this is a bug, but the default behavior appears to have been inaccurate to begin with. Even if the user passes in padding=False, lines 347-354 seem to be forcibly padding the text input to the maximum sequence length anyway.
In this case, the forcible padding when padding=False should be removed
In this case, the forcible padding when padding=False should be removed
Yes, this behavior was removed as part of this PR. After this PR, padding=False or not setting padding will not pad the input.
I have modified the PR to default to padding=False, and the unit tests (and one of the integration tests) to explicitly specify padding='longest'. I guess my only concern was that by changing this behavior, a user who was relying on the default behavior would find their code broken overnight.
I guess my only concern was that by changing this behavior, a user who was relying on the default behavior would find their code broken overnight.
@byi8220 Hmm, yes, this is tricky and that's a good point. OK, in this case, I think your original solution of setting padding='longest' as the default is best, ideally with a comment linking to this issue to explain why the default is different, and adding a description for the False option in the docstring.
amyeroberts left a comment
Thanks for working on this and adding tests!
What does this PR do?
Fixes #28591
This PR does the following:
- Changes `IdeficsProcessor.__call__()`'s `padding` param default to `'longest'` instead of `False`.
- Adds `IdeficsProcessorTest.test_tokenizer_padding`.
- Adds `IdeficsProcessorTest.test_tokenizer_left_padding`, which is a copy of `IdeficsProcessorTest` except using the default tokenizer (which left pads). This could have been parameterized, but it felt fine to just duplicate it.

Context
IIUC, it seems like this code currently always hand-performs an additional padding pass after the tokenizer performs its own padding, according to this code: https://github.com/huggingface/transformers/blob/main/src/transformers/models/idefics/processing_idefics.py#L348-L354
And since the output tensors are stacked together in this code, they all have to be of the same size, i.e. padded or otherwise normalized.
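The stacking constraint can be illustrated with a minimal pure-Python stand-in (the real code stacks torch tensors; `stack` here is a hypothetical helper used only for illustration):

```python
def stack(batch):
    """Minimal stand-in for torch.stack over token-id lists:
    refuses ragged batches, just as tensor stacking does."""
    if len({len(row) for row in batch}) > 1:
        raise ValueError("stack expects every sequence to have the same length")
    return batch

# Unpadded sequences of different lengths cannot be stacked...
try:
    stack([[101, 102], [101, 103, 104, 102]])
except ValueError:
    print("ragged batch rejected")

# ...so the processor must pad (or otherwise normalize) them first.
padded = stack([[0, 0, 101, 102], [101, 103, 104, 102]])
```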
Note: Since this changes the default padding, anyone who calls this function with an explicit value of `padding=False` will likely encounter an error.

Testing

Unit tests run with `pytest ./tests/models/idefics/` pass (129 passed, 115 skipped, 8 warnings in 19.22s). Did not run the slow tests, since running all tests with `RUN_SLOW=yes` enabled crashes my workstation.

Before submitting
- Did you read the contributor guideline, Pull Request section?
- Was this discussed/approved via a Github issue or the forum? Please add a link to it if that's the case.
- Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
Who can review?
Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.
@ArthurZucker
@amyeroberts