Fix some Flax models' `hidden_states` by ydshieh · Pull Request #16167 · huggingface/transformers

ydshieh · 2022-03-15T13:30:59Z

What does this PR do?

Fix some Flax models where the last element of `hidden_states` differs between the PyTorch and Flax versions.

More context

Some models apply a final layer norm:

last_hidden_state = self.layer_norm(last_hidden_state)

In the PyTorch version, the returned `hidden_states` has this `last_hidden_state` (after the layer norm) as its last element.
In the Flax version, the last element of the returned `hidden_states` is the one from before the layer norm.

This PR fixes the inconsistency by making the Flax models follow the PyTorch logic.
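To make the mismatch concrete, here is a minimal pure-Python sketch, not the actual model code. The names `layer_norm`, `toy_encoder`, and `patch_last` are hypothetical and exist only for illustration; the toy layer norm has no learned scale or bias.

```python
import math

def layer_norm(x, eps=1e-5):
    """Normalize a vector to zero mean and unit variance (no learned scale/bias)."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]

def toy_encoder(x, layers, patch_last=True):
    """Run toy layers, collect hidden states, then apply a final layer norm.

    patch_last=True mimics the PyTorch behaviour described above: the last
    collected hidden state is replaced by the layer-normed output.
    patch_last=False mimics the old Flax behaviour (pre-layer-norm value kept).
    """
    hidden_states = (x,)
    for layer in layers:
        x = layer(x)
        hidden_states += (x,)
    last_hidden_state = layer_norm(x)
    if patch_last:
        hidden_states = hidden_states[:-1] + (last_hidden_state,)
    return last_hidden_state, hidden_states

double = lambda x: [2.0 * v for v in x]

# With the patch, the last hidden state matches the model output.
last, states = toy_encoder([1.0, 2.0, 3.0], [double])
assert states[-1] == last

# Without it, the last hidden state is still the un-normalized value.
_, old_states = toy_encoder([1.0, 2.0, 3.0], [double], patch_last=False)
assert old_states[-1] == [2.0, 4.0, 6.0]
```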

HuggingFaceDocBuilderDev · 2022-03-15T13:42:17Z

The documentation is not available anymore as the PR was closed or merged.

patil-suraj

Great catch! Thanks a lot for fixing this.

Just left a nit :)

patil-suraj · 2022-03-15T14:23:35Z

src/transformers/models/blenderbot/modeling_flax_blenderbot.py

+            if not output_hidden_states:
+                return (last_hidden_states,) + outputs[1:]
+            else:
+                return (last_hidden_states, hidden_states) + outputs[2:]


(nit) Not a big fan of nested ifs, maybe simplify this a bit

patil-suraj · 2022-03-15T14:23:54Z

src/transformers/models/blenderbot/modeling_flax_blenderbot.py

+            if not output_hidden_states:
+                return (last_hidden_states,) + outputs[1:]
+            else:
+                return (last_hidden_states, hidden_states) + outputs[2:]


same comment as above

Hi @patil-suraj, is this better?

transformers/src/transformers/models/blenderbot/modeling_flax_blenderbot.py

Lines 720 to 722 in 415eb3c

    if not return_dict:
        outputs = (last_hidden_states, hidden_states) + (outputs[2:] if output_hidden_states else outputs[1:])
        return tuple(v for v in outputs if v is not None)

(When a component returns a tuple and needs extra processing, as in this PR, I always have trouble making it cleaner. Things get much easier if the internal components return dict or named tuple, and only change the format at the top level components for the users, but I don't think we are going to do so, at least not soon.)

If this looks good, I will change other places.
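As a rough check that the one-line form behaves like the nested `if`/`else` it replaces, the sketch below puts both side by side. The function names `nested_version` and `flat_version` are hypothetical, the strings stand in for arrays, and the two only agree when the remaining entries of `outputs` are not `None` (the flat version additionally filters `None` values).

```python
def nested_version(outputs, last_hidden_states, hidden_states, output_hidden_states):
    # Shape of the originally proposed change (nested if/else).
    if not output_hidden_states:
        return (last_hidden_states,) + outputs[1:]
    else:
        return (last_hidden_states, hidden_states) + outputs[2:]

def flat_version(outputs, last_hidden_states, hidden_states, output_hidden_states):
    # Shape of the simplified change: build one tuple, then drop None entries.
    # When output_hidden_states is False, hidden_states is None and is filtered out.
    outputs = (last_hidden_states, hidden_states) + (
        outputs[2:] if output_hidden_states else outputs[1:]
    )
    return tuple(v for v in outputs if v is not None)

# Case 1: hidden states requested, so outputs[1] holds them.
with_hs = ("raw", ("h0", "raw"), "attn")
patched = with_hs[1][:-1] + ("normed",)
assert flat_version(with_hs, "normed", patched, True) == nested_version(
    with_hs, "normed", patched, True
)

# Case 2: hidden states not requested, so outputs[1] is already the next field.
without_hs = ("raw", "attn")
assert flat_version(without_hs, "normed", None, False) == ("normed", "attn")
```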

This looks good!

Things get much easier if the internal components return dict or named tuple, and only change the format at the top level components for the users

This is a good idea. I think we can change the internal modules to return only a dict or only a tuple.

Thanks for the feedback :-) @patil-suraj. I will apply the same change to the other places to finish this PR.
About changing internal components (in general), let's have a discussion later with other members.
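For reference, the "named tuple internally, convert at the top level" idea discussed above could look roughly like the following. This is only a sketch under assumptions: `EncoderOutput`, `patch_last_hidden_state`, and `to_user_format` are hypothetical names, not part of transformers, and strings stand in for arrays.

```python
from typing import NamedTuple, Optional, Tuple

class EncoderOutput(NamedTuple):
    """Hypothetical internal output type with named fields."""
    last_hidden_state: str
    hidden_states: Optional[Tuple[str, ...]] = None
    attentions: Optional[Tuple[str, ...]] = None

def patch_last_hidden_state(out, normed):
    """Replace fields by name instead of by tuple position."""
    hidden_states = out.hidden_states
    if hidden_states is not None:
        hidden_states = hidden_states[:-1] + (normed,)
    return out._replace(last_hidden_state=normed, hidden_states=hidden_states)

def to_user_format(out, return_dict):
    """Only the top-level component decides between dict and plain tuple."""
    if return_dict:
        return out._asdict()
    return tuple(v for v in out if v is not None)

inner = EncoderOutput("raw", hidden_states=("h0", "raw"))
patched = patch_last_hidden_state(inner, "normed")
assert to_user_format(patched, return_dict=False) == ("normed", ("h0", "normed"))
```

Because fields are addressed by name, the patching step no longer needs to know whether `hidden_states` sits at index 1 or was omitted.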

patil-suraj · 2022-03-15T14:24:26Z

src/transformers/models/mbart/modeling_flax_mbart.py

+            if not output_hidden_states:
+                return (last_hidden_states,) + outputs[1:]
+            else:
+                return (last_hidden_states, hidden_states) + outputs[2:]


same comment as above

patrickvonplaten

Thanks!

ydshieh added 6 commits March 15, 2022 11:49

fix the last element in hidden_states

2f16c72

fix missing part

8402b28

same fix for Flax MBart

73dc49c

same fix for FlaxPegasus

d4c72c9

same fix for FlaxWav2Vec2

13608c7

take tuple into account

b20dc9f

ydshieh changed the title ~~Fix flax blenderbot hidden outputs~~ Fix some Flax models' hidden_states Mar 15, 2022

Fix missing elements in outputs for FlaxWav2Vec2EncoderLayerStableLayerNormCollection

8916fc2

ydshieh requested review from patil-suraj and patrickvonplaten March 15, 2022 13:57

patil-suraj approved these changes Mar 15, 2022

ydshieh mentioned this pull request Mar 15, 2022

Make Flax pt-flax equivalence test more aggressive #15841

Merged

ydshieh added 2 commits March 15, 2022 15:59

try to simplify

415eb3c

continue to simplify

5679610

patrickvonplaten approved these changes Mar 15, 2022

ydshieh merged commit ea05d67 into huggingface:master Mar 15, 2022

ydshieh deleted the fix_flax_blenderbot_hidden_outputs branch March 15, 2022 18:06

ydshieh mentioned this pull request Mar 21, 2022

fix last element in hidden_states for XGLM #16301

Merged

ydshieh merged 9 commits into huggingface:master from ydshieh:fix_flax_blenderbot_hidden_outputs

4 participants
