
Multi predictions trainer#7126

Merged
sgugger merged 6 commits into master from multi_predictions_trainer
Sep 15, 2020

Conversation

@sgugger
Collaborator

@sgugger sgugger commented Sep 14, 2020

This allows the Trainer to properly return predictions when the model has several outputs (for instance, all XxxForMultipleChoice models). This should unblock progress on #7032, where the start and end logits are both required.
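To illustrate the gathering problem this PR addresses, here is a minimal numpy sketch (not the actual diff; `nested_concat` below is a hypothetical stand-in for the tuple-aware prediction gathering this change adds around `trainer_utils.py`):

```python
import numpy as np

def nested_concat(batches):
    """Concatenate per-batch predictions along the first axis, whether each
    batch is a single array or a tuple of arrays (e.g. start/end logits)."""
    first = batches[0]
    if isinstance(first, tuple):
        # one concatenation per position in the output tuple
        return tuple(
            np.concatenate([b[i] for b in batches]) for i in range(len(first))
        )
    return np.concatenate(batches)

# A QA-style model emits (start_logits, end_logits) for every batch:
batches = [(np.zeros((2, 5)), np.ones((2, 5))) for _ in range(3)]
start_logits, end_logits = nested_concat(batches)
print(start_logits.shape, end_logits.shape)  # (6, 5) (6, 5)
```

Before this change, only the first element of each batch's output tuple survived the evaluation loop, so both logits could not be returned together.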

@codecov

codecov bot commented Sep 14, 2020

Codecov Report

Merging #7126 into master will increase coverage by 1.71%.
The diff coverage is 76.92%.


@@            Coverage Diff             @@
##           master    #7126      +/-   ##
==========================================
+ Coverage   79.12%   80.84%   +1.71%     
==========================================
  Files         168      168              
  Lines       32303    32305       +2     
==========================================
+ Hits        25560    26117     +557     
+ Misses       6743     6188     -555     
Impacted Files Coverage Δ
src/transformers/trainer.py 55.04% <70.00%> (+0.13%) ⬆️
src/transformers/trainer_utils.py 60.29% <100.00%> (ø)
src/transformers/modeling_tf_funnel.py 18.53% <0.00%> (-75.51%) ⬇️
src/transformers/modeling_marian.py 60.00% <0.00%> (-30.00%) ⬇️
src/transformers/activations.py 85.00% <0.00%> (-5.00%) ⬇️
src/transformers/generation_tf_utils.py 84.21% <0.00%> (-1.76%) ⬇️
src/transformers/modeling_bart.py 93.77% <0.00%> (-0.51%) ⬇️
src/transformers/modeling_tf_bert.py 98.38% <0.00%> (-0.36%) ⬇️
src/transformers/tokenization_utils_base.py 93.77% <0.00%> (-0.28%) ⬇️
src/transformers/file_utils.py 82.84% <0.00%> (-0.25%) ⬇️
... and 17 more


Legend
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 52d250f...6eb9b5b.

Member

@LysandreJik LysandreJik left a comment


Won't this fail if the model has output_attentions or output_hidden_states set to True in its configuration? The logits variable would then be made of every output, not only the first one. If these are tuples they can't be used with .detach, for one, but it wouldn't even make sense to take them into account for the loss computation.

This works only if the model outputs values that should only be used to compute the loss, right?
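A toy illustration of the failure mode described above (pure Python; `FakeTensor` is a stand-in for a real torch tensor, used only to show which outputs have a `.detach` method):

```python
# Stand-in for a torch tensor: it has .detach(), unlike a plain tuple.
class FakeTensor:
    def detach(self):
        return self

logits = FakeTensor()
# With output_attentions=True the model appends one tensor per layer,
# grouped inside a tuple -- and the tuple itself has no .detach().
attentions = (FakeTensor(), FakeTensor())

outputs_without_extras = (logits,)
outputs_with_extras = (logits, attentions)

print(all(hasattr(o, "detach") for o in outputs_without_extras))  # True
print(all(hasattr(o, "detach") for o in outputs_with_extras))     # False
```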

@sgugger
Collaborator Author

sgugger commented Sep 15, 2020

Yes, this doesn't work if the model has output_attentions or output_hidden_states set to True in the config: it needs the outputs to be a tuple of tensors with an optional loss at the beginning. Documenting this is next, when I make a template/protocol for the models that Trainer supports. I can add a clean error message if one output is detected to be a tuple of tensors, and add asserts that output_attentions and output_hidden_states are False in the config (if those arguments can be found).
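The guard described here could look roughly like the following sketch (illustrative only, not the merged code; `check_config_supported` and the `Cfg` class are hypothetical, though the attribute names mirror the real config flags):

```python
def check_config_supported(config):
    """Refuse configs whose extra outputs (attentions / hidden states)
    would break prediction gathering. Illustrative sketch only."""
    for attr in ("output_attentions", "output_hidden_states"):
        # only complain if the config defines the attribute and it is True
        if getattr(config, attr, False):
            raise ValueError(
                f"Trainer cannot gather predictions when `{attr}` is True; "
                "please set it to False in the model config."
            )

class Cfg:
    output_attentions = False
    output_hidden_states = True

try:
    check_config_supported(Cfg())
    raised = False
except ValueError:
    raised = True
print(raised)  # True
```

Using `getattr` with a default covers models whose config does not define these flags at all, matching the "if those arguments can be found" caveat.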

Member

@LysandreJik LysandreJik left a comment


Okay, that sounds good to me. I guess users could override the prediction_step anyway if they didn't want to take the full range of inputs.
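The override route mentioned here could look like this pure-Python sketch (`BaseTrainer` is a stand-in for `transformers.Trainer`; the real `prediction_step` takes more arguments and returns tensors, so this only illustrates the subclassing pattern):

```python
# Stand-in base class; the real Trainer.prediction_step signature differs.
class BaseTrainer:
    def prediction_step(self, model, inputs):
        loss, *logits = model(inputs)   # model returns (loss, out1, out2, ...)
        return loss, tuple(logits)

class FirstOutputTrainer(BaseTrainer):
    """Override that keeps only the first output, for users who do not
    want the full range of model outputs gathered."""
    def prediction_step(self, model, inputs):
        loss, logits = super().prediction_step(model, inputs)
        return loss, logits[:1]         # drop everything past the first output

fake_model = lambda inputs: (0.5, "start_logits", "end_logits")
loss, logits = FirstOutputTrainer().prediction_step(fake_model, None)
print(loss, logits)  # 0.5 ('start_logits',)
```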

@julien-c
Member

@sgugger are you saying that the eval in Trainer-backed https://github.com/huggingface/transformers/blob/master/examples/multiple-choice/run_multiple_choice.py is not currently working?

@LysandreJik
Member

I'm pretty sure he meant XxxForQuestionAnswering, not multiple choice

@julien-c
Member

> I'm pretty sure he meant XxxForQuestionAnswering, not multiple choice

Oh yes, makes sense now. Thanks @LysandreJik ;)

@sgugger
Collaborator Author

sgugger commented Sep 15, 2020

Yes @LysandreJik reads my thoughts right, sorry about the typo ;-)

Member

@LysandreJik LysandreJik left a comment


Nice assertions.

@sgugger sgugger merged commit 7186ca6 into master Sep 15, 2020
@sgugger sgugger deleted the multi_predictions_trainer branch September 15, 2020 14:27
@sgugger sgugger mentioned this pull request Sep 16, 2020
Zigur pushed a commit to Zigur/transformers that referenced this pull request Oct 26, 2020
* Allow multiple outputs

* Formatting

* Move the unwrapping before metrics

* Fix typo

* Add test for non-supported config options
