Conversation
The documentation is not available anymore as the PR was closed or merged. |
No response, but we should probably merge anyway. Pinging @amyeroberts for core maintainer review!
amyeroberts
left a comment
Thanks for fixing this!
Just a question about the checks on the values the inputs can have in _prepare_decoder_attention_mask.
_, seq_length = input_shape
tf.debugging.assert_equal(
    seq_length + past_key_values_length,
    shape_list(attention_mask)[1],
Is this check robust? From the diff, it looks like attention_mask can be None.
Yes! The TFOPTDecoder layer checks for None attention masks and replaces them with tf.ones, and that happens before _prepare_decoder_attention_mask is called. As a result, the earlier code had an "if attention_mask is not None" branch that was always taken.
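To illustrate the pattern described above, here is a minimal sketch (not the actual transformers code): a None attention mask is defaulted to all-ones before the shape check, so the check always runs on a real mask. The function name and the list-of-lists stand-in for a tensor are assumptions for illustration; the real code uses tf.ones and tf.debugging.assert_equal.

```python
def prepare_decoder_attention_mask(attention_mask, input_shape, past_key_values_length):
    """Hypothetical sketch of the None-default + shape-check pattern."""
    batch_size, seq_length = input_shape
    if attention_mask is None:
        # Mirrors the tf.ones default in TFOPTDecoder: an all-ones mask
        # covering both the cached and the new positions.
        attention_mask = [
            [1] * (seq_length + past_key_values_length) for _ in range(batch_size)
        ]
    # Mirrors tf.debugging.assert_equal on the mask's second dimension.
    mask_length = len(attention_mask[0])
    assert mask_length == seq_length + past_key_values_length, (
        f"Attention mask should be of size {seq_length + past_key_values_length}, "
        f"but is {mask_length}"
    )
    return attention_mask

# With a None mask, a default covering past + current positions is built.
mask = prepare_decoder_attention_mask(None, (2, 3), past_key_values_length=1)
```

Because the default is built first, the shape assertion never has to handle a None input, which is why the old "is not None" branch was dead code.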
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
@amyeroberts Sorry for the delay, I lost track of this one!
* stash commit
* More OPT updates
* Update src/transformers/models/opt/modeling_tf_opt.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

---------

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
With apologies for the delay, this PR should hopefully resolve the issues in #24637. @abb128 can you please try installing from this PR and verify if it resolves your issues? You can install from this PR with:

pip install --upgrade git+https://github.com/huggingface/transformers.git@tf_opt_fixes

Fixes #24637