🚀 Feature request
Motivation
I noticed that some users are pretty confused when reading the source code about the variable `attention_mask`, for example:
What is the meaning of Attention Mask #205
Clarifying attention mask #542
I referred to the original BERT repository (google-research/bert) for comparison. Compared to the original, I find that in this repo the concepts of `attention_mask` and `adder` are sometimes mixed.
Referring to the original BERT: ./modeling.py#L707
```python
attention_mask = tf.expand_dims(attention_mask, axis=[1])
adder = (1.0 - tf.cast(attention_mask, tf.float32)) * -10000.0
attention_scores += adder
```
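The reason this additive form masks attention is that -10000 acts as an effective negative infinity under softmax. A minimal runnable sketch (the toy scores and shapes below are made up for illustration, not taken from either repo):

```python
import tensorflow as tf

# Toy scores: batch 1, 1 head, query length 1, key length 4.
attention_scores = tf.constant([[[[2.0, 1.0, 3.0, 0.5]]]])
# 1 = real token, 0 = padding.
attention_mask = tf.constant([[1, 1, 1, 0]])

# The BERT recipe: broadcast the mask, then turn it into an additive bias.
mask = tf.cast(attention_mask[:, tf.newaxis, tf.newaxis, :], tf.float32)
adder = (1.0 - mask) * -10000.0
attention_scores += adder

print(tf.nn.softmax(attention_scores, axis=-1))
# The padded position gets ~0 probability; the adder, not the 0/1 mask,
# is what the attention computation actually consumes.
```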
But in this repo, take src/transformers/modeling_tf_openai.py#L282 as an example:
```python
attention_mask = attention_mask[:, tf.newaxis, tf.newaxis, :]
attention_mask = tf.cast(attention_mask, tf.float32)
attention_mask = (1.0 - attention_mask) * -10000.0
```
and inside the method `TFAttention._attn()` (src/transformers/modeling_tf_openai.py#L112):
```python
if attention_mask is not None:
    # Apply the attention mask
    w = w + attention_mask
```
Your contribution
Maybe changing its name would be better, like:
```python
attention_mask = attention_mask[:, tf.newaxis, tf.newaxis, :]
attention_mask = tf.cast(attention_mask, tf.float32)
adder = (1.0 - attention_mask) * -10000.0
```
and then:
```python
if adder is not None:
    # Apply the attention mask
    attention_score = w + adder
```
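In context, the rename might read like this. This is a simplified standalone sketch, not the real `TFAttention._attn()` (which takes more arguments and also applies scaling, dropout, and head masking):

```python
import tensorflow as tf

def attn(q, k, v, adder=None):
    # q, k, v: [batch, heads, seq_len, head_dim]
    # adder: precomputed (1.0 - attention_mask) * -10000.0, or None
    w = tf.matmul(q, k, transpose_b=True)
    if adder is not None:
        # The name now says what the tensor holds: an additive bias,
        # not the original 0/1 attention mask.
        w = w + adder
    w = tf.nn.softmax(w, axis=-1)
    return tf.matmul(w, v)
```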