Generate: deprecate the use of model `config` as a source of defaults

EDIT: Updated with the discussion up to [2022/08/20](https://github.com/huggingface/transformers/issues/18655#issuecomment-1221047772)

## Why?

A confusing part of `generate` is how the defaults are set. When an argument is not specified, we attempt to fetch it from the model `config` file. This makes `generate` unpredictable and hard to fully document (the default values change from model to model), and it is a major source of issues :hocho:

## How?

We have the following requirements:
1️⃣  The existing behavior can’t be removed, i.e., we must be able to use the model `config.json` as a source of generation parameters by default;
2️⃣  We do need per-model defaults -- some models are designed to do a certain thing (e.g. summarization), which requires a specific generation configuration;
3️⃣  Users must have full control over `generate`, with minimal hidden behavior.

Ideally, we also want to:
4️⃣  Have separation of concerns and use a new `generate_config.json` to parameterize generation;

The TL;DR of the plan is to change the paradigm from “non-specified `generate` arguments are overridden by the [model] configuration file” to “`generate` arguments will override the [generate] configuration file, which is always used”. With proper documentation changes and logging/warnings, the user will be aware of what's being set for `generate`.

### Step 1: Define a new generate config file and class

Similar to the model config, we want a `.json` file to store the generation defaults. The class itself can be a very simplified version of `PretrainedConfig`, also with functionality to load/store from the hub.
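
As a rough illustration of what such a class could look like, here is a minimal sketch. The class name `GenerateConfig`, its fields, and the `to_json_file`/`from_json_file` methods are assumptions made for this example, not an agreed-upon API, and Hub download/upload is left out entirely.

```python
import json
import os
import tempfile
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class GenerateConfig:
    """Hypothetical, stripped-down container for generation defaults."""

    max_length: Optional[int] = None
    num_beams: Optional[int] = None
    do_sample: Optional[bool] = None
    temperature: Optional[float] = None

    def to_json_file(self, path: str) -> None:
        # Store only the values that were explicitly set, so the file stays small.
        set_values = {k: v for k, v in asdict(self).items() if v is not None}
        with open(path, "w", encoding="utf-8") as f:
            json.dump(set_values, f, indent=2, sort_keys=True)

    @classmethod
    def from_json_file(cls, path: str) -> "GenerateConfig":
        # Keys missing from the file fall back to the dataclass defaults (None).
        with open(path, encoding="utf-8") as f:
            return cls(**json.load(f))


# Round trip through a temporary directory standing in for a model repo.
with tempfile.TemporaryDirectory() as repo_dir:
    config_path = os.path.join(repo_dir, "generate_config.json")
    GenerateConfig(max_length=20, num_beams=4).to_json_file(config_path)
    reloaded = GenerateConfig.from_json_file(config_path)
```

Leaving unset values as `None` keeps "not specified" distinguishable from "explicitly set", which matters once `generate` arguments start overriding the file.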

### Step 2: Integrate loading generate config file in `.from_pretrained()`

The generation configuration file should be loaded when initializing the model with a `from_pretrained()` method. A couple of things to keep in mind:
1. There will be a new `kwarg` in `from_pretrained`, `generate_config` (or `generation_config`? Leaning toward the former as it has the same name as the function);
2. It will default to `generate_config.json` (contrary to the model `config`, which defaults to `None`). This will allow users to set this argument to `None`, to load a model with an empty generate config. Some users have requested a feature like this;
3. Because the argument can take a path, it means that users can store/load multiple generate configs if they wish to do so (e.g. to use the same model for summarization, creative generation, factual question-answering, etc) 🚀
4. Only models that can run `generate` will attempt to load it;
5. If there is no `generate_config.json` in the repo, it will attempt to initialize the generate configuration from the model `config.json`. This means that this solution will not change any `generate` behavior and will NOT need a major release 👼
6. To keep the user in the loop, log ALL parameters set when loading the generation config file. Something like the snippet below.
7. Because this happens at `from_pretrained()` time, logging will happen at most once and will not be verbose.

```
`facebook/opt-1.3b` generate configuration loaded from `generate_config.json`. The following generation defaults were set:
- max_length: 20
- foo: bar
- baz: qux
```

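
The loading rules above could be sketched roughly as follows, for a local model folder only. The helper name `load_generate_defaults`, the `GENERATION_KEYS` tuple, and the exact log wording are assumptions for illustration; the real integration would live inside `from_pretrained()` and also handle Hub downloads.

```python
import json
import logging
import os
import tempfile
from typing import Any, Dict, Optional

logger = logging.getLogger(__name__)

# Assumed subset of generation-related keys that may live in the model `config.json`.
GENERATION_KEYS = ("max_length", "num_beams", "do_sample", "temperature")


def load_generate_defaults(
    repo_dir: str, generate_config: Optional[str] = "generate_config.json"
) -> Dict[str, Any]:
    """Hypothetical helper: resolve generation defaults for a local model folder."""
    if generate_config is None:
        # The user explicitly asked for an empty generate config.
        return {}

    path = os.path.join(repo_dir, generate_config)
    if os.path.isfile(path):
        source = generate_config
        with open(path, encoding="utf-8") as f:
            defaults = json.load(f)
    else:
        # Backward-compatible fallback: read generation keys from the model config.
        source = "config.json"
        with open(os.path.join(repo_dir, source), encoding="utf-8") as f:
            model_config = json.load(f)
        defaults = {k: model_config[k] for k in GENERATION_KEYS if k in model_config}

    # Log every default once, at load time.
    lines = [
        f"Generate configuration loaded from `{source}`. "
        "The following generation defaults were set:"
    ]
    lines += [f"- {key}: {value}" for key, value in defaults.items()]
    logger.info("\n".join(lines))
    return defaults


# A folder with only `config.json` still yields the same defaults as before.
with tempfile.TemporaryDirectory() as repo_dir:
    with open(os.path.join(repo_dir, "config.json"), "w", encoding="utf-8") as f:
        json.dump({"hidden_size": 768, "max_length": 20}, f)
    from_model_config = load_generate_defaults(repo_dir)
    empty = load_generate_defaults(repo_dir, generate_config=None)
```

Because the argument is a file name, pointing it at a different file in the same folder would cover the "multiple generate configs" use case without extra code.
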
### Step 3: Generate uses the generate config class internally

Instead of using the configuration to override arguments when they are not set, overwrite a copy of the generation config at `generate` time. I.e. instead of:

```
arg = arg if arg is not None else self.config.arg
...
```

do

```
generate_config = self.generate_config.copy()
if arg is not None:
    generate_config.arg = arg
...
```
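
A slightly fuller sketch of the same idea, assuming a hypothetical `GenerateConfig` dataclass and a toy model class (neither is the actual implementation). It shows that call-time arguments override a copy, so the defaults stored on the model are left untouched between calls.

```python
import copy
from dataclasses import dataclass
from typing import Optional


@dataclass
class GenerateConfig:
    """Hypothetical container for generation defaults."""

    max_length: Optional[int] = None
    num_beams: Optional[int] = None


class ToyModel:
    """Stand-in for a model; only the config handling is sketched."""

    def __init__(self, generate_config: GenerateConfig):
        self.generate_config = generate_config

    def generate(self, **kwargs) -> GenerateConfig:
        # Work on a copy so that call-time arguments never leak into the defaults.
        generate_config = copy.deepcopy(self.generate_config)
        for name, value in kwargs.items():
            if not hasattr(generate_config, name):
                raise TypeError(f"Unknown generate argument: {name}")
            if value is not None:
                setattr(generate_config, name, value)
        # A real implementation would run decoding here; the sketch returns
        # the resolved config to make the override visible.
        return generate_config


model = ToyModel(GenerateConfig(max_length=20, num_beams=1))
resolved = model.generate(num_beams=4)
```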

This change has three main benefits:
1. We can improve the readability of the code, as we gain the ability to pass configs around. E.g. [this function](https://github.com/huggingface/transformers/blob/30992ef0d911bdeca425969d210771118a5cd1ac/src/transformers/generation_utils.py#L674) won't need to take a large list of arguments nor to bother with their initialization.
2. `generate` argument validation *for each type of generation* can be built in simple functions that don't need ~30 arguments as input 🙃
3. The three frameworks (PT/TF/FLAX) can share functionality like argument validation, decreasing maintenance burden.
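
To make the validation point concrete, here is a hedged sketch of per-mode validators that take a single config object. The `GenerateConfig` dataclass, the function names, and the specific checks are illustrative assumptions, not the library's actual validation logic.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class GenerateConfig:
    """Hypothetical container; field names mirror existing `generate` arguments."""

    max_length: Optional[int] = None
    num_beams: int = 1
    do_sample: bool = False
    temperature: float = 1.0
    num_return_sequences: int = 1


def validate_beam_search(config: GenerateConfig) -> List[str]:
    """Return a list of problems for beam search; an empty list means valid."""
    problems = []
    if config.num_beams < 2:
        problems.append("beam search needs `num_beams` > 1")
    if config.num_return_sequences > config.num_beams:
        problems.append("`num_return_sequences` cannot exceed `num_beams`")
    return problems


def validate_sampling(config: GenerateConfig) -> List[str]:
    """Return a list of problems for sampling; an empty list means valid."""
    problems = []
    if not config.do_sample:
        problems.append("sampling needs `do_sample=True`")
    if config.temperature <= 0:
        problems.append("`temperature` must be strictly positive")
    return problems


beam_problems = validate_beam_search(GenerateConfig(num_beams=2, num_return_sequences=3))
sampling_problems = validate_sampling(GenerateConfig(do_sample=True, temperature=0.7))
```

Since these functions only touch plain Python values, the same code could in principle be shared by the PT, TF, and Flax implementations.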

### Step 4: Document and open PRs with the generation config file

Rewrite part of the documentation to explain that a generation config is ALWAYS used (regardless of having defaults loaded from the hub or not). Open Hub PRs to move generate-specific parameters from `config.json` to `generate_config.json`.

## Pros/Cons

Pros:
- Better awareness -- any `generate` default will be logged to the screen when loading a generate-compatible model;
- Full control -- the users can choose NOT to load generation parameters or easily load a set of options from an arbitrary file;
- Enables more readable `generate` code;
- Enables sharing `generate`-related code across frameworks;
- Doesn't need a major release.

Cons:
- Pulling the generate parameters into their own files won't happen everywhere, as merging the changes described in step 4 is not feasible for all models (e.g. due to unresponsive model owners);
- Logging loaded defaults may not be enough to stop issues related to the default values, as the logs can be ignored;
- Another config file (and related code) to maintain.
