Description
EDIT: Updated with the discussion up to 2022/08/20
Why?
A confusing part of generate is how the defaults are set. When a certain argument is not specified, we attempt to fetch it from the model config file. This makes generate unpredictable and hard to fully document (the default values change for each model), as well as a major source of issues 🔪
How?
We have the following requirements:
1️⃣ The existing behavior can’t be removed, i.e., we must be able to use the model config.json as a source of generation parameters by default;
2️⃣ We do need per-model defaults -- some models are designed to do a certain thing (e.g. summarization), which requires a specific generation configuration.
3️⃣ Users must have full control over generate, with minimal hidden behavior.
Ideally, we also want to:
4️⃣ Have separation of concerns and use a new generate_config.json to parameterize generation;
The TL;DR of the plan is to change the paradigm from “non-specified generate arguments are overridden by the [model] configuration file” to “generate arguments will override the [generate] configuration file, which is always used”. With proper documentation changes and logging/warnings, the user will be aware of what's being set for generate.
Step 1: Define a new generate config file and class
Similar to the model config, we want a .json file to store the generation defaults. The class itself can be a very simplified version of PretrainedConfig, also with functionality to load/store from the hub.
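As a rough illustration of what such a class could look like, here is a minimal sketch. The class name GenerateConfig and its save/load methods are hypothetical and are not an existing transformers API; the sketch only covers local files and leaves out Hub upload/download.

```python
import json
from pathlib import Path


class GenerateConfig:
    """Hypothetical, minimal container for generation defaults."""

    def __init__(self, **kwargs):
        # Every keyword becomes an attribute, e.g. max_length=20.
        for key, value in kwargs.items():
            setattr(self, key, value)

    def to_dict(self):
        # Plain dict view of the stored defaults.
        return dict(vars(self))

    def save(self, directory, filename="generate_config.json"):
        # Write the defaults as JSON next to the model files.
        path = Path(directory) / filename
        path.write_text(json.dumps(self.to_dict(), indent=2, sort_keys=True))
        return path

    @classmethod
    def load(cls, directory, filename="generate_config.json"):
        # Read a previously stored file back into a config object.
        path = Path(directory) / filename
        return cls(**json.loads(path.read_text()))
```

Because the file name is a parameter, the same model directory could hold several generate configs (one per task, for instance).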
Step 2: Integrate loading generate config file in .from_pretrained()
The generation configuration file should be loaded when initializing the model with a from_pretrained() method. A couple of things to keep in mind:
- There will be a new kwarg in from_pretrained, generate_config (or generation_config? Leaning toward the former as it has the same name as the function);
- It will default to generate_config.json (unlike the model config, which defaults to None). This will allow users to set this argument to None, to load a model with an empty generate config. Some users have requested a feature like this;
- Because the argument can take a path, users can store/load multiple generate configs if they wish to do so (e.g. to use the same model for summarization, creative generation, factual question-answering, etc) 🚀
- Only models that can run generate will attempt to load it;
- If there is no generate_config.json in the repo, it will attempt to initialize the generate configuration from the model config.json. This means that this solution will not change any generate behavior and will NOT need a major release 👼
- To keep the user in the loop, log ALL parameters set when loading the generation config file. Something like the snippet below.
- Because this happens at from_pretrained() time, logging will happen at most once and will not be verbose.
`facebook/opt-1.3b` generate configuration loaded from `generate_config.json`. The following generation defaults were set:
- max_length: 20
- foo: bar
- baz: qux
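The loading rules above could be sketched roughly as follows. The function load_generate_defaults and the GENERATE_KEYS tuple are hypothetical names used only for illustration, the sketch works on a local directory instead of a Hub repo, and it assumes config.json exists whenever the generate config file is missing.

```python
import json
import logging
from pathlib import Path

logger = logging.getLogger(__name__)

# Hypothetical subset of generation-related keys; a real list would be longer.
GENERATE_KEYS = ("max_length", "num_beams", "do_sample", "temperature")


def load_generate_defaults(repo_dir, generate_config="generate_config.json"):
    """Return a dict of generation defaults for the model stored in repo_dir."""
    repo_dir = Path(repo_dir)
    if generate_config is None:
        # The user explicitly asked for an empty generate config.
        return {}
    path = repo_dir / generate_config
    if path.is_file():
        defaults, source = json.loads(path.read_text()), generate_config
    else:
        # Fallback: keep the existing behavior by reading the model config.
        model_config = json.loads((repo_dir / "config.json").read_text())
        defaults = {k: model_config[k] for k in GENERATE_KEYS if k in model_config}
        source = "config.json"
    # Log every default once, at load time.
    lines = [
        f"`{repo_dir.name}` generate configuration loaded from `{source}`. "
        "The following generation defaults were set:"
    ]
    lines += [f"- {key}: {value}" for key, value in defaults.items()]
    logger.info("\n".join(lines))
    return defaults
```

Passing a different file name as generate_config would select one of several stored configs, and passing None skips loading entirely.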
Step 3: Generate uses the generate config class internally
Instead of using the configuration to override arguments when they are not set, overwrite a copy of the generation config at generate time. I.e. instead of:
arg = arg if arg is not None else self.config.arg
...
do
generate_config = self.generate_config.copy()
if arg is not None:
    generate_config.arg = arg
...
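A slightly fuller sketch of this override pattern, generalized to any number of arguments, is shown here. GenerateConfig is a hypothetical stand-in for the proposed class and resolve_generate_config is an illustrative helper, not part of any existing API.

```python
import copy


class GenerateConfig:
    """Hypothetical stand-in for the proposed generate config class."""

    def __init__(self, **kwargs):
        self.__dict__.update(kwargs)


def resolve_generate_config(default_config, **overrides):
    """Return a copy of default_config updated with the non-None overrides."""
    config = copy.deepcopy(default_config)  # never mutate the stored defaults
    for name, value in overrides.items():
        if value is not None:
            setattr(config, name, value)
    return config


defaults = GenerateConfig(max_length=20, num_beams=1)
# max_length is overridden for this call; num_beams=None keeps the default.
resolved = resolve_generate_config(defaults, max_length=50, num_beams=None)
```

One consequence of treating None as "not specified" is that a caller cannot use None to unset a default; a dedicated sentinel value would be needed for that.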
This change has three main benefits:
- We can improve the readability of the code, as we gain the ability to pass configs around. E.g. this function won't need to take a large list of arguments nor to bother with their initialization.
- Validation of generate arguments for each type of generation can be built in simple functions that don't need ~30 arguments as input 🙃
- The three frameworks (PT/TF/FLAX) can share functionality like argument validation, decreasing maintenance burden.
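To make the validation benefit concrete, here is a minimal sketch of validators that receive a single config object. The GenerateConfig dataclass, its fields, and the specific checks are illustrative assumptions, not the library's actual validation rules.

```python
from dataclasses import dataclass


@dataclass
class GenerateConfig:
    """Hypothetical config; the fields are a small illustrative subset."""

    max_length: int = 20
    num_beams: int = 1
    do_sample: bool = False
    temperature: float = 1.0


def validate_common(config):
    """Checks that apply to every generation mode."""
    if config.max_length <= 0:
        raise ValueError("max_length must be a positive integer")
    if config.num_beams < 1:
        raise ValueError("num_beams must be at least 1")


def validate_sampling(config):
    """Extra checks that only matter when sampling is enabled."""
    validate_common(config)
    if config.do_sample and config.temperature <= 0:
        raise ValueError("temperature must be strictly positive when sampling")
```

Since these functions depend only on the config object and not on tensors, the same code could in principle be reused by the PT, TF, and FLAX implementations.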
Step 4: Document and open PRs with the generation config file
Rewrite part of the documentation to explain that a generation config is ALWAYS used (regardless of having defaults loaded from the hub or not). Open Hub PRs to pull generate-specific parameters from config.json to generate_config.json.
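The file produced for such a PR could be generated with a small script along these lines. The helper name and the GENERATE_KEYS tuple are hypothetical. As an assumption made for backward compatibility, the sketch copies the parameters and leaves config.json untouched; whether they should also be removed from config.json is not settled here.

```python
import json
from pathlib import Path

# Hypothetical subset of generation-related keys; a real list would be longer.
GENERATE_KEYS = ("max_length", "num_beams", "do_sample", "temperature")


def extract_generate_config(repo_dir):
    """Write generate_config.json from the generate keys found in config.json."""
    repo_dir = Path(repo_dir)
    model_config = json.loads((repo_dir / "config.json").read_text())
    # Keep only the generation-related entries that are actually present.
    generate_params = {k: model_config[k] for k in GENERATE_KEYS if k in model_config}
    out_path = repo_dir / "generate_config.json"
    out_path.write_text(json.dumps(generate_params, indent=2, sort_keys=True))
    return generate_params
```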
Pros/Cons
Pros:
- Better awareness -- any generate default will be logged to the screen when loading a generate-compatible model;
- Full control -- the users can choose NOT to load generation parameters or easily load a set of options from an arbitrary file;
- Enables more readable generate code;
- Enables sharing generate-related code across frameworks;
- Doesn't need a major release.
Cons:
- Pulling the generate parameters into their own files won't happen everywhere, as merging the changes described in step 4 is not feasible for all models (e.g. due to unresponsive model owners);
- Logging loaded defaults may not be enough to stop issues related to the default values, as the logs can be ignored;
- Another config file (and related code) to maintain.