Context
`.generate()` can be extensively manipulated through `LogitsProcessor` (and `LogitsWarper`) classes. Those classes are the implementation behind flags like `temperature` or `top_k`.
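To see what those flags do under the hood, here is a minimal, framework-free sketch of the transformations behind `temperature` and `top_k`. The function names are illustrative only, not the `transformers` API; the real classes operate on PyTorch tensors of shape `(batch, vocab)`:

```python
import math

def temperature_warp(logits, temperature):
    """Mimics the `temperature` flag: scores are divided by the
    temperature before softmax (higher -> flatter distribution)."""
    return [score / temperature for score in logits]

def top_k_warp(logits, k, filter_value=-math.inf):
    """Mimics the `top_k` flag: everything outside the k highest-scoring
    tokens is masked out so it can never be sampled."""
    threshold = sorted(logits, reverse=True)[k - 1]
    return [score if score >= threshold else filter_value for score in logits]

logits = [2.0, 1.0, 0.5, -1.0]
print(temperature_warp(logits, 2.0))  # [1.0, 0.5, 0.25, -0.5]
print(top_k_warp(logits, 2))          # [2.0, 1.0, -inf, -inf]
```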
Most of our `LogitsProcessor` classes have a docstring that briefly describes their effect. However, unless you are an expert in text generation, it's hard to fully grasp the impact of using each class. In some cases, it is also non-trivial to prepare the arguments to initialize the `LogitsProcessor` class. As such, each class should have a clear usage example with `.generate()` in their docstring! 💪
Here is an example: the `SequenceBiasLogitsProcessor` docstring. Contrary to the other classes (at the time of writing), we can quickly learn how to use it just by reading its docstring. We are also immediately aware of a few caveats 🤓
Bonus points: our docstring examples are part of our CI, so we would be beefing up our tests to ensure we don't add regressions 🤗
This issue is part of the text generation docs rework.
How to participate?
- Ensure you've read our contributing guidelines 📜
- Claim your `LogitsProcessor` class in this thread (confirm no one is working on it). You can check the full list of classes below, and you can find their implementation in this file 🎯
  - You may need to do some detective work to fully understand the purpose of the class. For instance, some classes were created as part of a paper to be applied to any model, others are model-specific, and some exist to avoid weird bugs 🕵️
  - Looking at the git history is a great way to understand how a `LogitsProcessor` came to be.
- Implement your changes, taking the `SequenceBiasLogitsProcessor` docstring as reference 💪
  - Add a clear example that calls the processor through `.generate()`. Make sure the example's outputs are correct and that the model used in the test is a small model (anything larger than GPT2 needs explicit approval);
  - If you feel like the original docstring could be better, feel free to enhance it as well!
  - Don't forget to run `make fixup` before your final commit.
- Open the PR and tag me in it 🎊
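Because docstring examples run in the CI as doctests, every code line must carry the doctest prompt. The skeleton below is a hypothetical illustration of the expected shape (the model name, prompt, and flag are placeholders, not the real content of any docstring):

```python
# Hypothetical skeleton for the "Examples:" block of a processor docstring.
# The real example must trigger the processor being documented and
# reproduce `generate`'s actual output.
DOCSTRING_EXAMPLE = """
Examples:

>>> from transformers import AutoTokenizer, AutoModelForCausalLM
>>> tokenizer = AutoTokenizer.from_pretrained("gpt2")
>>> model = AutoModelForCausalLM.from_pretrained("gpt2")
>>> inputs = tokenizer(["A number sequence: 1, 2"], return_tensors="pt")
>>> # the flag below maps to the processor being documented
>>> outputs = model.generate(**inputs, do_sample=True, temperature=0.7)
"""

# Every code line needs the doctest prompt, or the CI will not run it.
assert all(
    line.startswith((">>>", "..."))
    for line in DOCSTRING_EXAMPLE.strip().splitlines()[2:]
)
```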
Tracker
- MinNewTokensLengthLogitsProcessor
- TemperatureLogitsWarper
- RepetitionPenaltyLogitsProcessor
- EncoderRepetitionPenaltyLogitsProcessor
- TopPLogitsWarper
- TopKLogitsWarper
- TypicalLogitsWarper
- EpsilonLogitsWarper
- EtaLogitsWarper
- NoRepeatNGramLogitsProcessor
- EncoderNoRepeatNGramLogitsProcessor
- SequenceBiasLogitsProcessor
- NoBadWordsLogitsProcessor
- PrefixConstrainedLogitsProcessor
- HammingDiversityLogitsProcessor
- ForcedBOSTokenLogitsProcessor
- ForcedEOSTokenLogitsProcessor
- InfNanRemoveLogitsProcessor
- ExponentialDecayLengthPenalty
- LogitNormalization
- SuppressTokensAtBeginLogitsProcessor
- SuppressTokensLogitsProcessor
- ForceTokensLogitsProcessor
- WhisperTimeStampLogitsProcessor
- ClassifierFreeGuidanceLogitsProcessor
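As a taste of the detective work involved, here is a plain-Python sketch of the idea behind one tracker item, `TopPLogitsWarper` (nucleus filtering): keep the smallest set of highest-probability tokens whose cumulative probability reaches `top_p`, and mask the rest. This is an assumption-laden illustration, not the library's tensor implementation:

```python
import math

def softmax(logits):
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def top_p_warp(logits, top_p, filter_value=-math.inf):
    """Sketch of nucleus filtering: accumulate probability mass over
    tokens in descending score order, stop once it reaches top_p, and
    mask every token outside that nucleus."""
    order = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)
    probs = softmax(logits)
    kept, cumulative = set(), 0.0
    for i in order:
        kept.add(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    return [logits[i] if i in kept else filter_value for i in range(len(logits))]

logits = [3.0, 1.0, 0.0, -2.0]
print(top_p_warp(logits, 0.9))  # [3.0, 1.0, -inf, -inf]
```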