Generate: have an example on each LogitsProcessor class docstring #24783

@gante

Context

.generate() can be extensively manipulated through LogitsProcessor (and LogitsWarper) classes. Those classes are the code implementation behind flags like temperature or top_k.
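To make the idea concrete, here is a minimal sketch in plain Python of what a temperature step and a top-k step do to a vector of next-token logits. It is not the library's implementation: the function names `apply_temperature`, `apply_top_k`, and `softmax` are made up for illustration, and the real classes operate on batched tensors rather than lists.

```python
import math


def apply_temperature(logits, temperature):
    """Divide every logit by the temperature; values below 1 sharpen the distribution."""
    return [score / temperature for score in logits]


def apply_top_k(logits, top_k):
    """Keep the top_k highest logits and mask the rest with -inf.

    Ties at the threshold are all kept in this simplified version.
    """
    threshold = sorted(logits, reverse=True)[top_k - 1]
    return [score if score >= threshold else -math.inf for score in logits]


def softmax(logits):
    """Turn logits into probabilities; masked (-inf) entries get probability 0."""
    peak = max(logits)
    exps = [math.exp(score - peak) for score in logits]
    total = sum(exps)
    return [value / total for value in exps]


# Hypothetical logits for a four-token vocabulary.
logits = [2.0, 1.0, 0.5, -1.0]

# Chain the two steps, the way several processors are applied one after another.
processed = apply_top_k(apply_temperature(logits, temperature=0.5), top_k=2)
probs = softmax(processed)

print(processed)
print(probs)
```

After both steps only the two highest-scoring tokens can be sampled, and the lower temperature widens the gap between them.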

Most of our LogitsProcessor classes have a docstring that briefly describes their effect. However, unless you are an expert in text generation, it's hard to fully grasp the impact of using each class. In some cases, it is also non-trivial to prepare the arguments needed to initialize the LogitsProcessor class. As such, each class should have a clear usage example with .generate() in its docstring! 💪

Here is an example: SequenceBiasLogitsProcessor docstring. Unlike the other classes (at the time of writing), we can quickly learn how to use it just by reading its docstring. We are also immediately aware of a few caveats 🤓

Bonus points: our docstring examples are part of our CI, so we would be beefing up our tests to ensure we don't add regressions 🤗

This issue is part of the text generation docs rework.

How to participate?

  1. Ensure you've read our contributing guidelines 📜
  2. Claim your LogitsProcessor class in this thread (confirm no one is working on it). You can check the full list of classes below, and you can find their implementation in this file 🎯
    • You may need to do some detective work to fully understand the purpose of the class. For instance, some classes were created as part of a paper to be applied to any model, others are model-specific, and some exist to avoid weird bugs 🕵️
    • Looking at the git history is a great way to understand how a LogitsProcessor came to be.
  3. Implement your changes, taking the SequenceBiasLogitsProcessor docstring as reference 💪
    • Add a clear example that calls the processor through .generate(). Make sure the example's outputs are correct and that the example uses a small model (anything larger than GPT2 needs explicit approval).
    • If you feel like the original docstring could be better, feel free to enhance it as well!
    • Don't forget to run make fixup before your final commit.
  4. Open the PR and tag me in it 🎊
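As a rough illustration of the docstring shape asked for in step 3, here is a toy class whose docstring carries an executable example, checked with Python's standard `doctest` module. Everything in it is hypothetical: `ToyMinLengthProcessor` is not a class from the library, it works on plain lists instead of tensors, and it is called directly rather than through .generate(). A real contribution would follow the SequenceBiasLogitsProcessor docstring instead.

```python
import doctest


class ToyMinLengthProcessor:
    """Hypothetical processor that blocks the end-of-sequence token until
    `min_length` tokens have been generated. Toy stand-in, not a library class.

    Examples:

    >>> processor = ToyMinLengthProcessor(min_length=3, eos_token_id=0)
    >>> processor([5, 7], [1.0, 2.0, 3.0])
    [-inf, 2.0, 3.0]
    >>> processor([5, 7, 9], [1.0, 2.0, 3.0])
    [1.0, 2.0, 3.0]
    """

    def __init__(self, min_length, eos_token_id):
        self.min_length = min_length
        self.eos_token_id = eos_token_id

    def __call__(self, input_ids, scores):
        # Copy so the caller's list is left untouched.
        scores = list(scores)
        if len(input_ids) < self.min_length:
            # Too short to stop: make the EOS token impossible to pick.
            scores[self.eos_token_id] = float("-inf")
        return scores


# Run the docstring examples the way a doctest-based CI job would.
parser = doctest.DocTestParser()
test = parser.get_doctest(
    ToyMinLengthProcessor.__doc__,
    {"ToyMinLengthProcessor": ToyMinLengthProcessor},
    "ToyMinLengthProcessor",
    None,
    0,
)
runner = doctest.DocTestRunner(verbose=False)
results = runner.run(test)
print(results)
```

Because the expected outputs sit in the docstring itself, a change that alters the processor's behavior makes the example fail, which is the regression protection mentioned above.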

Tracker

  • MinNewTokensLengthLogitsProcessor
  • TemperatureLogitsWarper
  • RepetitionPenaltyLogitsProcessor
  • EncoderRepetitionPenaltyLogitsProcessor
  • TopPLogitsWarper
  • TopKLogitsWarper
  • TypicalLogitsWarper
  • EpsilonLogitsWarper
  • EtaLogitsWarper
  • NoRepeatNGramLogitsProcessor
  • EncoderNoRepeatNGramLogitsProcessor
  • SequenceBiasLogitsProcessor
  • NoBadWordsLogitsProcessor
  • PrefixConstrainedLogitsProcessor
  • HammingDiversityLogitsProcessor
  • ForcedBOSTokenLogitsProcessor
  • ForcedEOSTokenLogitsProcessor
  • InfNanRemoveLogitsProcessor
  • ExponentialDecayLengthPenalty
  • LogitNormalization
  • SuppressTokensAtBeginLogitsProcessor
  • SuppressTokensLogitsProcessor
  • ForceTokensLogitsProcessor
  • WhisperTimeStampLogitsProcessor
  • ClassifierFreeGuidanceLogitsProcessor
