Add descriptive docstring to WhisperTimeStampLogitsProcessor #25642
ArthurZucker merged 18 commits into huggingface:main from jprivera44:main
Conversation
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint.
Hey! Thanks for opening a PR!
gante left a comment:
Hi @jprivera44 👋 Thank you for opening the PR :)
I have an important request here -- let's aim at having a short example with short outputs. ~20 lines of code is the sweet spot. Showing an example where we control returning the timestamps would be enough :)
Formatting: we only leave at most one line between example lines
Adding in suggested fix to the LogitProcessor description. Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
Removing tip per suggestion. Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
Removing redundant code per suggestion. Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
Hi @gante, thank you for your feedback :) I've made the suggested changes. However, I'm a little confused about returning the timestamps: the WhisperTimeStampLogitsProcessor only returns the scores, which then get transcribed into words. Would outputting the token timestamps be sufficient, since these are directly accessible within modeling_whisper.py? @ArthurZucker, I ran pytest --doctest-modules and all tests within the WhisperLogits passed. I noticed the recommendation for --doctest; does the transformers repo use a custom configuration that requires this argument?
@jprivera44 Since the last reviews, we've added this file to the list of files to be doctested in our PR CI. If you rebase with … Regarding the examples themselves: something is odd. We should be getting a timestamp at the start of the transcription, right @ArthurZucker?
ArthurZucker left a comment:
The timestamps are ignored because of `skip_special_tokens=True`; let's try to show them.
>>> # This allows the user to select a specific token to terminate the sequence on; in this case it's the word "poem" (21247)
>>> model.generation_config.eos_token_id = 21247
>>> generated_ids = model.generate(inputs=input_features, return_timestamps=True)
>>> transcription = processor.batch_decode(generated_ids, skip_special_tokens=True)[0]
Suggested change:
- >>> transcription = processor.batch_decode(generated_ids, skip_special_tokens=True)[0]
+ >>> transcription = processor.batch_decode(generated_ids, skip_special_tokens=False)[0]
Let's have one example with timestamps, no?
@jprivera44 let's a) limit the number of examples (one with and one without timestamps is enough); and b) use Arthur's suggestion here :)
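To make the point being discussed concrete, here is a minimal, self-contained sketch of why `skip_special_tokens=True` hides Whisper's timestamps: the tokenizer treats timestamp markers such as `<|0.00|>` as special tokens, so they are dropped during decoding unless special tokens are kept. The toy vocabulary and the `toy_decode` helper below are invented for illustration; this is not the transformers API itself.

```python
# Timestamp markers are registered as special tokens, so a decode that
# skips special tokens silently drops them. Token strings here are made up.
SPECIAL = {"<|startoftranscript|>", "<|0.00|>", "<|5.44|>", "<|endoftext|>"}

def toy_decode(tokens, skip_special_tokens=True):
    """Join tokens into a string, optionally dropping special tokens."""
    kept = [t for t in tokens if not (skip_special_tokens and t in SPECIAL)]
    return " ".join(kept)

generated = ["<|startoftranscript|>", "<|0.00|>", "Mr.", "Quilter", "<|5.44|>", "<|endoftext|>"]

print(toy_decode(generated, skip_special_tokens=True))   # timestamps disappear
print(toy_decode(generated, skip_special_tokens=False))  # timestamps are visible
```

This mirrors Arthur's suggestion above: passing `skip_special_tokens=False` to `batch_decode` is what makes the `<|t|>` markers show up in the transcription.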
@jprivera44 double-checking: you are still planning on iterating on this PR, correct? :)
Hello @gante, apologies, I was having some issues getting the timestamps to show up, but it's all fixed now. I've added the updated code we discussed, with two examples: one with timestamps and one without. This includes running pytest as @ArthurZucker mentioned. Please let me know what else is needed, and thank you again for the help!
Hello @gante, I hope you're well. Is there anything else needed on this from my end? I just want to make sure I've completed the requirements.
Hey, @gante is OOO for a while; I'll take this over
[`LogitsProcessor`] Whisper specific Processor. This processor is specifically designed for the Whisper Automatic Speech Recognition model. It facilitates the manipulation of log probabilities for a predefined list of tokens during text generation. By using this processor, you can effectively "force" certain tokens to be selected at specific positions in the generated sequence. When tokens are passed to this processor, their log probabilities are set to `inf` (infinity), ensuring that they are chosen at their corresponding indices.
This is more general but feels AI-generated and is partly wrong. If we add a docstring, let's explain in detail what the processor does!
Ah, I see what you mean. I'll go ahead and take a look to add more details and fix the mistakes.
@ArthurZucker I've updated the docstring; please let me know what you think. I appreciate the help on this.
>>> # No timestamps & change EOS:
>>> # This allows the user to select a specific token to terminate the sequence on; in this case it's the word "poem" (21247)
Looks like it was not stopped on poem, no?
Yes, the phrase stops on that token in order to showcase how changing the eos_token_id parameter changes the output of the transcription. If preferred, I can only showcase generating with timestamps vs. without timestamps.
My point is that the doc says it stops on poem.
Oh I see what you mean, and I just went through and made the fix.
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
ArthurZucker left a comment:
Hey! Again, let's make this helpful for users 😉
[`LogitsProcessor`] that modifies the logits for the generation of timestamps in the transcription. When the input tokens are at a specific threshold, the processor sets the scores to negative infinity and returned. The timestamps are processed to be in pairs except right before the end-of-sequence token. If the total probability of all timestamp tokens is greater than any individual non-timestamp token, the processor sets those non-timestamp logits
Suggested change:
- [`LogitsProcessor`] that modifies the logits for the generation of timestamps in the transcription. When the input tokens are at a specific threshold, the processor sets the scores to negative infinity and returned. The timestamps are processed to be in pairs except right before the end-of-sequence token. If the total probability of all timestamp tokens is greater than any individual non-timestamp token, the processor sets those non-timestamp logits
+ [`LogitsProcessor`] that modifies the logits for the generation of timestamps in the transcription. When the input tokens are at a specific threshold, the processor sets the scores to negative infinity. The processor makes sure that timestamp tokens appear in pair, by doing .... This is done because ...... It also ensure that when the predicted probability of sampling any of the timestamp token is greater than ........ This is done to ensure that.....
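One of the rules the suggested docstring describes is: if the summed probability of all timestamp tokens exceeds the probability of the single most likely text token, the processor masks the text tokens so a timestamp must be sampled. The sketch below is a hedged, pure-Python illustration of that one rule only, not the actual transformers implementation; the vocabulary split, ids, and probabilities are all made up.

```python
import math

# Toy vocabulary: ids >= TIMESTAMP_BEGIN are "timestamp" tokens, ids below
# are "text" tokens. All numbers and the split point are illustrative,
# not Whisper's real token ids.
TIMESTAMP_BEGIN = 4  # ids 4..7 are timestamps, 0..3 are text

def force_timestamp_if_likely(logprobs):
    """Mask text tokens to -inf when the total timestamp probability
    beats the best individual text token."""
    timestamp_logprob = math.log(sum(math.exp(lp) for lp in logprobs[TIMESTAMP_BEGIN:]))
    max_text_logprob = max(logprobs[:TIMESTAMP_BEGIN])
    if timestamp_logprob > max_text_logprob:
        return [-math.inf] * TIMESTAMP_BEGIN + list(logprobs[TIMESTAMP_BEGIN:])
    return logprobs

# Total timestamp mass (0.15 * 4 = 0.6) beats the best text token (0.25),
# so the text tokens get masked and a timestamp will be sampled next:
scores = [math.log(p) for p in [0.25, 0.05, 0.05, 0.05, 0.15, 0.15, 0.15, 0.15]]
masked = force_timestamp_if_likely(scores)
```

The real processor also enforces the pairing rule (timestamps come in begin/end pairs except right before the end-of-sequence token), which this sketch deliberately leaves out.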
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread. Please note that issues that do not follow the contributing guidelines are likely to be ignored.
Hi, sorry this got to stale status. @ArthurZucker, I got in an update on this! Thank you again.
ArthurZucker left a comment:
A lot better thanks 😉
No, thank you guys! Apologies this took a while!
…face#25642)

* adding in logit examples for Whisper processor
* adding in updated logits processor for Whisper
* adding in cleaned version of logits processor for Whisper
* adding docstrings for whisper processor
* making sure the formatting is correct
* adding logits after doc builder
* Update src/transformers/generation/logits_process.py
  Adding in suggested fix to the LogitProcessor description.
  Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
* Update src/transformers/generation/logits_process.py
  Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
* Update src/transformers/generation/logits_process.py
  Removing tip per suggestion.
  Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
* Update src/transformers/generation/logits_process.py
  Removing redundant code per suggestion.
  Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
* adding in revised version
* adding in version with timestamp examples
* Update src/transformers/generation/logits_process.py
  Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* enhanced paragraph on behavior of processor
* fixing doc quality issue
* removing the word poem from example
* adding in updated docstring
* adding in new version of file after doc-builder

---------

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
What does this PR do?
This PR adds docstrings that explain the usage of the arguments that can be passed to the Whisper logits processor.
Fixes #24783
Before submitting
Who can review?
@gante