Conversation
|
The documentation is not available anymore as the PR was closed or merged. |
|
I took a quick look for now, and think it is indeed a nice addition, considering there is currently no example in |
ydshieh
left a comment
There was a problem hiding this comment.
Very nice & clean! I left a few tiny comments, but this PR is ready to be merged!
There was a problem hiding this comment.
Would be great to avoid import VisionEncoderDecoderModel twice
There was a problem hiding this comment.
Good catch! Removed it
There was a problem hiding this comment.
This is better to be after showing the dummy model, i.e. after the line
>>> model = VisionEncoderDecoderModel(encoder=encoder, decoder=decoder)
There was a problem hiding this comment.
Maybe add a comment here
>>> # init vision2text model with random weights
There was a problem hiding this comment.
Maybe put this line just before
>>> text = "hello world"
There was a problem hiding this comment.
just a nit: pixel_values before text 🙂
The reason is: pixel_values is the encoder part, and text is the decoder part. In encoder-decoder architecture, encoder is run before the decoder.
utils/documentation_tests.txt
Outdated
There was a problem hiding this comment.
nice ❤️!
Are you able to run the doctest locally and get it pass?
There was a problem hiding this comment.
Yep, runs fine locally! I ran the test again after addressing your comments
|
Addressed the comments and re-ran the test locally! Here's the failure: Essentially, without the additional line, the test consider that ``` is part of the expected output. But that additional line gets removed by make fixup. I haven't found a good way around this yet |
|
Those blank lines issues should be treated by the places regarding This means you don't need to add extra blank line in the file. Did you run |
|
Got it! I did run the script and it seems like it's not being applied to the trocr (but I can see the added lines on other model files). I'll debug that today |
|
OK, thank you. Don't hesitate to report it if you find this is some bug in |
Figured out the issue: there was some formatting issue in one of the docstring (see last commit) that prevented the script to correctly add a line. From splits were incorrect because of that, and the script wouldn't add an additional \n |
|
@ydshieh Is there anything else to do? I don't have writing access here, so I can't merge this |
There was a problem hiding this comment.
Hi, @arnaudstiegler,
I have left a few more comments: most of them are just some nits.
This PR is really ready, and I will merge once you can apply the suggestions 💯
Thank you so much!
There was a problem hiding this comment.
just a nit: pixel_values before text 🙂
The reason is: pixel_values is the encoder part, and text is the decoder part. In encoder-decoder architecture, encoder is run before the decoder.
There was a problem hiding this comment.
Would be great if you could add the expected loss here ❤️
There was a problem hiding this comment.
Added the loss. I actually changed the text to reflect the actual text on the image because the loss was super high with the dummy text (loss=22.00 vs loss=4.00). Let me know what's best
Also added rounding + setting the torch seed just in case
There was a problem hiding this comment.
Very nice for the change of target text! Awesome 👍
9466717 to
86c9c1c
Compare
|
Rebased on latest main and addressed the comments, the failing test is coming from main (failing on main as well) |
|
Hi, the failed check seems irrelevant to your PR. No need to fix it here 😃. I will give a final look and merge. Thank you 💗 |
|
Alright, thank you! |
|
By the way, may I wonder why you choose to display 3 decimal numbers for the loss value? I remember we use 2 decimal numbers in doc.py. I will check once I am available. |
|
Applied your changes, tested them locally, and ran make fixup as this was needed after the changes |
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
3d7134d to
3f6859d
Compare
|
If the remaining failed checks are build_pr_documentation and Add new model like template tests, you can keep the PR as it is. It is irrelevant I think. |
|
Ok, I tried rebasing on latest master and it doesn't seem to be doing the trick. Not sure what's causing that unfortunately, but unlikely to be due to the code changes I've done :) |
|
Thank you again @arnaudstiegler ! (also for your patience) I merge this PR now ❤️. |
Adding TrOCR to DocTests
For this model, there was actually a single docstring in the entire file (for the forward method).
A couple of comments:
Let me know if the docstring is relevant to the problem. I can also revert adapt it to actually just showcase the forward for the
TrOCRForCausalLMoutside of a VisionEncoderDecoder@patrickvonplaten @ydshieh @patil-suraj