Fixed incorrect normalization #40436
@remi-or I think you had used 256 on purpose? Can you check that this change isn't breaking anything?
@yonigozlan I think I tested it against
image = torch.where(image > 255, 255, image)
image = torch.where(image < 0, 0, image)
btw, it might be more optimal to use torch.clamp(image, 0, 255) once instead of torch.where twice
Normally I would agree! But this is on purpose: #38540
Ok, according to that PR it seems we have to revert this PR to use 256 and keep torch.where. To prevent this code from regressing again, we have to either add tests that fail on CI (cuda) if it is modified, or properly comment the code.
The regression this PR introduced already caused failures on the AMD CI, which is as important as NVIDIA (or cuda) CI!
As for properly commenting the code, both code paths where compile_friendly_resize is called are commented. You can check it out by expanding the diff, those lines are right above the function 🙂
If you want, we can add # this is to match torchvision.resize next to 256 and # We use torch.where instead of torch.clamp to avoid an error with torch.compile as comments to make sure no one will introduce the regression again. Wdyt?
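For illustration, the in-place comments proposed above might look like the sketch below. The helper name clip_to_uint8_range is hypothetical (it is not from the PR), and the sketch only checks that, on one sample tensor in eager mode, the two torch.where calls give the same values as torch.clamp. It does not reproduce the torch.compile issue from #38540.

```python
import torch


def clip_to_uint8_range(image: torch.Tensor) -> torch.Tensor:
    # Hypothetical helper, for illustration only.
    # We use torch.where instead of torch.clamp to avoid an error with torch.compile
    image = torch.where(image > 255, 255, image)
    image = torch.where(image < 0, 0, image)
    return image


# Sample values below, inside, and above the uint8 range.
sample = torch.tensor([-3.0, 0.0, 128.5, 255.0, 300.0])
clipped = clip_to_uint8_range(sample)

# In eager mode both formulations agree on this sample.
same_as_clamp = torch.equal(clipped, torch.clamp(sample, 0, 255))
print(clipped.tolist(), same_as_clamp)
```

Keeping the comment directly on the torch.where lines, rather than near the call sites, is the point being discussed: it makes the choice visible to anyone tempted to simplify it.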
@remi-or absolutely agree that AMD CI is as important as NVIDIA. What I'm trying to say is that we need a test that fails in the PR's CI, so that a PR like this one cannot be merged. As for comments, yeah, it's better to comment non-obvious code right in place; otherwise it looks like a typo, and it is easy to miss a comment located in a different part of the file (which is what happened in this case).
I'll do a quick fix for this, thanks for jumping in and clarifying 🤗
I've noticed a possible typo in src/transformers/image_processing_utils_fast.py#compile_friendly_resize: for uint8, the normalization is slightly off, using 256 instead of 255. It still works because it's done consistently (the image is normalized and denormalized the same way), but it is incorrect.
image = image.float() / 256
image = image * 256
The explanation is simple:
255 (the max value for a uint8) should map to 1., but it doesn't with the current implementation.
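The arithmetic behind this claim can be checked with a minimal pure-Python sketch. The helper names normalize and denormalize are made up for illustration and are not part of transformers. It shows where 255 lands under each divisor, and that dividing and multiplying by the same constant round-trips every uint8 value.

```python
UINT8_MAX = 255


def normalize(value: int, divisor: int) -> float:
    # Hypothetical helper: scale a uint8 value towards the [0, 1] range.
    return value / divisor


def denormalize(value: float, divisor: int) -> float:
    # Hypothetical helper: inverse of normalize for the same divisor.
    return value * divisor


top_with_256 = normalize(UINT8_MAX, 256)  # 0.99609375, i.e. 1 - 1/256
top_with_255 = normalize(UINT8_MAX, 255)  # 1.0

# 256 is a power of two, so the round trip is exact in floating point.
round_trip_256 = all(
    denormalize(normalize(v, 256), 256) == v for v in range(UINT8_MAX + 1)
)
# With 255 the round trip is checked after rounding to the nearest integer.
round_trip_255 = all(
    round(denormalize(normalize(v, 255), 255)) == v for v in range(UINT8_MAX + 1)
)

print(top_with_256, top_with_255, round_trip_256, round_trip_255)
```

So the divisor only changes where the maximum value lands in the normalized range; as long as the same constant is used in both directions, the uint8 values themselves are recovered.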