| If `do_resize` is `True`, the image is resized to a size that is a multiple of this value. Can be overidden | ||
| by `ensure_multiple_of` in `preprocess`. | ||
| resample (`PILImageResampling`, *optional*, defaults to `PILImageResampling.BILINEAR`): | ||
| resample (`PILImageResampling`, *optional*, defaults to `PILImageResampling.BICUBIC`): |
This is a (slight) breaking change to make sure the same interpolation method is used as in the original implementation. However, Pillow's BICUBIC method does not 100% match the one of OpenCV :/ cc @amyeroberts
There's never 1:1 correspondence 😢
This is OK as saved models will have the resampling filter saved in the preprocessor config, and as you say it brings it in line with the original
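To illustrate why changing the default is only a slight break, here is a minimal sketch with a toy processor. `ToyImageProcessor` and its JSON layout are hypothetical stand-ins, not the real `DPTImageProcessor`; the only borrowed facts are Pillow's integer codes for the two filters.

```python
import json

# Pillow's integer codes for the two resampling filters discussed above.
BILINEAR, BICUBIC = 2, 3


class ToyImageProcessor:
    """Hypothetical stand-in for an image processor whose default filter changed."""

    def __init__(self, resample=BICUBIC):  # new default after the change
        self.resample = resample

    def to_json(self):
        # The filter is written out explicitly, so it travels with the checkpoint.
        return json.dumps({"resample": self.resample})

    @classmethod
    def from_json(cls, text):
        return cls(**json.loads(text))


# A config saved before the default changed still carries BILINEAR.
old_checkpoint = json.dumps({"resample": BILINEAR})
loaded = ToyImageProcessor.from_json(old_checkpoint)

# Only processors created without a saved config pick up the new default.
fresh = ToyImageProcessor()
```

Under these assumptions, only code that instantiates the processor with no saved config sees different behaviour.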
I've split up the PR into smaller pieces; see above for the first one.
amyeroberts left a comment
Thanks for the work adding this!
Did a high-level review as the PR isn't in a finished state yet. At the moment, the changes to the Beit config and data2vec model can't be merged in as they're breaking. Beit is also one of our most popular models, so it's important for us to get this right.
| drop_path_rate=0.1, | ||
| use_mean_pooling=True, | ||
| out_indices=[3, 5, 7, 11], | ||
| semantic_out_indices=[3, 5, 7, 11], |
We can't make this change because of backwards compatibility. If someone loads in their config and they don't have the default value for out_indices then their model's behaviour will have changed. Moreover, if they try to set or change out_indices, which they might have done in their own code, this won't be correctly updated here.
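A minimal sketch of the failure mode, using two toy config classes. Both classes, the saved values, and the way unknown keys are stored are hypothetical and only illustrate the concern; they are not the actual `BeitConfig`.

```python
DEFAULT_INDICES = [3, 5, 7, 11]


class RenamedConfig:
    """Hypothetical config after the rename: `out_indices` is no longer a known argument."""

    def __init__(self, semantic_out_indices=None, **kwargs):
        self.semantic_out_indices = list(semantic_out_indices or DEFAULT_INDICES)
        # Unknown keys are kept as plain attributes, but the model never reads them.
        for key, value in kwargs.items():
            setattr(self, key, value)


class CompatibleConfig:
    """Hypothetical config that keeps the original argument name."""

    def __init__(self, out_indices=None, **kwargs):
        self.out_indices = list(out_indices or DEFAULT_INDICES)


# A user saved a config with non-default indices before the rename.
saved = {"out_indices": [2, 4, 6, 8]}

renamed = RenamedConfig(**saved)        # silently falls back to the default
compatible = CompatibleConfig(**saved)  # keeps the user's choice
```

In this toy setup, `renamed.semantic_out_indices` is the default `[3, 5, 7, 11]` even though the user asked for `[2, 4, 6, 8]`, which is the silent behaviour change described above.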
| @unittest.skip(reason="Swinv2 does not support feedforward chunking yet") | ||
| def test_feed_forward_chunking(self): | ||
| pass |
If this is being added then it must have supported it previously
| self.patch_size = None if use_autobackbone else patch_size | ||
| self.num_channels = None if use_autobackbone else num_channels | ||
| self.qkv_bias = None if use_autobackbone else qkv_bias | ||
| self.backbone_out_indices = None if use_autobackbone else backbone_out_indices |
Some of these, I see why they're not set if we use AutoBackbone, but I believe e.g. layer_norm_eps is still needed for other parts of the model.
| always_partition: Optional[bool] = False, | ||
| ) -> Tuple[torch.Tensor, torch.Tensor]: | ||
| if not always_partition: | ||
| self.set_shift_and_window_size(input_dimensions) |
Is removing this backwards compatible? Previously self.set_shift_and_window_size(input_dimensions) was being called by default
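One way to reason about this is a toy layer that mirrors the guarded call in the diff. `ToyWindowLayer` is hypothetical, and the rule inside `set_shift_and_window_size` (shrink the window and drop the shift when the input is smaller than the window) is an assumption modelled on Swin-style layers, not a copy of the real implementation.

```python
class ToyWindowLayer:
    """Hypothetical layer mirroring the `always_partition` guard from the diff."""

    def __init__(self, window_size=7, shift_size=3):
        self.window_size = window_size
        self.shift_size = shift_size

    def set_shift_and_window_size(self, input_dimensions):
        # Assumed rule: if the input is no larger than the window,
        # use a single unshifted window that covers the whole input.
        if min(input_dimensions) <= self.window_size:
            self.shift_size = 0
            self.window_size = min(input_dimensions)

    def forward(self, input_dimensions, always_partition=False):
        # With the default (False), the adjustment still runs, as it did before.
        if not always_partition:
            self.set_shift_and_window_size(input_dimensions)
        return self.window_size, self.shift_size


default_result = ToyWindowLayer().forward((4, 4))
forced_result = ToyWindowLayer().forward((4, 4), always_partition=True)
```

In this sketch the default path behaves as before; only callers that explicitly pass `always_partition=True` skip the adjustment.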
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread. Please note that issues that do not follow the contributing guidelines are likely to be ignored.
What does this PR do?
This PR improves the DPT model by leveraging the `AutoBackbone` API.

DPT is a depth estimation model. Recently, the MiDaS team released a new 3.1 version with various backbones (BEiT, Swinv2, etc.), which makes it an ideal use case for the `AutoBackbone` class.

This PR:
- adds the `BeitBackbone` class
- adds the `Swinv2Backbone` class
- fixes the `keep_aspect_ratio` and `ensure_multiple_of` flags of `DPTImageProcessor`, which do not work on main because they are not passed to the `resize` method.

To do:
- make sure `out_indices` are backwards compatible for BEiT
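For context on what the `keep_aspect_ratio` and `ensure_multiple_of` flags are expected to do, here is a minimal sketch of the output-size computation. The helper names, the round-to-nearest rule, and the "scale as little as possible" choice are assumptions for illustration, not the exact logic in `DPTImageProcessor`.

```python
def constrain_to_multiple_of(value, multiple):
    # Assumed rule: snap to the nearest multiple, never below one multiple.
    return max(multiple, round(value / multiple) * multiple)


def get_output_size(height, width, target_height, target_width,
                    keep_aspect_ratio=False, ensure_multiple_of=1):
    """Hypothetical helper returning the (height, width) to resize to."""
    scale_h = target_height / height
    scale_w = target_width / width

    if keep_aspect_ratio:
        # Use one scale for both axes: the one that changes the image least.
        if abs(1 - scale_w) < abs(1 - scale_h):
            scale_h = scale_w
        else:
            scale_w = scale_h

    new_height = constrain_to_multiple_of(scale_h * height, ensure_multiple_of)
    new_width = constrain_to_multiple_of(scale_w * width, ensure_multiple_of)
    return new_height, new_width


# If the flags are silently dropped, a 480x640 image is squashed to the target size.
ignored = get_output_size(480, 640, 384, 384)

# With the flags honoured, the aspect ratio survives and both sides are multiples of 32.
honoured = get_output_size(480, 640, 384, 384,
                           keep_aspect_ratio=True, ensure_multiple_of=32)
```

The difference between `ignored` and `honoured` is the kind of mismatch the bug on main would produce when the flags never reach `resize`.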