Skip to content

[DPT] Add MiDaS 3.1 series#25799

Closed
NielsRogge wants to merge 28 commits intohuggingface:mainfrom
NielsRogge:improve_dpt
Closed

[DPT] Add MiDaS 3.1 series#25799
NielsRogge wants to merge 28 commits intohuggingface:mainfrom
NielsRogge:improve_dpt

Conversation

@NielsRogge
Copy link
Contributor

@NielsRogge NielsRogge commented Aug 28, 2023

What does this PR do?

This PR improves the DPT model by leveraging the AutoBackbone API.

DPT is a depth estimation model. Recently, the MiDaS team released a new 3.1 version with various backbones: BEiT, Swinv2, etc. hence it's an ideal use case for the AutoBackbone class.

This PR:

  • adds the BeitBackbone class
  • adds the Swinv2Backbone class
  • extends modeling_dpt.py to leverage the AutoBackbone API
  • fixes the keep_aspect_ratio and ensure_multiple_of flags of DPTImageProcessor, which does not work on main due to them not being passed to the resize method.

To do:

  • make sure out_indices are backwards compatible for BEiT

If `do_resize` is `True`, the image is resized to a size that is a multiple of this value. Can be overidden
by `ensure_multiple_of` in `preprocess`.
resample (`PILImageResampling`, *optional*, defaults to `PILImageResampling.BILINEAR`):
resample (`PILImageResampling`, *optional*, defaults to `PILImageResampling.BICUBIC`):
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a (slight) breaking change to make sure the same interpolation method is used as in the original implementation. However, Pillow's BICUBIC method does not 100% match the one of OpenCV :/ cc @amyeroberts

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's never 1:1 correspondence 😢

This is OK as saved models will have the resampling filter saved in the preprocessor config, and as you say it brings it in line with the original

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint.

@NielsRogge NielsRogge mentioned this pull request Sep 4, 2023
2 tasks
@NielsRogge
Copy link
Contributor Author

I've split up the PR in smaller pieces, see above for the first one

Copy link
Contributor

@amyeroberts amyeroberts left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the work adding this!

Did a high-level review as the PR isn't in a finished state yet. At the moment, the changes to the Beit config and data2vec model can't be merged in as they're breaking. Beit is also one of our most popular models, so it's important for us to get this right.

cc @rafaelpadilla

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be removed

drop_path_rate=0.1,
use_mean_pooling=True,
out_indices=[3, 5, 7, 11],
semantic_out_indices=[3, 5, 7, 11],
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can't make this change because of backwards compatibility. If someone loads in their config and they don't have the default value for out_indices then their model's behaviour will have changed. Moreover, if they try to set or change out_indices, which they might have done in their own code, this won't be correctly updated here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To delete

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To delete

Comment on lines +439 to +441
@unittest.skip(reason="Swinv2 does not support feedforward chunking yet")
def test_feed_forward_chunking(self):
pass
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this is being added then it must have supported it previously

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To delete

If `do_resize` is `True`, the image is resized to a size that is a multiple of this value. Can be overidden
by `ensure_multiple_of` in `preprocess`.
resample (`PILImageResampling`, *optional*, defaults to `PILImageResampling.BILINEAR`):
resample (`PILImageResampling`, *optional*, defaults to `PILImageResampling.BICUBIC`):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's never 1:1 correspondence 😢

This is OK as saved models will have the resampling filter saved in the preprocessor config, and as you say it brings it in line with the original

self.patch_size = None if use_autobackbone else patch_size
self.num_channels = None if use_autobackbone else num_channels
self.qkv_bias = None if use_autobackbone else qkv_bias
self.backbone_out_indices = None if use_autobackbone else backbone_out_indices
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some of these, I see why they're not set if we use AutoBackbone, but I believe e.g. layer_norm_eps is still needed for other parts of the model.

always_partition: Optional[bool] = False,
) -> Tuple[torch.Tensor, torch.Tensor]:
if not always_partition:
self.set_shift_and_window_size(input_dimensions)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is removing this backwards compatible? Previously self.set_shift_and_window_size(input_dimensions) was being called by default

@github-actions
Copy link
Contributor

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

@github-actions github-actions bot closed this Oct 20, 2023
@NielsRogge NielsRogge reopened this Oct 20, 2023
@github-actions github-actions bot closed this Oct 29, 2023
@NielsRogge NielsRogge reopened this Oct 30, 2023
@github-actions github-actions bot closed this Nov 8, 2023
@NielsRogge NielsRogge reopened this Nov 13, 2023
@github-actions github-actions bot closed this Nov 22, 2023
@NielsRogge NielsRogge mentioned this pull request Nov 28, 2023
3 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants