Conversation
run-slow: vit |
This comment contains run-slow, running the specified jobs: models: ['models/vit'] |
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update. |
run-slow: vit |
This comment contains run-slow, running the specified jobs: models: ['models/vit'] |
ArthurZucker
left a comment
🧼 very clean! The only nit is that we should not remove the hidden_states collection when it's core for the model
😢 it's so nice!
Really unbloats the code, thanks for working on this!
| """ | ||
| ) | ||
| class BitBackbone(BitPreTrainedModel, BackboneMixin): | ||
| has_attentions = False |
It is not that bad TBH! It makes explicit that there is no attention
```python
@check_model_inputs
def _forward_with_additional_outputs(
    self, pixel_values: torch.Tensor, **kwargs: Unpack[TransformersKwargs]
) -> BaseModelOutput:
    """Additional forward to capture intermediate outputs by `check_model_inputs` decorator"""
    embedding_output = self.embeddings(pixel_values)
    output = self.encoder(embedding_output)
    return output
```
Well, here is a place where it does not make sense to hide the collection of hidden_states, because it would always be set to True since you NEED them for feature maps, right?
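To make that point concrete, a tiny illustrative snippet (the tensor shapes and `out_indices` are made up, not the actual Bit/backbone code): feature maps are just a selection from the collected hidden states, so the collection can never be turned off.

```python
import torch

# hypothetical per-stage outputs of a backbone encoder
hidden_states = tuple(torch.randn(1, 8, 4, 4) for _ in range(4))
out_indices = (1, 3)  # which stages to expose as feature maps (illustrative)

# building feature_maps indexes into hidden_states, so they must always exist
feature_maps = tuple(hidden_states[i] for i in out_indices)
```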
```diff
 test_resize_embeddings = False
 test_head_masking = False
-test_torch_exportable = True
+test_torch_exportable = False  # broken by output recording refactor
```
Indeed, we can leave this as a todo, but we could also weed out that part so the decorator supports export. The piece that is not exportable should be easy to remove, to force output_hidden_states for example!
A TODO is fine, let's add # FIXME:
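For context, a minimal sketch of the export concern, under the assumption that the breakage comes from recording outputs dynamically (e.g. via hooks attached at call time) rather than from the computation itself; `TinyEncoder` is a toy stand-in, not the real model. A forward that collects hidden states unconditionally keeps the traced graph static, which `torch.export` handles fine:

```python
import torch
from torch import nn

class TinyEncoder(nn.Module):
    def __init__(self, dim: int = 8, depth: int = 2):
        super().__init__()
        self.layers = nn.ModuleList(nn.Linear(dim, dim) for _ in range(depth))

    def forward(self, x: torch.Tensor):
        # no flag, no hooks: hidden states are always collected, so the
        # exported graph has a fixed structure
        hidden_states = (x,)
        for layer in self.layers:
            x = layer(x)
            hidden_states += (x,)
        return x, hidden_states

exported = torch.export.export(TinyEncoder(), (torch.randn(1, 8),))
```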
ArthurZucker
left a comment
Thanks a lot for addressing my comments and cleaning this all up!
```python
    pixel_values, output_hidden_states=True, **kwargs
)
embedding_output = self.embeddings(pixel_values)
output: BaseModelOutput = self.encoder(embedding_output, output_hidden_states=True)
```
I am not seeing (from the diff at least) a place where this would be false! If false is never an option we need to hardcode always returning them!
yeah, the idea is always to return them from the encoder (to make feature maps), but not always in the model output, so there are 2 options (see the sketch after this list):
- `feature_maps` and `hidden_states`
- only `feature_maps` (if `output_hidden_states=False`; in this case, we still need `hidden_states` from the encoder to build `feature_maps`, but we do not put them in `ModelOutput`. Not sure if there are any benefits, but the tests are designed this way.)
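A minimal sketch of that second option, with hypothetical names (`TinyBackbone`, `out_indices`) rather than the real implementation: the encoder side always collects hidden states, and `output_hidden_states` only gates whether they surface in the returned output.

```python
import torch
from torch import nn

class TinyBackbone(nn.Module):
    def __init__(self, dim: int = 8, depth: int = 3, out_indices=(1, 3)):
        super().__init__()
        self.stages = nn.ModuleList(nn.Linear(dim, dim) for _ in range(depth))
        self.out_indices = out_indices

    def forward(self, x: torch.Tensor, output_hidden_states: bool = False):
        # hidden states are always collected: feature_maps are built from them
        hidden_states = (x,)
        for stage in self.stages:
            x = stage(x)
            hidden_states += (x,)
        feature_maps = tuple(hidden_states[i] for i in self.out_indices)
        return {
            "feature_maps": feature_maps,
            # option 1 would always expose them; option 2 gates on the flag
            "hidden_states": hidden_states if output_hidden_states else None,
        }

backbone = TinyBackbone()
out = backbone(torch.randn(1, 8))                             # feature_maps only
out = backbone(torch.randn(1, 8), output_hidden_states=True)  # hidden_states too
```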
[For maintainers] Suggested jobs to run (before merge) run-slow: audio_spectrogram_transformer, bit, convnext, convnextv2, deepseek_vl_hybrid, deit, depth_anything, depth_pro, dinov2, dinov2_with_registers, dpt, eomt, focalnet, got_ocr2, hgnet_v2, ijepa |
What does this PR do?
Refactor ViT and dependent models to use the `@check_model_inputs` and `@can_return_tuple` decorators to remove all the logic for intermediate `hidden_states` and `attentions` capture.
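To illustrate the kind of boilerplate this removes, here is a hedged sketch of the old accumulation pattern; `OldStyleEncoder` is a toy, not the actual ViT code, and the decorators' hook mechanics are simplified away. Previously every encoder forward threaded an `output_hidden_states` flag and grew tuples layer by layer; with the decorators, the forward shrinks to the bare computation and intermediate outputs are recorded externally.

```python
import torch
from torch import nn

class OldStyleEncoder(nn.Module):
    """Toy version of the pre-refactor pattern: manual tuple accumulation."""

    def __init__(self, dim: int = 8, depth: int = 2):
        super().__init__()
        self.layers = nn.ModuleList(nn.Linear(dim, dim) for _ in range(depth))

    def forward(self, hidden_states: torch.Tensor, output_hidden_states: bool = False):
        # flag-threading and conditional accumulation at every layer is the
        # bloat the decorator-based recording makes unnecessary
        all_hidden_states = () if output_hidden_states else None
        for layer in self.layers:
            if output_hidden_states:
                all_hidden_states = all_hidden_states + (hidden_states,)
            hidden_states = layer(hidden_states)
        if output_hidden_states:
            all_hidden_states = all_hidden_states + (hidden_states,)
        return hidden_states, all_hidden_states
```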