SwinTransformer as encoder and Bart as decoder #15526

@RishabhMaheshwary

Environment info

  • transformers version: 4.16.2
  • Platform: Linux-5.4.0-81-generic-x86_64-with-glibc2.10
  • Python version: 3.8.8
  • PyTorch version (GPU?): 1.10.2+cu102 (True)
  • Tensorflow version (GPU?): not installed (NA)
  • Flax version (CPU?/GPU?/TPU?): not installed (NA)
  • Jax version: not installed
  • JaxLib version: not installed
  • Using GPU in script?: yes
  • Using distributed or parallel set-up in script?: yes

Who can help

@NielsRogge

Information

I want to build an encoder-decoder model with SwinTransformer as the encoder and bart-large as the decoder.
Calling VisionEncoderDecoderModel.from_encoder_decoder_pretrained("microsoft/swin-base-patch4-window12-384", "facebook/bart-large") fails with the following error:

  File "train_bart.py", line 158, in <module>
    model = VisionEncoderDecoderModel.from_encoder_decoder_pretrained("microsoft/swin-base-patch4-window12-384", "facebook/bart-large")
  File "/private/home/rbh/anaconda3/lib/python3.8/site-packages/transformers/models/vision_encoder_decoder/modeling_vision_encoder_decoder.py", line 399, in from_encoder_decoder_pretrained
    return cls(encoder=encoder, decoder=decoder, config=config)
  File "/private/home/rbh/anaconda3/lib/python3.8/site-packages/transformers/models/vision_encoder_decoder/modeling_vision_encoder_decoder.py", line 213, in __init__
    self.encoder.config.hidden_size != self.decoder.config.hidden_size
  File "/private/home/rbh/anaconda3/lib/python3.8/site-packages/transformers/configuration_utils.py", line 250, in __getattribute__
    return super().__getattribute__(key)
AttributeError: 'SwinConfig' object has no attribute 'hidden_size'

Can a SwinTransformer encoder with a Bart decoder be initialized using VisionEncoderDecoderModel, or do I need to write my own model with SwinTransformer as the encoder and Bart as the decoder?
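
For context on the traceback: the failing line compares encoder.config.hidden_size with decoder.config.hidden_size, and SwinConfig in this version does not expose that attribute. In the Swin architecture the channel width doubles at each patch-merging step, so the width of the last stage can be derived from embed_dim and depths. The snippet below is a minimal sketch of that arithmetic only. The helper names are hypothetical, the config is a plain stand-in object rather than a real SwinConfig, and the numeric values (embed_dim=128, depths=[2, 2, 18, 2], and a Bart width of 1024) are assumptions about the two checkpoints named above. Whether attaching such an attribute to the real config is enough to make VisionEncoderDecoderModel work on 4.16.2 has not been verified here.

```python
from types import SimpleNamespace


def swin_final_hidden_size(embed_dim, depths):
    """Channel width of the last Swin stage.

    Patch merging doubles the channel count between consecutive stages,
    so with len(depths) stages the width doubles len(depths) - 1 times.
    """
    return int(embed_dim * 2 ** (len(depths) - 1))


def ensure_hidden_size(config):
    """Hypothetical helper: add `hidden_size` to a config that lacks it."""
    if not hasattr(config, "hidden_size"):
        config.hidden_size = swin_final_hidden_size(config.embed_dim, config.depths)
    return config


# Stand-in for the Swin config; values are assumed for the swin-base checkpoint.
swin_like = SimpleNamespace(embed_dim=128, depths=[2, 2, 18, 2])
ensure_hidden_size(swin_like)

# Assumed hidden width (d_model) of facebook/bart-large.
bart_large_d_model = 1024

# Print the derived encoder width and whether it matches the decoder width.
print(swin_like.hidden_size, swin_like.hidden_size == bart_large_d_model)
```

If the two widths differ for another pair of checkpoints, the comparison in the traceback suggests the model handles the mismatch itself, so the derived value would only need to be present, not equal.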
