[BUG] Swin Transformer relative_position_index not initialized when loaded via transformers TimmBackbone #2661

@CharlesCNorton

Describe the bug

In timm 1.0.24, the relative_position_index buffer in WindowAttention is not initialized when the model is loaded via HuggingFace transformers' TimmBackbone. This causes IndexError during inference.

To Reproduce

from transformers import AutoModelForZeroShotObjectDetection

model = AutoModelForZeroShotObjectDetection.from_pretrained('omlab/omdet-turbo-swin-tiny-hf')
# Reach through the TimmBackbone wrapper to the underlying timm model.
backbone = model.vision_backbone.vision_backbone._backbone

# Print the value range of the first relative_position_index buffer found.
for name, module in backbone.named_modules():
    if hasattr(module, 'relative_position_index'):
        idx = module.relative_position_index
        print(f'min: {idx.min().item()}, max: {idx.max().item()}')
        break

timm 1.0.22: min: 0, max: 168
timm 1.0.24: min: 0, max: 2717338685440 ✗ (uninitialized garbage)

Running inference produces:

IndexError: index 1841 is out of bounds for dimension 0 with size 169
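
For reference, the expected range of 0 to 168 can be reproduced without timm. The sketch below assumes the standard Swin relative position formulation and a 7x7 window (inferred from the table size of 169 in the error); relative_position_index here is a hypothetical pure-Python helper, not timm's implementation.

```python
from itertools import product


def relative_position_index(win_h, win_w):
    """Return the (N, N) index table for a win_h x win_w window, N = win_h * win_w."""
    # (row, col) of every token in the window, in row-major order.
    coords = list(product(range(win_h), range(win_w)))
    index = []
    for r1, c1 in coords:
        row = []
        for r2, c2 in coords:
            # Shift each offset so it starts at 0 instead of -(size - 1).
            dr = r1 - r2 + (win_h - 1)
            dc = c1 - c2 + (win_w - 1)
            # Flatten the 2D offset into one index into the bias table.
            row.append(dr * (2 * win_w - 1) + dc)
        index.append(row)
    return index


index = relative_position_index(7, 7)
flat = [v for row in index for v in row]
table_size = (2 * 7 - 1) * (2 * 7 - 1)
print(min(flat), max(flat), table_size)  # 0 168 169
```

Any value outside 0..168, such as the 1841 in the traceback, cannot come from this computation and points to a buffer that was never filled.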

Root Cause

PR #2632 changed WindowAttention to register an empty buffer:

self.register_buffer("relative_position_index", torch.empty(...), persistent=False)

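The buffer is populated only when reset_parameters() is called, and that call is gated on not self.relative_position_bias_table.is_meta. When transformers loads the model via TimmBackbone, this condition doesn't trigger, so the buffer keeps whatever memory torch.empty() returned. Because the buffer is registered with persistent=False, it is not part of the state dict either, so loading the checkpoint weights does not overwrite it.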
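
The interaction between torch.empty() and persistent=False can be shown in isolation. This is a minimal sketch with a hypothetical ToyAttention module that only mirrors the pattern described above; it is not timm's WindowAttention, and the buffer and parameter names are made up for illustration.

```python
import torch
import torch.nn as nn


class ToyAttention(nn.Module):
    """Hypothetical stand-in that mimics the empty non-persistent buffer pattern."""

    def __init__(self, n=4):
        super().__init__()
        self.table = nn.Parameter(torch.zeros(n))
        # torch.empty allocates memory without writing to it.
        self.register_buffer("index", torch.empty(n, dtype=torch.long), persistent=False)

    def init_non_persistent_buffers(self):
        # Fill the buffer with its real values.
        self.index.copy_(torch.arange(self.index.numel()))


src = ToyAttention()
src.init_non_persistent_buffers()
state = src.state_dict()
print("index" in state)  # False: non-persistent buffers are not saved

dst = ToyAttention()
result = dst.load_state_dict(state)
print(result.missing_keys)  # []: nothing flags the unfilled buffer

# dst.index is still whatever torch.empty returned until this runs.
dst.init_non_persistent_buffers()
print(dst.index.tolist())  # [0, 1, 2, 3]
```

The point of the sketch is that load_state_dict() succeeds silently: the checkpoint never carried the buffer, so no missing-key warning appears and the only thing that can fill it is explicit initialization code.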

Workaround

Manually calling init_non_persistent_buffers() after loading fixes the issue:

for name, module in backbone.named_modules():
    if hasattr(module, 'init_non_persistent_buffers'):
        module.init_non_persistent_buffers()

After this call, relative_position_index has correct values (max=168) and inference succeeds.
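
To confirm that the workaround took effect on every block, a small check along these lines may help. It is a sketch only: find_bad_index_buffers is a hypothetical helper, and it assumes that relative_position_bias_table is indexed along dimension 0 by relative_position_index, as the IndexError message suggests.

```python
import torch
import torch.nn as nn


def find_bad_index_buffers(model: nn.Module):
    """Return names of modules whose index buffer points outside the bias table."""
    bad = []
    for name, module in model.named_modules():
        idx = getattr(module, "relative_position_index", None)
        table = getattr(module, "relative_position_bias_table", None)
        if idx is None or table is None:
            continue  # not a window attention style module
        size = table.shape[0]
        # Valid indices lie in [0, size - 1].
        if idx.numel() and (idx.min().item() < 0 or idx.max().item() >= size):
            bad.append(name)
    return bad
```

Called as find_bad_index_buffers(backbone), it should return an empty list once the buffers have been initialized, and a list of module names otherwise.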

Expected behavior

relative_position_index should be properly initialized regardless of how the model is loaded.

Environment

  • timm: 1.0.24
  • transformers: 4.57.6
  • torch: 2.10.0
  • Python: 3.11
