Fail loading pretrained weights for Dinov2ForImageClassification model

### System Info

Reproduced on both:
- transformers 4.32.0
- transformers 4.34.0.dev0


### Who can help?

_No response_

### Information

- [X] The official example scripts
- [ ] My own modified scripts

### Tasks

- [X] An officially supported task in the `examples` folder (such as GLUE/SQuAD, ...)
- [ ] My own task or dataset (give details below)

### Reproduction

I've encountered a bug with the `Dinov2ForImageClassification` model from Hugging Face Transformers. I followed the code example in the documentation [here](https://huggingface.co/docs/transformers/main/model_doc/dinov2#transformers.Dinov2ForImageClassification) using the latest Transformers version. However, when I run the code, the output indicates that the model is performing binary classification instead of the expected ImageNet 1000-way classification.

Here's my code:
```
from transformers import AutoImageProcessor, Dinov2ForImageClassification
import torch
from datasets import load_dataset

# Load a sample image dataset (in this case, 'huggingface/cats-image')
dataset = load_dataset('huggingface/cats-image')
image = dataset['test']['image'][0]

# Load the image processor and the Dinov2ForImageClassification model
image_processor = AutoImageProcessor.from_pretrained('facebook/dinov2-base')
model = Dinov2ForImageClassification.from_pretrained('facebook/dinov2-base')

# Prepare the input and obtain logits
inputs = image_processor(image, return_tensors='pt')
with torch.no_grad():
    logits = model(**inputs).logits

# The expected number of labels for ImageNet classification should be 1000
predicted_label = logits.argmax(-1).item()
```

Even when I specify `num_labels=1000` during model initialization to correct the label dimension, the following warning persists:
```
Some weights of Dinov2ForImageClassification were not initialized from the model checkpoint at facebook/dinov2-base and are newly initialized: ['classifier.bias', 'classifier.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
```

As a result, I'm unable to use the pretrained `Dinov2ForImageClassification` model for ImageNet 1000-way classification as intended.

### Expected behavior

The model should load without warnings and output a 1000-dimensional logit vector whose entries correspond to the ImageNet class labels.

See more here:
https://discuss.huggingface.co/t/dino2-for-classification-has-wrong-number-of-labels/55027

Fail loading pretrained weights for Dinov2ForImageClassification model #26167

System Info

Who can help?

Information

Tasks

Reproduction

Expected behavior

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Fail loading pretrained weights for Dinov2ForImageClassification model #26167

Description

System Info

Who can help?

Information

Tasks

Reproduction

Expected behavior

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions