Dinov2 for depth estimation #26057

@rfan-debug

Feature request

Dinov2's original repo has an example notebook that uses a Dinov2 backbone with a DPT head for depth estimation (notebook link). If we integrate it into the transformers repo by adding a class Dinov2ForImageDepthEstimation whose forward method returns DepthEstimatorOutput, we will have a unified output interface across all depth estimation models. That would make it easy to chain this depth estimation method with other models in transformers' pipelines.
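To make the proposed interface concrete, here is a rough, dependency-free sketch of the idea: a backbone and a head wrapped in one class whose forward method returns a single output object. Everything in it is an assumption for illustration. `DepthOutputSketch` only mirrors the `loss` and `predicted_depth` field names of DepthEstimatorOutput, `Dinov2ForImageDepthEstimationSketch` is a hypothetical stand-in for the proposed class, and the toy backbone and head use nested lists in place of tensors; none of this is the actual Dinov2 or DPT code.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

Grid = List[List[float]]  # stand-in for a (height, width) tensor


@dataclass
class DepthOutputSketch:
    """Hypothetical stand-in mirroring two fields of DepthEstimatorOutput."""

    loss: Optional[float] = None
    predicted_depth: Optional[Grid] = None


class Dinov2ForImageDepthEstimationSketch:
    """Hypothetical wrapper: backbone features go into a head that predicts depth."""

    def __init__(
        self,
        backbone: Callable[[Grid], List[Grid]],
        head: Callable[[List[Grid]], Grid],
    ) -> None:
        self.backbone = backbone
        self.head = head

    def forward(self, pixel_values: Grid) -> DepthOutputSketch:
        feature_maps = self.backbone(pixel_values)  # list of feature maps
        depth = self.head(feature_maps)             # one depth value per pixel
        return DepthOutputSketch(predicted_depth=depth)


def toy_backbone(image: Grid) -> List[Grid]:
    """Toy backbone: returns the image and a doubled copy as two 'feature maps'."""
    return [image, [[2.0 * value for value in row] for row in image]]


def toy_head(feature_maps: List[Grid]) -> Grid:
    """Toy head: averages the feature maps element-wise."""
    count = len(feature_maps)
    height, width = len(feature_maps[0]), len(feature_maps[0][0])
    return [
        [sum(fm[i][j] for fm in feature_maps) / count for j in range(width)]
        for i in range(height)
    ]


def depth_range(output: DepthOutputSketch) -> Tuple[float, float]:
    """Downstream code only needs the shared output fields, not the model type."""
    values = [value for row in output.predicted_depth for value in row]
    return min(values), max(values)


model = Dinov2ForImageDepthEstimationSketch(toy_backbone, toy_head)
output = model.forward([[1.0, 2.0], [3.0, 4.0]])
print(depth_range(output))
```

The point of the sketch is that `depth_range` depends only on the output object, so any depth model that returns the same fields could be swapped in without changing downstream code.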

Motivation

This would be a valuable feature for many production use cases and research problems. One example is camera angle estimation from a 2D image, where reliable depth information is critical. In my limited test cases, depth estimation with dinov2 + DPT head was much better than with the existing DPT model.

Your contribution

I can submit a PR to add this feature if the core developers don't have the bandwidth for it. (I am relatively new to the transformers development workflow, though.)
