[Feature Request] Datasets Should Use New torchvision.io Image Loader APIs and Return TVTensor Images by Default #8762

@fang-d

🚀 The feature

  1. Add a "torchvision" image loader backend built on the new torchvision.io APIs (see: Release Notes v0.20) and make it the default.
  2. Have VisionDataset subclasses return TVTensor images by default instead of PIL.Image.

Motivation, pitch

  1. TorchVision v0.20 introduces new torchvision.io APIs that improve its image encoding and decoding capabilities.
  2. Datasets currently return PIL.Image by default, yet the first step of a transform pipeline is usually transforms.v2.ToImage(), which converts the image to a TVTensor anyway.
  3. PIL is slow (see: Pillow-SIMD), especially compared with the new torchvision.io APIs.
  4. The current TorchVision image loader backends are PIL and accimage; neither uses the new torchvision.io APIs.

Alternatives

  1. Make the return type depend on the backend: PIL.Image with the PIL or accimage backend, and TVTensor with the new APIs. This may cost consistency across backends.

Additional context

I would be happy to open a pull request if the community likes this feature.
