Faster implementation of torchvision.ops.boxes::masks_to_boxes #8184

@atharvas

🚀 The feature

torchvision.ops.boxes::masks_to_boxes converts a batch of binary 2D image masks to a set of bounding boxes. This proposal is for a faster and more general version of this function.

Motivation, pitch

The current implementation of masks_to_boxes uses a Python for loop over the batch dimension to compute the bounding box of each mask individually. The number of sequential iterations therefore grows as $O(B)$ with the batch size $B$, which makes the function unsuitable for large batches. Instead, we can rewrite the function using the built-in vectorized operations to get the same result with a fixed number of tensor operations, i.e. close to $O(1)$ sequential steps regardless of $B$.
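To make the idea concrete, here is a minimal sketch of what a loop-free version could look like. It is not necessarily the code in the pull request: the function name `masks_to_boxes_vectorized` is made up for illustration, and it assumes masks of shape `(N, H, W)` and the same `(x1, y1, x2, y2)` output layout as the existing function.

```python
import torch


def masks_to_boxes_vectorized(masks: torch.Tensor) -> torch.Tensor:
    """Hypothetical loop-free sketch: (N, H, W) masks -> (N, 4) boxes.

    Boxes are (x1, y1, x2, y2). Assumption: every mask has at least one
    non-zero pixel; an all-zero mask yields the sentinel box (W, H, -1, -1).
    """
    if masks.numel() == 0:
        return torch.zeros((0, 4), device=masks.device, dtype=torch.float)

    n, h, w = masks.shape
    fg = masks != 0
    rows = fg.any(dim=2)  # (N, H): True where a row contains foreground
    cols = fg.any(dim=1)  # (N, W): True where a column contains foreground

    ys = torch.arange(h, device=masks.device).expand(n, h)
    xs = torch.arange(w, device=masks.device).expand(n, w)

    # Replace indices of empty rows/columns with sentinels so that the
    # min/max reductions only see occupied rows/columns.
    y1 = ys.masked_fill(~rows, h).amin(dim=1)
    y2 = ys.masked_fill(~rows, -1).amax(dim=1)
    x1 = xs.masked_fill(~cols, w).amin(dim=1)
    x2 = xs.masked_fill(~cols, -1).amax(dim=1)

    return torch.stack([x1, y1, x2, y2], dim=1).float()


# Small usage example with two masks.
masks = torch.zeros(2, 5, 6, dtype=torch.uint8)
masks[0, 1:3, 2:5] = 1  # rows 1..2, columns 2..4
masks[1, 4, 0] = 1      # single pixel at row 4, column 0
boxes = masks_to_boxes_vectorized(masks)
```

Every step here is a whole-batch tensor operation, so the Python-level work does not depend on the batch size. The total arithmetic is still proportional to the number of pixels.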

Some preliminary performance benchmarking seems to support this hypothesis.

[Image: masks_to_boxes]

The proposed version is also more general, as it can be extended to 3D masks easily.
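As a rough illustration of how such a generalization might work, the sketch below handles any number of spatial dimensions by reducing each spatial axis separately. The function name `masks_to_boxes_nd` and the output layout (all minima in axis order, then all maxima) are assumptions made for this example. They are not an existing torchvision API, and the layout differs from the `(x1, y1, x2, y2)` convention of the 2D function.

```python
import torch


def masks_to_boxes_nd(masks: torch.Tensor) -> torch.Tensor:
    """Hypothetical sketch: (N, D1, ..., Dk) masks -> (N, 2k) boxes.

    Output layout (an assumption): [min_1, ..., min_k, max_1, ..., max_k],
    following the axis order of the input tensor.
    """
    n = masks.shape[0]
    k = masks.dim() - 1
    if masks.numel() == 0:
        return torch.zeros((0, 2 * k), device=masks.device, dtype=torch.float)

    fg = masks != 0
    lows, highs = [], []
    # The loop runs over the k spatial axes (2 or 3), not over the batch.
    for axis in range(1, masks.dim()):
        size = masks.shape[axis]
        # (N, size): True where this index along `axis` holds foreground.
        occupied = fg.movedim(axis, 1).reshape(n, size, -1).any(dim=2)
        idx = torch.arange(size, device=masks.device).expand(n, size)
        lows.append(idx.masked_fill(~occupied, size).amin(dim=1))
        highs.append(idx.masked_fill(~occupied, -1).amax(dim=1))

    return torch.stack(lows + highs, dim=1).float()


# Usage example with one 3D mask of shape (depth=4, height=5, width=6).
volume = torch.zeros(1, 4, 5, 6, dtype=torch.uint8)
volume[0, 1:3, 2:4, 0:6] = 1  # z in 1..2, y in 2..3, x in 0..5
boxes_3d = masks_to_boxes_nd(volume)
```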

Alternatives

N/A

Additional context

I've already implemented this feature and have a pull request ready. However, I wish to understand if the core contributors believe this is a worthwhile feature or not.
