Faster implementation of torchvision.ops.boxes::masks_to_boxes by atharvas · Pull Request #8194 · pytorch/vision

atharvas · 2024-01-03T07:35:48Z

Quoting #8184

🚀 The feature

torchvision.ops.boxes::masks_to_boxes converts a batch of binary 2D image masks to a set of bounding boxes. Essentially, for masks of shape $(B, 64, 64)$, masks_to_boxes allocates a tensor of shape $(B, 4)$ and computes the bounding box for each element of the batch sequentially. This takes $O(B)$ storage and $O(B)$ sequential steps, and it is the simplest possible implementation.
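As a point of reference, the sequential approach described above can be sketched roughly as follows. This is a minimal sketch and not the torchvision source: the name `masks_to_boxes_loop` is hypothetical, the boxes are assumed to be in (x1, y1, x2, y2) order, and every mask is assumed to contain at least one nonzero pixel.

```python
import torch


def masks_to_boxes_loop(masks: torch.Tensor) -> torch.Tensor:
    """Hypothetical per-mask loop: one (x1, y1, x2, y2) box per mask."""
    n = masks.shape[0]
    boxes = torch.zeros((n, 4), dtype=torch.float, device=masks.device)
    for i, mask in enumerate(masks):
        # Row (y) and column (x) indices of every nonzero pixel in this mask.
        ys, xs = torch.where(mask != 0)
        boxes[i, 0] = xs.min()
        boxes[i, 1] = ys.min()
        boxes[i, 2] = xs.max()
        boxes[i, 3] = ys.max()
    return boxes


masks = torch.zeros((2, 64, 64), dtype=torch.bool)
masks[0, 10:20, 5:30] = True  # rows 10..19, columns 5..29
masks[1, 0:3, 60:64] = True   # rows 0..2, columns 60..63
boxes = masks_to_boxes_loop(masks)
```

The Python-level loop runs once per mask, so its cost grows with the number of masks in the batch.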

This proposal pertains to creating a faster and more general version of this function.

Profiling

Some basic benchmarking supports the expectation that a faster version is possible. Profiling code is here. Profiling was done on an Apple M2 Pro.

Memory Profiling

Speed Profiling

Correctness

The behavior of the function is unchanged. There is a single test case for testing masks_to_boxes (test.test_ops:test_masks_box). The new implementation passes this test case.

$ pytest test/test_ops.py -vvv -k test_masks_box
============================================= test session starts =============================================
platform darwin -- Python 3.12.0, pytest-7.4.4, pluggy-1.3.0 -- /Users/atharvas/miniconda3/envs/torchvision/bin/python
cachedir: .pytest_cache
rootdir: /Users/atharvas/Desktop/projects/000_uncategorized/vision
configfile: pytest.ini
plugins: mock-3.12.0
collected 1440 items / 1439 deselected / 1 selected                                                           

test/test_ops.py::TestMasksToBoxes::test_masks_box PASSED                                               [100%]

============================================== warnings summary ===============================================
../../../../miniconda3/envs/torchvision/lib/python3.12/site-packages/torch/jit/frontend.py:1209
  /Users/atharvas/miniconda3/envs/torchvision/lib/python3.12/site-packages/torch/jit/frontend.py:1209: DeprecationWarning: ast.Str is deprecated and will be removed in Python 3.14; use ast.Constant instead
    elif isinstance(value, ast.Str):

../../../../miniconda3/envs/torchvision/lib/python3.12/site-packages/torch/jit/frontend.py:1210
  /Users/atharvas/miniconda3/envs/torchvision/lib/python3.12/site-packages/torch/jit/frontend.py:1210: DeprecationWarning: Attribute s is deprecated and will be removed in Python 3.14; use value instead
    s += value.s

test/test_ops.py::TestMasksToBoxes::test_masks_box
  /Users/atharvas/miniconda3/envs/torchvision/lib/python3.12/site-packages/_pytest/python.py:194: ResourceWarning: unclosed file <_io.BufferedReader name='/Users/atharvas/Desktop/projects/000_uncategorized/vision/test/assets/masks.tiff'>
    result = testfunction(**testargs)
  Enable tracemalloc to get traceback where the object was allocated.
  See https://docs.pytest.org/en/stable/how-to/capture-warnings.html#resource-warnings for more info.

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
=============================== 1 passed, 1439 deselected, 3 warnings in 0.86s ================================

pytorch-bot · 2024-01-03T07:35:52Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/vision/8194

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit dcc6e2e with merge base 6acedba:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

facebook-github-bot · 2024-01-03T07:35:54Z

Hi @atharvas!

Thank you for your pull request and welcome to our community.

Action Required

In order to merge any pull request (code, docs, etc.), we require contributors to sign our Contributor License Agreement, and we don't seem to have one on file for you.

Process

In order for us to review and merge your suggested changes, please sign at https://code.facebook.com/cla. If you are contributing on behalf of someone else (eg your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA.

Once the CLA is signed, our tooling will perform checks and validations. Afterwards, the pull request will be tagged with CLA signed. The tagging process may take up to 1 hour after signing. Please give it that time before contacting us about it.

If you have received this in error or have any questions, please contact us at cla@meta.com. Thanks!

facebook-github-bot · 2024-01-03T09:08:57Z

Thank you for signing our Contributor License Agreement. We can now accept your code for this (and any) Meta Open Source project. Thanks!

NicolasHug · 2024-01-15T14:47:51Z

Thanks for the PR @atharvas

The test failure in https://github.com/pytorch/vision/actions/runs/7394640746/job/20496609120?pr=8194 seems relevant, as SimpleCopyPaste internally relies on masks_to_boxes.

But beyond the correctness issue, I am wondering whether such a change would bring a critical improvement. Saving 1MB isn't really a problem, and the time performance benefits are unclear when batch size is reasonable (i.e. < 1024?). Considering the proposed code is a lot more complex than the previous one, we should only be comfortable merging this PR if it could remove a known bottleneck. Did you find that masks_to_boxes performance was problematic in your own training runs?

The X and Y dimensions were flipped.

atharvas · 2024-01-15T16:54:59Z

Hi! Thanks for the update. I just fixed the error after reproducing it on a Linux box. Sorry about that!

(torchvision) asehgal@ubuntu:~/vision$ pytest test/test_ops.py -vvv -k test_masks_box
================================================== test session starts ==================================================
platform linux -- Python 3.12.0, pytest-7.4.0, pluggy-1.0.0 -- /home/asehgal/env/miniconda3/envs/torchvision/bin/python
cachedir: .pytest_cache
rootdir: /home/asehgal/vision
configfile: pytest.ini
plugins: mock-3.10.0
collected 1440 items / 1439 deselected / 1 selected                                                                     

test/test_ops.py::TestMasksToBoxes::test_masks_box PASSED                                                         [100%]

=================================================== warnings summary ====================================================
../env/miniconda3/envs/torchvision/lib/python3.12/site-packages/torch/jit/frontend.py:1209
  /home/asehgal/env/miniconda3/envs/torchvision/lib/python3.12/site-packages/torch/jit/frontend.py:1209: DeprecationWarning: ast.Str is deprecated and will be removed in Python 3.14; use ast.Constant instead
    elif isinstance(value, ast.Str):

../env/miniconda3/envs/torchvision/lib/python3.12/site-packages/torch/jit/frontend.py:1210
  /home/asehgal/env/miniconda3/envs/torchvision/lib/python3.12/site-packages/torch/jit/frontend.py:1210: DeprecationWarning: Attribute s is deprecated and will be removed in Python 3.14; use value instead
    s += value.s

test/test_ops.py::TestMasksToBoxes::test_masks_box
  /home/asehgal/env/miniconda3/envs/torchvision/lib/python3.12/site-packages/_pytest/python.py:194: ResourceWarning: unclosed file <_io.BufferedReader name='/home/asehgal/vision/test/assets/masks.tiff'>
    result = testfunction(**testargs)
  Enable tracemalloc to get traceback where the object was allocated.
  See https://docs.pytest.org/en/stable/how-to/capture-warnings.html#resource-warnings for more info.

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
==================================== 1 passed, 1439 deselected, 3 warnings in 1.76s =====================================
(torchvision) asehgal@ubuntu:~/vision$ pytest -s test/test_prototype_transforms.py
================================================== test session starts ==================================================
platform linux -- Python 3.12.0, pytest-7.4.0, pluggy-1.0.0
rootdir: /home/asehgal/vision
configfile: pytest.ini
plugins: mock-3.10.0
collected 24 items                                                                                                      

test/test_prototype_transforms.py ........................

================================================== 24 passed in 2.32s ===================================================
(torchvision) asehgal@ubuntu:~/vision$

About whether this is a critical improvement

For my use case, yes. I'm not sure about the community at large, though. My codebase is private right now, but here's an example of this bottleneck "in the wild":

This function processes a batch of B videos, each of trajectory length T and containing N slots (slots = object masks, for what it's worth). Notice that the batch size is changed to B*T*N.

Inside this function, vops.masks_to_boxes is called with the expectation that it returns quickly.

However, because the effective batch size is actually B*T*N (64*64*7 = 28672 in validation and 64*10*7 = 4480 per GPU in training), the operation becomes a subtle bottleneck. In my codebase, I need to call masks_to_boxes in every iteration of the training loop, which further increases the runtime.
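The linked function is not reproduced in this thread, so the following is only a sketch of how such a batch size arises, assuming the masks are stored as a (B, T, N, H, W) tensor. The small sizes and the placeholder box tensor are illustrative, not taken from the referenced code.

```python
import torch

# Small stand-in sizes; the comment above reports B=64, T=64 or 10, and N=7.
B, T, N, H, W = 2, 3, 7, 64, 64
masks = torch.zeros((B, T, N, H, W), dtype=torch.bool)

# Collapse the three leading dimensions so each slot mask becomes one batch entry.
flat_masks = masks.flatten(0, 2)  # shape (B*T*N, H, W)

# Effective batch sizes quoted in the comment.
val_batch = 64 * 64 * 7    # validation
train_batch = 64 * 10 * 7  # training, per GPU

# Placeholder for the (B*T*N, 4) output of a masks_to_boxes call,
# reshaped back to one box per (video, frame, slot).
flat_boxes = torch.zeros((flat_masks.shape[0], 4))
boxes = flat_boxes.view(B, T, N, 4)
```

With a per-mask loop, every one of the B*T*N entries costs a separate iteration, which is why the flattened batch size matters more than B alone.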

NicolasHug · 2024-01-25T11:01:17Z

Thank you for the details @atharvas. That SGTM. Before merging, do you mind running the benchmark on CUDA as well to make sure there's no slow-down on GPUs? If there is, we could just have 2 paths (one for CPU, one for GPU).

atharvas · 2024-01-29T18:45:28Z

Hi!
Here are the results running on a GPU [updated script here]. This is an NVIDIA RTX 2080 Ti. I increased the max batch size to $2^{15}$.

Speed Comparison:

Peak Memory Comparison:

The second graph is interesting! It looks like the new implementation uses more VRAM (in line with @pmeier's comment in #8184). I analyzed the GPU allocations using torch.cuda.memory_snapshot, and each intermediate call to torch.any and torch.argmax allocates a $(B, 64)$-sized and a $(B, 1)$-sized tensor on the GPU. The memory is reclaimed as soon as the function exits. Hence, the peak memory is higher for the new implementation, while the "active allocated" memory is the same as for the original function.
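To make the intermediate allocations concrete, here is a minimal sketch of a batched variant built from torch.any and torch.argmax. It is not the code from this PR: the name `masks_to_boxes_vectorized` is hypothetical, and it assumes (N, H, W) masks that each contain at least one nonzero pixel. It relies on the documented behavior that torch.argmax returns the index of the first maximal value.

```python
import torch


def masks_to_boxes_vectorized(masks: torch.Tensor) -> torch.Tensor:
    """Hypothetical batched variant; assumes every mask has a nonzero pixel."""
    _, h, w = masks.shape
    nonzero = masks != 0
    # Intermediates of shape (N, W) and (N, H): which columns / rows are occupied.
    cols = nonzero.any(dim=1).float()
    rows = nonzero.any(dim=2).float()
    # First occupied column / row.
    x1 = cols.argmax(dim=1)
    y1 = rows.argmax(dim=1)
    # Last occupied column / row, found as the first one in the reversed axis.
    x2 = (w - 1) - cols.flip(dims=[1]).argmax(dim=1)
    y2 = (h - 1) - rows.flip(dims=[1]).argmax(dim=1)
    return torch.stack([x1, y1, x2, y2], dim=1).to(torch.float)


masks = torch.zeros((2, 64, 64), dtype=torch.bool)
masks[0, 10:20, 5:30] = True  # rows 10..19, columns 5..29
masks[1, 0:3, 60:64] = True   # rows 0..2, columns 60..63
boxes = masks_to_boxes_vectorized(masks)
```

The two `any` reductions and their flipped copies are temporaries whose size grows with the batch, which is consistent with a higher peak but unchanged steady-state memory.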

@NicolasHug Do you think this warrants a separate pathway for CPU and GPU?

rehno-lindeque · 2024-08-30T21:10:31Z

If I'm not mistaken, this new implementation will currently return a box covering the entire image in the special case where a mask is completely empty:

E.g.

>>> x1 = torch.tensor([[False,False,False]]).float().argmax(dim=1)
>>> x2 = (3 - 1) - torch.tensor([[False,False,False]]).float().flip(dims=[1]).argmax(dim=1)
>>> x1, x2
(tensor([0]), tensor([2]))

The current implementation instead raises this exception in this case:

RuntimeError: min(): Expected reduction dim to be specified for input.numel() == 0. Specify the reduction dim with the 'dim' argument.

(However, ideally I'd much prefer to get a degenerate box [0, 0, 0, 0], or [inf, inf, -inf, -inf], or [nan, nan, nan, nan] or anything like that which I can test for.)
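One way to get a testable degenerate box, as requested above, is to overwrite the boxes of empty masks after the fact. This is a minimal sketch, assuming [0, 0, 0, 0] is the chosen sentinel. The helper name `zero_out_empty_boxes` is hypothetical and is not part of torchvision or this PR.

```python
import torch


def zero_out_empty_boxes(masks: torch.Tensor, boxes: torch.Tensor) -> torch.Tensor:
    """Hypothetical helper: replace the box of every all-zero mask with [0, 0, 0, 0]."""
    # (N,) flag that is True when a mask has at least one nonzero pixel.
    has_pixels = (masks != 0).flatten(1).any(dim=1)
    # Broadcast the flag over the four box coordinates.
    return torch.where(has_pixels[:, None], boxes, torch.zeros_like(boxes))


# One empty 1x3 mask and one mask with a single pixel in column 1.
masks = torch.tensor([[[False, False, False]], [[False, True, False]]])
# Boxes an argmax-based implementation would produce; the first spans the full image.
boxes = torch.tensor([[0.0, 0.0, 2.0, 0.0], [1.0, 0.0, 1.0, 0.0]])
fixed = zero_out_empty_boxes(masks, boxes)
```

Note that [0, 0, 0, 0] is also the box of a mask whose only pixel sits at the origin, so a caller that needs to tell the two cases apart may prefer one of the other sentinels mentioned above, such as NaN.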

Fixes pytorch#8184

8b0daac

atharvas mentioned this pull request Jan 3, 2024

Faster implementation of torchvision.ops.boxes::masks_to_boxes #8184

Open

atharvas changed the title ~~Fixes #8184~~ Faster implementation of torchvision.ops.boxes::masks_to_boxes Jan 3, 2024

facebook-github-bot added the cla signed label Jan 3, 2024

bmmtstb mentioned this pull request Jan 4, 2024

box_convert should accept values of BoundingBoxFormat as in- and output-format as well as case-insensitive string versions. #8190

Closed

Juphex approved these changes Jan 11, 2024


Update to pytorch#8194

a4e4278

The X and Y dimensions were flipped.

Merge branch 'main' into enhancement/masks_to_boxes

24fedf7

Merge branch 'main' into enhancement/masks_to_boxes

2a4f9ab

Merge branch 'pytorch:main' into enhancement/masks_to_boxes

b94dd58

This was referenced Jan 21, 2026

Fix masks_to_boxes for empty masks and vectorize for performance #9347

Closed

ops.masks_to_boxes , RuntimeError: min(): Expected reduction dim to be specified for input.numel() == 0. Specify the reduction dim with the 'dim' argument. #9346

Closed

zy1git added 2 commits January 22, 2026 02:11

Merge branch 'main' into enhancement/masks_to_boxes

8803bdd

Merge branch 'main' into enhancement/masks_to_boxes

dcc6e2e

atharvas wants to merge 7 commits into pytorch:main from atharvas:enhancement/masks_to_boxes


6 participants
