Update pin memory related APIs to not pass 'device' argument#131858

Closed
wizzniu wants to merge 9 commits intopytorch:mainfrom
wizzniu:update_pin_memory_api

Conversation

Contributor

@wizzniu wizzniu commented Jul 26, 2024

Based on #126376, this PR updates all PyTorch callers (e.g., Tensor.is_pinned(), Tensor.pin_memory()) to stop passing the device argument.
As for storage/untyped_storage.is_pinned()/pin_memory(), we keep the device argument, but passing it is discouraged. If not given, the default device is still 'cuda' for BC.
Additionally, based on device-agnostic pin_memory, the pin_memory_device argument of torch.utils.data.DataLoader is now discouraged. For BC, explicitly passing this argument is still effective; if not given, the default device is the current accelerator.
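The back-compat rules above can be sketched in plain Python (a sketch with hypothetical helper names, not the actual PyTorch implementation):

```python
# Sketch of the BC rules described above; these helper names are
# hypothetical and only illustrate the resolution order.

def resolve_storage_pin_device(device=None):
    # storage/untyped_storage.is_pinned()/pin_memory(): passing a device
    # is discouraged, but the default stays 'cuda' for BC.
    return device if device is not None else "cuda"

def resolve_dataloader_pin_device(pin_memory_device="", current_accelerator="cuda"):
    # DataLoader: an explicitly passed pin_memory_device still wins;
    # if not given, the current accelerator is used.
    return pin_memory_device or current_accelerator
```

For example, with no pin_memory_device and an XPU as the current accelerator, the sketch resolves to "xpu", while an explicitly passed device is honored unchanged.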

Fixes #124908
Relates to #126376

cc: @albanD

cc @XilunWu @H-Huang @awgu @kwen2501 @wanchaol @fegin @fduwjj @wz337 @wconstab @d4l3k @c-p-i-o


pytorch-bot bot commented Jul 26, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/131858

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 New Failure

As of commit ee437e8 with merge base 8c2aa0c:

NEW FAILURE - The following job has failed:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@pytorch-bot pytorch-bot bot added oncall: distributed Add this issue/PR to distributed oncall triage queue release notes: dataloader release notes category labels Jul 26, 2024
Collaborator

guangyey commented Jul 26, 2024

Could we keep BC with the old behavior?

  • if a device type is passed, explicitly use the passed device type;
  • if None (the default) is passed, the current accelerator type will be used.

@mikaylagawarecki mikaylagawarecki requested a review from albanD July 26, 2024 15:51
@janeyx99 janeyx99 added the triaged This issue has been looked at by a team member, and triaged and prioritized into an appropriate module label Jul 29, 2024
Contributor Author

wizzniu commented Jul 31, 2024

@guangyey We keep BC with the old behavior now, and it will not be broken until two or three releases from now.

As Alban said, we want to redefine pin memory as: "Pin memory is always pinned for the current accelerator device". So the final result we hope for is to drop the device arg / pin_memory_device arg.

WDYT, @albanD ?

Contributor Author

wizzniu commented Aug 9, 2024

@albanD Could you help review this, or give some suggestions?

Contributor Author

wizzniu commented Aug 22, 2024

Just commenting to keep this PR from going stale.

Contributor Author

wizzniu commented Sep 10, 2024

@pytorchbot rebase

@pytorchmergebot

@pytorchbot started a rebase job onto refs/remotes/origin/viable/strict. Check the current status here

@pytorchmergebot

Successfully rebased update_pin_memory_api onto refs/remotes/origin/viable/strict, please pull locally before adding more changes (for example, via git checkout update_pin_memory_api && git pull --rebase)

Contributor Author

wizzniu commented Sep 30, 2024

@pytorchbot rebase

@pytorchmergebot

@pytorchbot started a rebase job onto refs/remotes/origin/viable/strict. Check the current status here

@pytorchmergebot

Successfully rebased update_pin_memory_api onto refs/remotes/origin/viable/strict, please pull locally before adding more changes (for example, via git checkout update_pin_memory_api && git pull --rebase)

@albanD albanD left a comment

FYI @andrewkho for the dataloader changes

Collaborator

This can say that the default device is the current accelerator and link to https://pytorch.org/docs/main/torch.html#accelerators

Contributor Author

Updated

Collaborator

Same as above

torch/storage.py Outdated
Collaborator

I'm not sure we actually want to remove this argument altogether. I think saying in the doc that using this argument is discouraged is enough in the short term?

Adding these warnings here is going to be very very spammy as all existing users of these APIs will now start to see them and we don't want that.

Contributor Author

Yes, too many warnings are annoying. But for now, the default device of storage.is_pinned/pin_memory is 'cuda', not the current accelerator; is that acceptable?

torch/storage.py Outdated
Comment on lines 385 to 401
Collaborator

For all versions of this pattern, I would just keep the code shared and pass in device as-is. The C++ side should be able to properly handle the device argument being "None" and detect that as not provided.

That will significantly reduce the code duplication
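The suggested de-duplication might look like the following sketch (plain Python with a stand-in for the C++ binding; all names here are hypothetical, not the actual torch/storage.py code):

```python
# Sketch: instead of branching in Python on whether `device` was given,
# forward it unchanged and let one backend entry point treat None as
# "not provided". Names are hypothetical.

def _backend_is_pinned(data, device):
    # Stand-in for the C++ binding, which detects device=None itself.
    effective = device if device is not None else "current-accelerator"
    return ("pinned-check", effective)

class Storage:
    def __init__(self, data):
        self.data = data

    def is_pinned(self, device=None):
        # No duplicated if/else per caller: pass device through as-is.
        return _backend_is_pinned(self.data, device)
```

All overloads share one code path, which is the duplication reduction being suggested.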

Collaborator

Same, let's say discouraged with accelerator being the default

@andrewkho andrewkho left a comment

Dataloader side LGTM, we'll need to cherry-pick this to pytorch/data as well for StatefulDataLoader but we can take care of that. Will let @albanD give final stamp

Contributor

Dumb question, but I'm assuming this call has less overhead than what it's replacing, @albanD? It calls into here: https://github.com/pytorch/pytorch/blob/main/torch/csrc/Module.cpp#L2130

Contributor Author

@andrewkho Updated. Now it is only called when the pin_memory_device argument isn't passed.
Another question: is it necessary to set the device here? From my point of view, pin memory operates on host memory, so it shouldn't be related to the device id? @albanD

Contributor

We might have to update torchdata's StatefulDataLoader if it's calling this function directly (which is totally fine), but unsure if downstream users are calling this, in which case it'd be BC breaking

Collaborator

Very good catch, yeah we need to keep this function around as it is actually used in a few places in the wild: https://github.com/search?q=%2Fdata%5C._utils.pin_memory%2F&type=code

Contributor Author

Restored it.

@wizzniu wizzniu force-pushed the update_pin_memory_api branch from b861a45 to 4d09bfd Compare October 9, 2024 11:03
@wizzniu wizzniu requested a review from divyanshk as a code owner October 9, 2024 11:03
@wizzniu wizzniu force-pushed the update_pin_memory_api branch from 4d09bfd to 102d5d2 Compare October 9, 2024 11:09
@wizzniu wizzniu requested a review from albanD October 9, 2024 11:11
@albanD albanD left a comment

Looks great!
Thanks for making all these edits!

CI needs fixing of course ;)

# Check if the pin_memory is functioning properly on custom device
cpu_tensor = torch.empty(3)
self.assertFalse(cpu_tensor.is_foo)
self.assertFalse(cpu_tensor.is_pinned("foo"))
Collaborator

I guess we're missing some privateuse1 initialization here?

Contributor Author

Actually no, that has been done in the setUpClass() method. The error is first caused by L347 and then fails again at L336 when the case is retried. We restored the default device argument of storage.is_pinned/pin_memory to 'cuda', so we have to add the "foo" argument back here.
As for the other failed case in TestDataLoader, it's because cuda (and other backends) doesn't currently support setCurrentDevice() when calling torch._C._accelerator_hooks_set_current_device(device_id). We need #131854 merged first, which blocks this PR.

Contributor Author

@albanD Have a look at PR #131854 first?

Contributor Author

wizzniu commented Dec 26, 2024

@albanD Fixed the failed case for MPS, please help review it.
The reason is that DataLoader previously disabled pin memory when pin_memory=True, pin_memory_device="" and cuda was unavailable, but now it enables pin memory when cuda is unavailable and mps is available. I guess pin_memory is not supported on MPS? So I just fixed this case by using pin_memory=False for MPS only.

Contributor Author

wizzniu commented Jan 2, 2025

@pytorchbot merge -r

@pytorchmergebot

@pytorchbot started a rebase job onto refs/remotes/origin/viable/strict. Check the current status here

@pytorchmergebot

Successfully rebased update_pin_memory_api onto refs/remotes/origin/viable/strict, please pull locally before adding more changes (for example, via git checkout update_pin_memory_api && git pull --rebase)

@pytorchmergebot

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team


@pytorchmergebot

Merge failed

Reason: 3 mandatory check(s) failed. The first few are:

Dig deeper by viewing the failures on hud

Details for Dev Infra team Raised by workflow job

Failing merge rule: Core Maintainers

batch_size=16,
num_workers=2,
pin_memory=True,
pin_memory=True if not torch.backends.mps.is_available() else False,
Contributor Author

@albanD @kulinseth
Since mps doesn't support pin_memory, do you think we should force pin_memory to False in DataLoader for mps? Or is a small change in the testing code here enough?

Collaborator

Sorry for the delay here, yes we should preserve the current behavior to avoid doing any BC-breaking change in this PR.
If this flag used to be ignored on MPS, let's preserve that for now.

Contributor Author

@albanD We're going to change the logic from "self._pin_memory = loader.pin_memory and torch.cuda.is_available()" to "self._pin_memory = loader.pin_memory and torch.accelerator.is_available()", which would be BC-breaking for other non-cuda accelerators.
In my opinion, this is what we want, because we expect to align the DataLoader behavior of other non-cuda accelerators with cuda. To tackle this issue, we have to make this BC-breaking change for those accelerators.
The existing problem is that MPS throws an error when calling .pin_memory(), even though it implements the pin-memory-related interfaces. So MPS is special...
Forcibly keeping BC for MPS is easy to ensure, but maybe a little inelegant. I just want to confirm whether that's exactly what you mean.
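The predicate change under discussion, including the temporary MPS carve-out, can be sketched like this (plain Python, with availability passed in as booleans instead of querying torch; function names are hypothetical):

```python
# Sketch of the DataLoader pinning predicate, old vs. new. Availability
# flags are parameters here so the sketch runs anywhere.

def pin_memory_enabled_old(pin_memory, cuda_available):
    # Old behavior: pinning only ever happened for CUDA.
    return pin_memory and cuda_available

def pin_memory_enabled_new(pin_memory, accelerator_available, is_mps):
    # New behavior: any current accelerator enables pinning, except MPS,
    # where .pin_memory() still raises, so the flag stays a no-op for now.
    return pin_memory and accelerator_available and not is_mps
```

Under this sketch, a non-cuda accelerator such as XPU now gets pinning when pin_memory=True, while MPS keeps the old effective behavior of pinning being disabled.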

Collaborator

Yes that matches my expectation.
Fixing the pinned memory allocation for MPS would be the best thing to do, but I don't think it is on you to do that unless you want to. So doing the inelegant thing on MPS to unblock this PR (and make all other accelerator work properly) is an acceptable tradeoff in my mind.

We can fix MPS later once the pinned memory allocation is fixed there.

Contributor Author

@albanD Fine. I've added the restriction for MPS for now, and commented the reason as well.

@wizzniu wizzniu force-pushed the update_pin_memory_api branch from 0c1c989 to f3dc4fc Compare January 15, 2025 07:17
@albanD albanD left a comment

Perfect!
FYI @kulinseth and @malfet, we're keeping dataloader pinned memory a no-op here; we should fix that when pinned memory is fixed on MPS.

Collaborator

albanD commented Jan 15, 2025

@pytorchbot merge -i

Thanks!

@pytorchmergebot

Merge started

Your change will be merged while ignoring the following 1 checks: pull / linux-jammy-py3-clang12-executorch / test (executorch, 1, 1, lf.linux.2xlarge)

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team


pytorchmergebot pushed a commit that referenced this pull request Jul 8, 2025
Following [ #131858 suggestion](#131858 (review)) to optimize DataLoader code

Pull Request resolved: #146821
Approved by: https://github.com/divyanshk

Co-authored-by: Divyansh Khanna <divyanshkhanna09@gmail.com>

Labels

ciflow/inductor ciflow/trunk Trigger trunk jobs on your pull request Merged oncall: distributed Add this issue/PR to distributed oncall triage queue open source release notes: dataloader release notes category triaged This issue has been looked at by a team member, and triaged and prioritized into an appropriate module

Projects

None yet

Development

Successfully merging this pull request may close these issues.

DataLoader's pin_memory is default to CUDA if parameter pin_memory_device is not set

7 participants