Skip to content

Allow MpDeviceLoader to shard dictionaries of tensors#7912

Merged
bhavya01 merged 4 commits intomasterfrom
mp_loader
Sep 4, 2024
Merged

Allow MpDeviceLoader to shard dictionaries of tensors#7912
bhavya01 merged 4 commits intomasterfrom
mp_loader

Conversation

@bhavya01
Copy link
Copy Markdown
Collaborator

No description provided.

@bhavya01 bhavya01 marked this pull request as ready for review August 26, 2024 16:45
@bhavya01 bhavya01 self-assigned this Aug 26, 2024
Comment thread test/test_mp_input_sharding.py Outdated
@JackCaoG
Copy link
Copy Markdown
Collaborator

please run both v2 and v4 TPUCI in this pr.

Comment thread test/test_mp_input_sharding.py
Comment thread test/test_mp_input_sharding.py Outdated
Comment thread torch_xla/distributed/parallel_loader.py Outdated
Comment on lines +203 to +205
return [
KeyError(f"Keys: {missing_keys} are missing from input_sharding.")
]
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not raise directly?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This part of the code executes in a separate thread. If an exception happens in the thread, the main thread is just blocked on the _worker thread to load data into the input_batches queue. To explicitly raise the error to the user, I think it might be better if we add the exception to the input batch queue and check for it while passing the data to the main thread in PerDeviceLoader.next function

Copy link
Copy Markdown
Collaborator

@JackCaoG JackCaoG left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bhavya01 bhavya01 added the tpuci label Sep 4, 2024
Copy link
Copy Markdown
Collaborator

@JackCaoG JackCaoG left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should also mention this in the release note

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants