
Fix: Make __len__ of datapipes dynamic #88302

Closed
fgerzer wants to merge 9 commits into pytorch:master from fgerzer:fix/dynamic_pipe_length

Conversation

fgerzer (Contributor) commented Nov 2, 2022

Fixes #88074

Several datapipes cache their length the first time it is computed. However, a source datapipe's length can change afterwards (most prominently, whenever apply_sharding is called). This behaviour is counter-intuitive because __len__ is not expected to have side effects.

This PR makes __len__ dynamically computed.
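To illustrate the bug and the fix, here is a minimal sketch. The classes below are simplified, hypothetical stand-ins, not the actual torch.utils.data datapipes:

```python
# Simplified stand-ins for a source datapipe and a batching datapipe;
# NOT the real torch.utils.data implementations.

class SourcePipe:
    def __init__(self, items):
        self.items = list(items)

    def apply_sharding(self, num_shards, shard_id):
        # Sharding shrinks the source *after* construction.
        self.items = self.items[shard_id::num_shards]

    def __len__(self):
        return len(self.items)


class CachedBatcher:
    """Caches its length on first call -- the old, buggy behaviour."""
    def __init__(self, source, batch_size):
        self.source = source
        self.batch_size = batch_size
        self._length = None

    def __len__(self):
        if self._length is None:
            # ceil(len(source) / batch_size)
            self._length = -(-len(self.source) // self.batch_size)
        return self._length


class DynamicBatcher:
    """Recomputes its length on every call -- the fixed behaviour."""
    def __init__(self, source, batch_size):
        self.source = source
        self.batch_size = batch_size

    def __len__(self):
        return -(-len(self.source) // self.batch_size)


src = SourcePipe(range(10))
cached = CachedBatcher(src, 2)
dynamic = DynamicBatcher(src, 2)
len(cached), len(dynamic)   # both 5 before sharding
src.apply_sharding(2, 0)    # source now has 5 elements
print(len(cached))          # 5 -- stale cached value
print(len(dynamic))         # 3 -- recomputed: ceil(5 / 2)
```

Calling len() before sharding "freezes" the cached value, which is exactly the interaction between Batcher and ShardingFilter reported in #88074.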

Changes:

  • Add a note to the datapipes README explaining that __len__ should be dynamic, and why.
  • Remove caching of length computations in ConcaterIterDataPipe, MultiplexerIterDataPipe, ZipperIterDataPipe, BatcherIterDataPipe, ConcaterMapDataPipe, and BatcherMapDataPipe.
  • This required removing the length attribute from __getstate__/__setstate__ of MultiplexerIterDataPipe. I am unsure whether to remove it completely and risk breaking saved checkpoints (as I did), or to simply ignore the length in the loaded state.
  • This also means the classes above no longer have a length attribute; I have found no uses of it, though.
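The serialization change amounts to dropping the cached length from the pickled state. A minimal sketch, with a hypothetical simplified Multiplexer (the real MultiplexerIterDataPipe carries more state than this):

```python
import pickle

# Hypothetical, simplified sketch -- not the real MultiplexerIterDataPipe.
class Multiplexer:
    def __init__(self, *pipes):
        self.pipes = pipes
        self.buffer = []

    def __getstate__(self):
        # Previously the state also included a cached `length`; now only
        # the genuinely stateful fields are serialized.
        return (self.pipes, self.buffer)

    def __setstate__(self, state):
        # A checkpoint saved with the old three-element state (including
        # the cached length) fails to unpack here -- the compatibility
        # risk mentioned in the list above.
        self.pipes, self.buffer = state
```

Unpickling an old checkpoint whose state tuple still contains the cached length would raise a ValueError in __setstate__; the alternative would be to accept a third element and discard it.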

cc @ssnl @VitalyFedyunin @ejguan @NivekT

@pytorch-bot pytorch-bot bot added the release notes: dataloader release notes category label Nov 2, 2022

pytorch-bot bot commented Nov 2, 2022

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/88302

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 9bb884b:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.


linux-foundation-easycla bot commented Nov 2, 2022

CLA Signed

The committers listed above are authorized under a signed CLA.

  • ✅ login: fgerzer / name: Frederik Gerzer (ba0d780778eb7fdef9bbfaaf95e394c0b8ab678b, 5376c7e7dd049fc482736d06854006c5ca2cfbfd, f111e81d0f4907540c091f3342a921a09a896074, 121c95945eb598d6036fdacda311d03cb68f1f44, b975699f8204808f1990d7491a93e02dd49bdc19, f580e86f2d522c6ffc15a5d09782819ba2b7f060, ce16d6c0f528cfec15d82ecf91ef0f8f8bf06442, ee4947445db367a62753ec1db56c66cb87d57cbf, 96a2bcbfd007ce150de884e8e4f3c090e827cd05)

@H-Huang H-Huang added the module: dataloader (Related to torch.utils.data.DataLoader and Sampler) and triaged (This issue has been looked at by a team member, and triaged and prioritized into an appropriate module) labels Nov 2, 2022
NivekT (Contributor) left a comment

LGTM! We should also check whether any DataPipe implemented in the TorchData repo has the same issue.

@ejguan ejguan added the topic: bug fixes topic category label Nov 2, 2022
ejguan (Contributor) left a comment

Could you please fix the lint error?

fgerzer (Contributor, Author) commented Nov 3, 2022

Could you please fix the lint error?

Should be done with 6b52d5714f94c63d15c53e0051f3266e8c237c60.

ejguan (Contributor) commented Nov 3, 2022

@fgerzer Thank you. But I can't merge the PR if the CLA is not signed. See: #88302 (comment)

fgerzer (Contributor, Author) commented Nov 3, 2022

@fgerzer Thank you. But I can't merge the PR if the CLA is not signed. See: #88302 (comment)

I'm aware of this; I'm currently getting this organized.

kit1980 (Contributor) commented Nov 17, 2022

/easycla

NivekT (Contributor) commented Nov 28, 2022

@pytorchbot rebase

pytorchmergebot (Collaborator)

@pytorchbot successfully started a rebase job. Check the current status here

pytorchmergebot (Collaborator)

Successfully rebased fix/dynamic_pipe_length onto refs/remotes/origin/viable/strict, please pull locally before adding more changes (for example, via git checkout fix/dynamic_pipe_length && git pull --rebase)

fgerzer (Contributor, Author) commented Dec 1, 2022

My apologies that the CLA's still taking time. I'm trying to accelerate it.

fgerzer (Contributor, Author) commented Dec 9, 2022

CLA approval is done, and this is now good to go.

NivekT (Contributor) commented Dec 9, 2022

@pytorchbot rebase

pytorchmergebot (Collaborator)

@pytorchbot successfully started a rebase job. Check the current status here

pytorchmergebot (Collaborator)

Successfully rebased fix/dynamic_pipe_length onto refs/remotes/origin/viable/strict, please pull locally before adding more changes (for example, via git checkout fix/dynamic_pipe_length && git pull --rebase)

@pytorchmergebot pytorchmergebot force-pushed the fix/dynamic_pipe_length branch from 96a2bcb to 9bb884b on December 9, 2022 15:37
NivekT (Contributor) commented Dec 9, 2022

@pytorchbot merge

@pytorch-bot pytorch-bot bot added the ciflow/trunk Trigger trunk jobs on your pull request label Dec 9, 2022
pytorchmergebot (Collaborator)

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging: check the merge workflow status here


Labels

  • ciflow/trunk — Trigger trunk jobs on your pull request
  • Merged
  • module: dataloader — Related to torch.utils.data.DataLoader and Sampler
  • open source
  • release notes: dataloader — release notes category
  • topic: bug fixes — topic category
  • triaged — This issue has been looked at by a team member, and triaged and prioritized into an appropriate module

Development

Successfully merging this pull request may close these issues.

Datapipes: Batcher and ShardingFilter interaction leads to wrong __len__

7 participants