Fix: Make __len__ of datapipes dynamic#88302
Conversation
🔗 Helpful Links🧪 See artifacts and rendered test results at hud.pytorch.org/pr/88302
Note: Links to docs will display an error until the docs builds have been completed. ✅ No FailuresAs of commit 9bb884b: This comment was automatically generated by Dr. CI and updates every 15 minutes. |
ejguan
left a comment
There was a problem hiding this comment.
Could you please fix the lint Error?
Should be done with 6b52d5714f94c63d15c53e0051f3266e8c237c60. |
|
@fgerzer Thank you. But, I can't merge the PR if the CLA is not signed. See: #88302 (comment) |
I'm aware of this; I'm currently getting this organized. |
|
/easycla |
|
@pytorchbot rebase |
|
@pytorchbot successfully started a rebase job. Check the current status here |
|
Successfully rebased |
6b52d57 to
96a2bcb
Compare
|
My apologies that the CLA's still taking time. I'm trying to accelerate it. |
|
CLA approval is done, and this is now good to go. |
|
@pytorchbot rebase |
|
@pytorchbot successfully started a rebase job. Check the current status here |
|
Successfully rebased |
96a2bcb to
9bb884b
Compare
|
@pytorchbot merge |
Merge startedYour change will be merged once all checks pass (ETA 0-4 Hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team |
Fixes #88074
Several datapipes have their lengths cached on being executed for the first time. However, source datapipes might change in length (most prominently, whenever
apply_shardingis called). The behaviour is counter-intuitive because we do not expect__len__to have side-effects.This PR makes
__len__dynamically computed.Changes:
datapipesREADME that__len__should be dynamic and why.ConcaterIterDataPipe,MultiplexerIterDataPipe,ZipperIterDataPipe,BatcherIterDataPipe,ConcaterMapDataPipe, andBatcherMapDataPipe.lengthattribute in setstate/getstate ofMultiplexerIterDataPipe. I am unsure whether to remove this completely and risk breaking saved checkpoints (as I did) or whether to just ignore thelengthof the loadedstate.lengthattribute. I have found no uses of this, though.cc @ssnl @VitalyFedyunin @ejguan @NivekT