[DataPipe] Fix error message coming from single iterator constraint #79547
NivekT wants to merge 2 commits into gh/nivekt/50/base
```
full_msg = f"{msg} {datapipe.__class__.__name__}({_generate_input_args_string(datapipe)})"
if len(e.args) >= 1 and msg not in e.args[0]:
single_iterator_msg = "single iterator per IterDataPipe constraint"
has_len = hasattr(e.args, '__len__') and len(e.args) >= 1
```
Noob question: What would be the cause of this Error with __len__? Someone calling len(dp) in the middle of iteration?
The extra logic here is not related to the DataPipe but rather to the exception being raised.
It checks whether `e.args` (the arguments of the exception) has `__len__` and, if it does, whether it contains at least one element. I have seen cases where `e.args` does not meet that criterion.
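For readers unfamiliar with `e.args`, here is a small sketch of what it holds for a few built-in exceptions. It relies only on standard Python behaviour and is not code from this PR; `describe_args` is a hypothetical helper name used for illustration.

```python
def describe_args(exc: BaseException) -> tuple:
    """Return (has_len, number_of_args) for an exception instance."""
    has_len = hasattr(exc.args, "__len__")
    return (has_len, len(exc.args) if has_len else 0)


# Raised without a message: args is an empty tuple.
print(describe_args(RuntimeError()))
# Raised with a message: args holds one string.
print(describe_args(RuntimeError("boom")))
# One argument, but it is an int rather than a string.
print(describe_args(KeyError(3)))
```

The last case is worth keeping in mind: a substring test such as `msg not in e.args[0]` raises `TypeError` when `e.args[0]` is an `int`, so converting the first argument with `str()` before the test is one way to stay safe.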
I see. Thanks for the explanation.
Your question prompted me to take another look and handle the case where len(e.args) == 0 when there is no exception message. Have a look.
```python
from torch.utils.data import IterDataPipe


class MyDataPipe(IterDataPipe):
    def __init__(self, x):
        self.x = x

    def __iter__(self):
        raise RuntimeError  # raised with no message, so e.args == ()
        yield 0  # unreachable; makes __iter__ a generator function


dp = MyDataPipe(1)
it = iter(dp)
next(it)
```

Previously, it would just show:

```
Traceback (most recent call last):
  File "/Users/scratch/graphvisualization.py", line 27, in <module>
    next(it)
  File "/Users/pytorch/torch/utils/data/datapipes/_typing.py", line 514, in wrap_generator
    response = gen.send(None)
  File "/Users/scratch/graphvisualization.py", line 21, in __iter__
    raise RuntimeError
RuntimeError:
```

Now it adds the extra message at the end:

```
This exception is thrown by __iter__ of MyDataPipe(x=1)
```
Makes sense. I hope TorchData at least doesn't have this kind of error without a message. lol
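The handling discussed in this thread can be sketched roughly as follows. This is a minimal standalone illustration, not the code merged in the PR: `annotate` and its `datapipe_repr` parameter are hypothetical names, and the exact message layout (including the leading newline) is an assumption based on the output shown above.

```python
def annotate(exc: Exception, datapipe_repr: str) -> Exception:
    """Append a 'thrown by __iter__ of ...' note to exc.args (sketch only)."""
    msg = "thrown by __iter__ of"
    single_iterator_msg = "single iterator per IterDataPipe constraint"
    full_msg = f"{msg} {datapipe_repr}"

    if not hasattr(exc.args, "__len__"):
        return exc  # nothing we can safely inspect

    if len(exc.args) == 0:
        # Exception raised without a message, e.g. `raise RuntimeError`.
        exc.args = (f"\nThis exception is {full_msg}",)
        return exc

    first = str(exc.args[0])  # str() guards against non-string arguments
    if msg in first or single_iterator_msg in first:
        return exc  # already annotated, or a single-iterator error

    exc.args = (f"{first}\nThis exception is {full_msg}",) + exc.args[1:]
    return exc


err = annotate(RuntimeError("bad input"), "MyDataPipe(x=1)")
print(err)
```

Note that this sketch converts a non-string first argument to a string when it appends the note, which may or may not be desirable in real code.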
…onstraint"

Fixes meta-pytorch/data#516

* Prevent confusion by not adding the extra string about `__iter__` when the exception is raised because of single iterator constraint
* Added a missing space
* Make sure `__len__` exists before calling it during exception handling

Exception example:

```python
from torchdata.datapipes.iter import IterableWrapper

source_dp = IterableWrapper(range(10))
it1 = iter(source_dp)
next(it1)
it2 = iter(source_dp)
next(it2)
next(it1)
```

Will raise:

```
Traceback (most recent call last):
  File "/Users/scratch/fast_forward.py", line 32, in <module>
    next(it1)
  File "/Users/pytorch/torch/utils/data/datapipes/_typing.py", line 524, in wrap_generator
    _check_iterator_valid(datapipe, iterator_id)
  File "/Users/pytorch/torch/utils/data/datapipes/_typing.py", line 445, in _check_iterator_valid
    raise RuntimeError(_gen_invalid_iterdatapipe_msg(datapipe) + _feedback_msg)
RuntimeError: This iterator has been invalidated because another iterator has been created from the same IterDataPipe: IterableWrapperIterDataPipe(deepcopy=True, iterable=range(0, 10))
This may be caused multiple references to the same IterDataPipe. We recommend using `.fork()` if that is necessary.
For feedback regarding this single iterator per IterDataPipe constraint, feel free to comment on this issue: meta-pytorch/data#45.
```
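To illustrate why the second iterator invalidates the first, here is a toy model of a single-iterator constraint in plain Python. It is a sketch under stated assumptions and not PyTorch's implementation: `SingleIteratorSource`, `_valid_iterator_id`, and `_generate` are hypothetical names chosen for this example.

```python
class SingleIteratorSource:
    """Toy iterable that allows only one live iterator at a time."""

    def __init__(self, iterable):
        self.iterable = iterable
        self._valid_iterator_id = 0  # id of the most recently created iterator

    def __iter__(self):
        # Creating a new iterator bumps the id, invalidating older iterators.
        self._valid_iterator_id += 1
        return self._generate(self._valid_iterator_id)

    def _generate(self, iterator_id):
        for item in self.iterable:
            if iterator_id != self._valid_iterator_id:
                raise RuntimeError(
                    "This iterator has been invalidated because another "
                    "iterator has been created from the same source."
                )
            yield item


source = SingleIteratorSource(range(10))
it1 = iter(source)
next(it1)
it2 = iter(source)  # invalidates it1
next(it2)
try:
    next(it1)
except RuntimeError as err:
    print(err)
```

In this toy model, as in the traceback above, the error surfaces only when the stale iterator is advanced, not when the new one is created. When several consumers really are needed, the error text in this PR recommends `.fork()`.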
@pytorchbot merge

@pytorchbot successfully started a merge job.
…79547)

Summary:
Pull Request resolved: #79547
Approved by: https://github.com/ejguan

Test Plan: contbuild & OSS CI, see https://hud.pytorch.org/commit/pytorch/pytorch/22c7b1ddb593ca390e58c5b94f8ad10d21ed0f73

Reviewed By: malfet

Differential Revision: D37157039

Pulled By: NivekT

fbshipit-source-id: 36d5b68375fc1ef4e3326525003fcfbd46252124