Fix test_remote_io.py due to mutating public s3 bucket#997
Fix test_remote_io.py due to mutating public s3 bucket#997ShiboXing wants to merge 9 commits intometa-pytorch:mainfrom
Conversation
ejguan
left a comment
There was a problem hiding this comment.
Do you mind removing this test?
https://github.com/pytorch/data/blob/e14927f32a6cf99658f72b765425bf032116f724/test/test_remote_io.py#L337-L343
It takes too much time as it would download the whole archives. We have used the test to validated S3FileLoader for an image.
ejguan
left a comment
There was a problem hiding this comment.
And, you don't need to rely on CI to execute those tests. You can run the test on your local env by
pytest -v test/test_remote_io.py -k test_fsspec_io_iterdatapipe
pytest -v test/test_remote_io.py -k test_s3_io_iterdatapipe
|
Hi @ejguan, can you start the checks again? Added the recursive option. |
…reset request in the dispatching process (#994) Summary: ### Changes - ~Fix S3 Tests~ in #997 - Ignore AttributeError for ReadingService when `gc` gets involved - Add a reset_iterator_counter to dispatching process to guarantee that the datapipe is reset when all loops have received the request. Otherwise, datapipe can be reset in the middle of iteration of the other loops. - Remove reference of the `thread` from `Prefetcher`. This would prevent racing condition when both `finally` in generator and `reset` function are accessing the same `thread`. Pull Request resolved: #994 Reviewed By: NivekT Differential Revision: D43094081 Pulled By: ejguan fbshipit-source-id: 3065bc6a994cc0ff5047d9b71970abc4c8d3e369
|
@ShiboXing Can you please do a |
2ab0120 to
576457b
Compare
|
@ejguan, rebased and force-pushed! |
|
@NivekT has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
ejguan
left a comment
There was a problem hiding this comment.
Thanks for your effort to make TochData' CI green again
) Summary: Please read through our [contribution guide](https://github.com/pytorch/data/blob/main/CONTRIBUTING.md) prior to creating your pull request. - Note that there is a section on requirements related to adding a new DataPipe. Fixes meta-pytorch#984 - Add a private function to TestDataPipeRemoteIO in test_remote_io.py to get s3 objects count label through aws cli - add awscli in requirements Pull Request resolved: meta-pytorch#997 Reviewed By: ejguan Differential Revision: D43157757 Pulled By: NivekT fbshipit-source-id: 7e9ee8299a28a087f88024c3b3e77be3bfe5adf0
) Summary: Please read through our [contribution guide](https://github.com/pytorch/data/blob/main/CONTRIBUTING.md) prior to creating your pull request. - Note that there is a section on requirements related to adding a new DataPipe. Fixes meta-pytorch#984 - Add a private function to TestDataPipeRemoteIO in test_remote_io.py to get s3 objects count label through aws cli - add awscli in requirements Pull Request resolved: meta-pytorch#997 Reviewed By: ejguan Differential Revision: D43157757 Pulled By: NivekT fbshipit-source-id: 7e9ee8299a28a087f88024c3b3e77be3bfe5adf0
) Summary: Please read through our [contribution guide](https://github.com/pytorch/data/blob/main/CONTRIBUTING.md) prior to creating your pull request. - Note that there is a section on requirements related to adding a new DataPipe. Fixes meta-pytorch#984 - Add a private function to TestDataPipeRemoteIO in test_remote_io.py to get s3 objects count label through aws cli - add awscli in requirements Pull Request resolved: meta-pytorch#997 Reviewed By: ejguan Differential Revision: D43157757 Pulled By: NivekT fbshipit-source-id: 7e9ee8299a28a087f88024c3b3e77be3bfe5adf0
Fix test_remote_io.py due to mutating public s3 bucket (meta-pytorch#997)
) Summary: Please read through our [contribution guide](https://github.com/pytorch/data/blob/main/CONTRIBUTING.md) prior to creating your pull request. - Note that there is a section on requirements related to adding a new DataPipe. Fixes meta-pytorch#984 - Add a private function to TestDataPipeRemoteIO in test_remote_io.py to get s3 objects count label through aws cli - add awscli in requirements Pull Request resolved: meta-pytorch#997 Reviewed By: ejguan Differential Revision: D43157757 Pulled By: NivekT fbshipit-source-id: 7e9ee8299a28a087f88024c3b3e77be3bfe5adf0
Please read through our contribution guide prior to
creating your pull request.
Fixes #984
Changes