🐛 Describe the bug
The S3 test is broken because the datasets in the public bucket, which we have no control over, have been updated.
Test case:
https://github.com/pytorch/data/blob/807db8f8c7282b2f48b48b1e07439c119a2ba12f/test/test_remote_io.py#L256-L291
Previously, we simply fixed the test by updating the expected number of files per bucket whenever the dataset changed. That is not a sustainable way to maintain CI. To fix it properly, we could choose one of the following solutions:
- Only validate that certain known files exist in the output, rather than checking the total file count per bucket
- Use mock to simulate the result
- Add our own stable bucket for testing
I prefer the first solution for two reasons:
- We want to test the functionality provided by `_torchdata.so`. Even though mocking the result of this extension would keep the test green, it wouldn't actually exercise the extension.
- The third option might work, but it would also expose our own bucket on GitHub, which is not ideal IMHO.
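As a rough sketch of the first option, the assertion could check for a few known-stable keys in the listing instead of an exact count. The file names below are placeholders for illustration, not the real keys in the public bucket:

```python
def contains_expected_keys(listed_urls, expected_suffixes):
    """Return True if every expected key suffix matches at least one listed URL.

    The total count is deliberately ignored, so files being added to or
    removed from the public bucket cannot break the assertion as long as
    the chosen stable keys remain present.
    """
    return all(
        any(url.endswith(suffix) for url in listed_urls)
        for suffix in expected_suffixes
    )


# Placeholder listing (not the actual bucket contents):
listing = [
    "s3://some-bucket/prefix/a.csv",
    "s3://some-bucket/prefix/b.csv",
    "s3://some-bucket/prefix/newly_added_file.csv",  # new uploads no longer fail the test
]
assert contains_expected_keys(listing, ["a.csv", "b.csv"])
```

The test would only need a small curated list of suffixes that are unlikely to be removed from the bucket, and the listing itself would still come from `_torchdata.so`, so the extension stays under test.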
Versions
main