Using system generated seed in RandomSampler #1441
Conversation
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/data/1441
Note: Links to docs will display an error until the docs builds have been completed.
✅ No Failures as of commit a5ec001 with merge base f15fd3a.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
    seed = 1
    torch.manual_seed(seed)
    dl3 = StatefulDataLoader(self.dataset, batch_size=1, shuffle=True)
    data_dl3 = []

Can we call this `results3`? And similarly `results1` and `results2` above.
    seed = 1
    torch.manual_seed(seed)
    dl3 = StatefulDataLoader(self.dataset, batch_size=1, shuffle=True)

We can rename `dl3` to `dataloader3`; ditto for the other dataloader variables.
    def test_seed_replicability(self):
        seed = 0

Instead of checking for the specific seeds 0 and 1, we can generalize this to two randomly generated seeds, and also add an assert to ensure the two seeds are not equal.
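A minimal sketch of the suggested generalization (the helper name is hypothetical, not from the PR): draw two independent random seeds and assert they differ before running the replicability check.

```python
import random

def two_distinct_seeds():
    # Hypothetical helper for the suggested test setup: draw two
    # independent seeds and guarantee they are distinct.
    seed_a = random.getrandbits(64)
    seed_b = random.getrandbits(64)
    # Re-draw on the (astronomically unlikely) collision.
    while seed_b == seed_a:
        seed_b = random.getrandbits(64)
    assert seed_a != seed_b
    return seed_a, seed_b

seed1, seed2 = two_distinct_seeds()
```

The test body would then seed two dataloaders with `seed1` and `seed2` instead of the hard-coded 0 and 1.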
        generator=None,
    ):
        if generator is None:
            # Ensure that underlying sampler has something repeatable

Let's remove or update this comment.
andrewkho left a comment:
Update/remove comment and then gogogo
* add new sampler tests
* update seed generation in sampler
* run precommit
* update seed generation
* change variable name
* update comment
* add seed to tests
* run precommit
* Fix end of epoch StatefulDataLoader restart (#1439)
* add test for end of epoch state dict check
* run precommit; update stateful_dataloader; local changes; update test to test the order of batches; update tests; revert changes in SDL; update tests; run precommit
* update sampler
* run precommit
* remove unnecessary comment
* add test for statedict before and after endofepoch
* run precommit
* check if _sampler_iter is exhausted
* run precommit
* remove commented lines
* remove default values
* only exhaust sampler_iter if present in sd
* update _StatefulRandomSamplerIterator; update state dict if the iterator has finished; add comment about why we're updating state dict; run precommit
* update randomsampleriter state_dict fully
* run precommit
* fork torch.utils.data RandomSampler; reverse changes to sdl.py; generator to iterator; run precommit; update generator usage
* update class name
* run precommit
* add a method to generate permutations
* update return type
* update next logic
* add comment
* update tests to include non stateful samplers
* add comments
* Using system generated seed in RandomSampler (#1441)
* add new sampler tests
* update seed generation in sampler
* run precommit
* update seed generation
* change variable name
* update comment
* add seed to tests
* run precommit
Currently we fix the seed for the `generator` in `RandomSampler` as 1. This leads to the generator not changing even when the `torch.manual_seed()` seed is changed.

For the `RandomSampler` in `torch.utils.data.sampler`, they use `seed = int(torch.empty((), dtype=torch.int64).random_().item())`. Using the same here.

Fixes #1440
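A plain-Python sketch of why drawing the seed from the global RNG fixes the issue (the function name is illustrative; the real code uses the torch idiom quoted above):

```python
import random

def system_generated_seed():
    # Plain-Python analogue of
    # seed = int(torch.empty((), dtype=torch.int64).random_().item()):
    # the seed is drawn from the global RNG, so seeding the global RNG
    # (torch.manual_seed in the real code) changes this value too,
    # unlike a hard-coded seed of 1.
    return random.getrandbits(63)

random.seed(1)
a = system_generated_seed()
random.seed(1)
b = system_generated_seed()
random.seed(2)
c = system_generated_seed()
# a == b: the same global seed reproduces the sampler seed.
# a != c: changing the global seed changes the sampler seed.
```

With the old hard-coded seed, `a`, `b`, and `c` would all be identical regardless of what the user passed to the global seeding call.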