Skip to content

Adding Prefetcher into the PrototypeRS and fixing prefetcher bug#826

Closed
VitalyFedyunin wants to merge 2 commits intogh/VitalyFedyunin/28/basefrom
gh/VitalyFedyunin/28/head
Closed

Adding Prefetcher into the PrototypeRS and fixing prefetcher bug#826
VitalyFedyunin wants to merge 2 commits intogh/VitalyFedyunin/28/basefrom
gh/VitalyFedyunin/28/head

Conversation

@VitalyFedyunin
Copy link
Contributor

@VitalyFedyunin VitalyFedyunin commented Oct 12, 2022

Fixes: #815

Stack from ghstack (oldest at bottom):

Fixes #825

Differential Revision: D40308358

VitalyFedyunin added a commit that referenced this pull request Oct 12, 2022
@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Oct 12, 2022
@VitalyFedyunin
Copy link
Contributor Author

@VitalyFedyunin has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

Copy link
Contributor

@ejguan ejguan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The culprit was the same as I was thinking. But, I do have a few comments below

Comment on lines -105 to -107

def reset_iterator(self):
self.reset()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I found the same issue as reset will be called automatically at the beginning of __iter__.

self.combined_datapipes.reset_epoch()
if self.prefetch_mainloop > 0:
# Stop prefetching first
self.combined_datapipes.reset()
Copy link
Contributor

@ejguan ejguan Oct 12, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since reset will be called in the begining of the __iter__ as well, I think it's better to add a different API to stop prefetching and called here.

And, reset can be the last resort to stop thread if that API is not invoked.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TBH, we should have a graph function to stop prefetcher from the end of the pipeline to the beginning especially when multiple prefetchers are presented.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also found them annoyingly duplicating and ideally we need something like GraphStop operation to freeze all parallel work. It would be extremely useful for snapshotting. But it is out of scope for this PR (Anyway I will post separate issue to capture demand)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SGTM! There are a few special DataPipes need to be handled specially with this GraphStop:

  • FullSync
  • Prefetcher

VitalyFedyunin added a commit that referenced this pull request Oct 18, 2022
@VitalyFedyunin
Copy link
Contributor Author

@VitalyFedyunin has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@ejguan ejguan mentioned this pull request Oct 18, 2022
ejguan pushed a commit to ejguan/data that referenced this pull request Oct 21, 2022
…a-pytorch#826)

Summary: Pull Request resolved: meta-pytorch#826

Test Plan: Imported from OSS

Reviewed By: NivekT, msaroufim

Differential Revision: D40308358

Pulled By: VitalyFedyunin

fbshipit-source-id: f4b706c29ceffc52c376e9cddfc56a75819ff1ee
ejguan pushed a commit that referenced this pull request Oct 21, 2022
Summary: Pull Request resolved: #826

Test Plan: Imported from OSS

Reviewed By: NivekT, msaroufim

Differential Revision: D40308358

Pulled By: VitalyFedyunin

fbshipit-source-id: f4b706c29ceffc52c376e9cddfc56a75819ff1ee
@facebook-github-bot facebook-github-bot deleted the gh/VitalyFedyunin/28/head branch October 22, 2022 14:20
ejguan pushed a commit to ejguan/data that referenced this pull request Oct 23, 2022
…a-pytorch#826)

Summary: Pull Request resolved: meta-pytorch#826

Test Plan: Imported from OSS

Reviewed By: NivekT, msaroufim

Differential Revision: D40308358

Pulled By: VitalyFedyunin

fbshipit-source-id: f4b706c29ceffc52c376e9cddfc56a75819ff1ee
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants