Skip to content

Single key option for Slicer and doc improvements#1041

Closed
SvenDS9 wants to merge 6 commits intometa-pytorch:mainfrom
SvenDS9:docimprovements
Closed

Single key option for Slicer and doc improvements#1041
SvenDS9 wants to merge 6 commits intometa-pytorch:mainfrom
SvenDS9:docimprovements

Conversation

@SvenDS9
Copy link
Contributor

@SvenDS9 SvenDS9 commented Feb 23, 2023

Single key option for Slicer and doc improvements

Changes

  • Enable Slicer to also work for a single key + functional test
  • Fix typos in doc
  • Add laion-example to examples page

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Feb 23, 2023
Copy link
Contributor

@ejguan ejguan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thank you

Comment on lines +298 to +303
elif self.index in old_item.keys():
new_item = {self.index: old_item.get(self.index)} # type: ignore[assignment]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TBH, dict is weird for slice.

@ejguan ejguan requested a review from NivekT February 27, 2023 14:57
@facebook-github-bot
Copy link
Contributor

@ejguan has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

Copy link
Contributor

@NivekT NivekT left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM #1050 is merging, please rebase after it merged. There are some duplicate fixes withinreading_service.rst

Dynamic sharding is achieved by ``MultiProcessingReadingService`` and ``DistributedReadingService`` to shard the pipeline based on the information of corresponding multiprocessing and distributed workers. And, TorchData offers two types of ``DataPipe`` letting users define the sharding place within the pipeline.

- ``sharding_filter``: When the pipeline is replicable, each distributed/multiprocessing worker loads data from one replica of the ``DataPipe`` graph, and skip the data not blonged to the corresponding worker at the place of ``sharding_filter``.
- ``sharding_filter``: When the pipeline is replicable, each distributed/multiprocessing worker loads data from one replica of the ``DataPipe`` graph, and skips the data not belonging to the corresponding worker at the place of ``sharding_filter``.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI, I have a fix to this in a separate PR soon to land. Rebasing will be necessary

@NivekT NivekT mentioned this pull request Feb 27, 2023
10 tasks
@ejguan ejguan added this to the 0.6.0 milestone Feb 27, 2023
@facebook-github-bot
Copy link
Contributor

@ejguan has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@facebook-github-bot
Copy link
Contributor

@ejguan merged this pull request in 0fef12a.

NivekT pushed a commit that referenced this pull request Feb 28, 2023
Summary:
Single key option for Slicer and doc improvements

### Changes

- Enable Slicer to also work for a single key + functional test
- Fix typos in doc
- Add laion-example to examples page

Pull Request resolved: #1041

Reviewed By: NivekT

Differential Revision: D43622504

Pulled By: ejguan

fbshipit-source-id: b656082598f4a790dc457dddb0213a1a180239fd
NivekT added a commit that referenced this pull request Feb 28, 2023
Summary:
Single key option for Slicer and doc improvements

### Changes

- Enable Slicer to also work for a single key + functional test
- Fix typos in doc
- Add laion-example to examples page

Pull Request resolved: #1041

Reviewed By: NivekT

Differential Revision: D43622504

Pulled By: ejguan

fbshipit-source-id: b656082598f4a790dc457dddb0213a1a180239fd

Co-authored-by: SvenDS9 <sven.braun@tngtech.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. Merged

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants