Add Examples of Common Preprocessing Steps with IterDataPipe (such as splitting a data set into two) #712

@NivekT

📚 The doc issue

There are a few common steps that users often want to perform while preprocessing data, such as splitting a data set into train and eval sets. PyTorch Core has documentation on how to do these things with Dataset. We should add the same to our documentation, specifically for IterDataPipe, or link to PyTorch Core's documentation where that is appropriate. This issue is driven by common questions we have received, either in person or on the forum.

If we find that any functionality is missing for IterDataPipe, we should implement it.
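
To make the train/eval split concrete, here is a minimal sketch of one way to split a stream of samples deterministically without loading it into memory. It deliberately uses only the Python standard library rather than the IterDataPipe API, so the names `assign_split`, `split_stream`, and `make_samples`, the `(key, value)` sample layout, and the 20% eval share are all hypothetical choices made for illustration.

```python
import zlib
from typing import Iterable, Iterator, Tuple

Sample = Tuple[str, int]


def assign_split(key: str, eval_percent: int = 20) -> int:
    """Return 0 (train) or 1 (eval) for a sample key.

    The decision depends only on the key, so the same sample lands in the
    same split on every pass over the data (hypothetical helper).
    """
    bucket = zlib.crc32(key.encode("utf-8")) % 100
    return 1 if bucket < eval_percent else 0


def split_stream(
    samples: Iterable[Sample], split_id: int, eval_percent: int = 20
) -> Iterator[Sample]:
    """Lazily yield only the samples that belong to the requested split."""
    for key, value in samples:
        if assign_split(key, eval_percent) == split_id:
            yield key, value


def make_samples() -> Iterator[Sample]:
    """Stand-in for a streaming source; assumed data, not a real data set."""
    return ((f"sample-{i}", i) for i in range(1000))


# Each split makes its own pass over the source stream.
train = list(split_stream(make_samples(), split_id=0))
evaluation = list(split_stream(make_samples(), split_id=1))
```

The two splits are disjoint and together cover every sample, but the eval share is only approximately 20% because it depends on how the keys hash. With an IterDataPipe, a function along the lines of `assign_split` could plausibly be adapted as the `classifier_fn` argument of `demux` with `num_instances=2`; that pairing is an assumption to check against the torchdata documentation, not something this sketch exercises.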

Metadata

Labels: documentation (Improvements or additions to documentation)
