[BUG] fix partition script in the DistributedDataParallel documentation #550

@rnyak

Bug description

The multi-GPU training documentation at https://nvidia-merlin.github.io/Transformers4Rec/main/multi_gpu_train.html currently gives users the following script to repartition the dataset. The script does not work as written: running it does not repartition the dataset.

df.to_parquet("filename.parquet", row_group_size=10000)

The script in the documentation should be updated so that it actually repartitions the dataset.
