[train] Cleanups for training ingest benchmark by justinvyu · Pull Request #53684 · ray-project/ray

justinvyu · 2025-06-10T01:08:10Z

Summary

This PR does some cleanup for the training benchmark:

Introduces task-level configs so that we don't need to create a new task for each variant of the image classification task.
Moves some configuration settings to more logical places (e.g., grouping all Ray Data configs in one place).
Deduplicates the redundant "benchmark factories" that were created for the image classification data format / data storage variants.
Misc. file/directory renames for conciseness.
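
To illustrate the first point, here is a minimal sketch of how a task-level config can let one task cover several variants through fields instead of separate task names. It is not the PR's code: `TaskConfig` and `ImageClassificationConfig` are names from the review overview, while the enums, every field, the `"cloud"` storage value, and the `describe` helper are assumptions made up for this example.

```python
from dataclasses import dataclass
from enum import Enum


class ImageFormat(str, Enum):
    # Hypothetical enum; the PR only says JPEG and Parquet variants exist.
    JPEG = "jpeg"
    PARQUET = "parquet"


class StorageLocation(str, Enum):
    # Hypothetical enum for the "data storage" variants ("cloud" is assumed).
    CLOUD = "cloud"
    LOCALFS = "localfs"


@dataclass
class TaskConfig:
    # Base class for per-task settings (field is an assumption).
    name: str


@dataclass
class ImageClassificationConfig(TaskConfig):
    # One config type covers all variants through its fields.
    name: str = "image_classification"
    image_format: ImageFormat = ImageFormat.JPEG
    storage: StorageLocation = StorageLocation.CLOUD


def describe(config: ImageClassificationConfig) -> str:
    """Render a variant as a readable label (illustrative helper)."""
    return f"{config.name}[{config.image_format.value}, {config.storage.value}]"


variant = ImageClassificationConfig(
    image_format=ImageFormat.PARQUET, storage=StorageLocation.LOCALFS
)
print(describe(variant))
```

With this shape, adding a variant means adding an enum member or a field value rather than registering another task.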

Signed-off-by: Justin Yu <justinvyu@anyscale.com>

Copilot

Pull Request Overview

This PR refactors the training ingest benchmark by introducing task-level configurations, consolidating and deduplicating image classification factories, and reorganizing where dataloader settings live.

Add TaskConfig and ImageClassificationConfig to centralize per-task settings and remove per-variant tasks.
Move batch‐size and row‐limit fields into DataLoaderConfig subclasses and update factories to call get_dataloader_config().
Deduplicate image‐classification factories (JPEG/Parquet) under a single ImageClassificationFactory using injected data_dirs.
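
The last two points can be pictured with a small sketch: one factory class serves both formats because its directories are passed in, and row limits are read from the dataloader config. This is only an illustration under assumptions. `DataLoaderConfig`, `ImageClassificationFactory`, `DatasetKey`, `data_dirs`, and `get_dataloader_config()` are names from this PR, but the enum members, the `limit_training_rows` / `limit_validation_rows` field names, the `describe_reads` helper, and the paths are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Optional


class DatasetKey(str, Enum):
    # Members are assumed; the PR only mentions the DatasetKey name.
    TRAIN = "train"
    VALID = "val"


@dataclass
class DataLoaderConfig:
    # Hypothetical field names standing in for the PR's `limit_*` fields.
    train_batch_size: int = 32
    limit_training_rows: Optional[int] = None
    limit_validation_rows: Optional[int] = None


class ImageClassificationFactory:
    """Single factory; data directories are injected, not hardcoded."""

    def __init__(
        self,
        dataloader_config: DataLoaderConfig,
        data_dirs: Dict[DatasetKey, str],
    ) -> None:
        self._dataloader_config = dataloader_config
        self._data_dirs = data_dirs

    def get_dataloader_config(self) -> DataLoaderConfig:
        return self._dataloader_config

    def describe_reads(self) -> Dict[DatasetKey, dict]:
        # Illustrative helper: pair each injected directory with its row limit.
        cfg = self.get_dataloader_config()
        limits = {
            DatasetKey.TRAIN: cfg.limit_training_rows,
            DatasetKey.VALID: cfg.limit_validation_rows,
        }
        return {
            key: {"path": path, "limit": limits[key]}
            for key, path in self._data_dirs.items()
        }


# The same class handles a different format by swapping the injected paths.
jpeg_factory = ImageClassificationFactory(
    DataLoaderConfig(limit_training_rows=1000),
    {DatasetKey.TRAIN: "/data/jpeg/train", DatasetKey.VALID: "/data/jpeg/val"},
)
print(jpeg_factory.describe_reads())
```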

Reviewed Changes

Copilot reviewed 22 out of 22 changed files in this pull request and generated 3 comments.

File	Description
release/train_tests/benchmark/runner.py	Pass `dataset_creation_time` into `get_metrics` and update imports
release/train_tests/benchmark/recsys/recsys_factory.py	Fix import path for `BenchmarkFactory`
release/train_tests/benchmark/ray_dataloader_factory.py	Add abstract `get_ray_datasets` and `get_ray_data_config`
release/train_tests/benchmark/image_classification/parquet/factory.py	Inject `data_dirs` and use `get_dataloader_config().limit_*`
release/train_tests/benchmark/image_classification/localfs_image_classification_jpeg/factory.py	Remove obsolete localfs‐JPEG factory
release/train_tests/benchmark/image_classification/localfs_image_classification_jpeg/__init__.py	Remove empty module docstring
release/train_tests/benchmark/image_classification/jpeg/factory.py	Inject `data_dirs`, remove hardcoded dirs, and use limits
release/train_tests/benchmark/image_classification/imagenet.py	Add `IMAGENET_LOCALFS_SPLIT_DIRS` and import `DatasetKey`
release/train_tests/benchmark/image_classification/factory.py	New unified `ImageClassificationFactory` and helper `get_imagenet_data_dirs`
release/train_tests/benchmark/dataloader_factory.py	Remove unused stub methods
release/train_tests/benchmark/config.py	Add `TaskConfig` types and move row‐limit fields to `DataLoaderConfig`
release/train_tests/benchmark/benchmark_factory.py	Remove deprecated dataset methods
release/release_tests.yaml	Update test scripts for new task and flag names
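
The `ray_dataloader_factory.py` row describes two abstract methods. A rough sketch of that pattern using Python's `abc` module follows. Only the method names `get_ray_datasets` and `get_ray_data_config` come from the table above; the class names, return types, and dictionary contents are placeholders, since the real signatures are not shown on this page.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict


class RayDataLoaderFactorySketch(ABC):
    """Stand-in base class; subclasses must supply both methods."""

    @abstractmethod
    def get_ray_datasets(self) -> Dict[str, Any]:
        """Return the datasets to train on, keyed by split name."""

    @abstractmethod
    def get_ray_data_config(self) -> Dict[str, Any]:
        """Return the Ray Data settings, grouped in one place."""


class InMemoryFactory(RayDataLoaderFactorySketch):
    # Toy subclass; plain lists stand in for real datasets.
    def get_ray_datasets(self) -> Dict[str, Any]:
        return {"train": [1, 2, 3], "val": [4]}

    def get_ray_data_config(self) -> Dict[str, Any]:
        return {"example_option": True}  # hypothetical setting


factory = InMemoryFactory()
print(sorted(factory.get_ray_datasets()))
```

Declaring the methods abstract means a subclass that forgets one of them fails at instantiation time instead of partway through a benchmark run.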

srinathk10

Nice. Thanks for the restructure.

justinvyu · 2025-06-18T00:43:26Z

release/train_tests/benchmark/train_benchmark.py

+        datasets = {}
+        data_config = None

-    factory.set_dataset_creation_time(time.perf_counter() - start_time)


I think the dataset creation time previously did not capture the actual Ray Dataset construction (in get_ray_datasets).

I updated the timer so that its range includes that call. I just want to double-check that this is accurate.

srinathk10

Ah, OK. Good catch! Hopefully the get_ray_datasets call is negligible (sub-second).
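
The point discussed in this thread can be sketched as follows: the timer has to start before dataset construction and stop after it, so that the `get_ray_datasets` call falls inside the measured range. This is a minimal illustration, not the code in `train_benchmark.py`; the `timed_dataset_creation` helper and the fake dataset builder are hypothetical.

```python
import time
from typing import Any, Callable, Dict, Tuple


def timed_dataset_creation(
    get_ray_datasets: Callable[[], Dict[str, Any]],
) -> Tuple[Dict[str, Any], float]:
    """Build the datasets and return them with the elapsed seconds."""
    start_time = time.perf_counter()
    datasets = get_ray_datasets()  # construction happens inside the timed range
    dataset_creation_time = time.perf_counter() - start_time
    return datasets, dataset_creation_time


def fake_get_ray_datasets() -> Dict[str, Any]:
    # Stand-in for real dataset construction; sleeps to simulate work.
    time.sleep(0.02)
    return {"train": [1, 2, 3]}


datasets, elapsed = timed_dataset_creation(fake_get_ray_datasets)
print(f"dataset creation took {elapsed:.3f}s")
```

If the timer stopped before the call, the reported value would cover only setup work and miss the construction cost entirely.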

justinvyu added 24 commits June 9, 2025 16:20

move some data context setting to a better place

df9259e

Signed-off-by: Justin Yu <justinvyu@anyscale.com>

move ray data config

ded57c2

Signed-off-by: Justin Yu <justinvyu@anyscale.com>

remove unused train report

b05a4ae

Signed-off-by: Justin Yu <justinvyu@anyscale.com>

move data configs to ray data config

f203030

Signed-off-by: Justin Yu <justinvyu@anyscale.com>

move dataset limit values to dl config

3725c65

Signed-off-by: Justin Yu <justinvyu@anyscale.com>

add img clf factory skeleton

1104f2c

Signed-off-by: Justin Yu <justinvyu@anyscale.com>

add task config

63bc3bb

Signed-off-by: Justin Yu <justinvyu@anyscale.com>

implement the task config routing for img clf ray data

133b6ca

Signed-off-by: Justin Yu <justinvyu@anyscale.com>

finish for torch dl variants

2e5638b

Signed-off-by: Justin Yu <justinvyu@anyscale.com>

remove unused redundant factories

1a45c12

Signed-off-by: Justin Yu <justinvyu@anyscale.com>

remove localfs task special case

474ec94

Signed-off-by: Justin Yu <justinvyu@anyscale.com>

docstring

5d98b58

Signed-off-by: Justin Yu <justinvyu@anyscale.com>

remove unused method

c0f3b9a

Signed-off-by: Justin Yu <justinvyu@anyscale.com>

remove nit

04fa84d

Signed-off-by: Justin Yu <justinvyu@anyscale.com>

update release test commands

44c6495

Signed-off-by: Justin Yu <justinvyu@anyscale.com>

pt 2

e16343f

Signed-off-by: Justin Yu <justinvyu@anyscale.com>

fix impl of dataset creation time

fd8e8a5

Signed-off-by: Justin Yu <justinvyu@anyscale.com>

Revert "fix impl of dataset creation time"

800226b

This reverts commit fd8e8a5. Signed-off-by: Justin Yu <justinvyu@anyscale.com>

move dataset creation time impl out of factory

b5ea4d1

Signed-off-by: Justin Yu <justinvyu@anyscale.com>

rename some folders

34950cc

Signed-off-by: Justin Yu <justinvyu@anyscale.com>

move jpeg ds download bash script

ddfa446

Signed-off-by: Justin Yu <justinvyu@anyscale.com>

rename some things

1ecb12a

Signed-off-by: Justin Yu <justinvyu@anyscale.com>

fix

e44b156

Signed-off-by: Justin Yu <justinvyu@anyscale.com>

rename factory -> benchmark_factory

a3b75d0

Signed-off-by: Justin Yu <justinvyu@anyscale.com>

justinvyu requested review from Copilot and srinathk10 and removed request for Copilot June 10, 2025 01:08

Copilot AI reviewed Jun 10, 2025

release/train_tests/benchmark/image_classification/parquet/factory.py (resolved)

release/release_tests.yaml (resolved)

release/train_tests/benchmark/ray_dataloader_factory.py (resolved)

justinvyu added 2 commits June 10, 2025 10:41

Merge branch 'master' of https://github.com/ray-project/ray into benchmark_minimal_cleanup

835f18d

fix

2ed61dd

Signed-off-by: Justin Yu <justinvyu@anyscale.com>

srinathk10 approved these changes Jun 18, 2025

justinvyu commented Jun 18, 2025

justinvyu enabled auto-merge (squash) June 18, 2025 23:22

github-actions bot added the "go" label (add ONLY when ready to merge, run all tests) Jun 18, 2025

justinvyu merged commit 1957ce2 into ray-project:master Jun 19, 2025
7 checks passed

justinvyu deleted the benchmark_minimal_cleanup branch June 19, 2025 00:26

srinathk10 added a commit that referenced this pull request Jun 20, 2025

Revert "[train] Cleanups for training ingest benchmark (#53684)"

a2a81e1

This reverts commit 1957ce2.

srinathk10 mentioned this pull request Jun 20, 2025

Revert "[train] Cleanups for training ingest benchmark" #53979

Closed

justinvyu merged 26 commits into ray-project:master from justinvyu:benchmark_minimal_cleanup
