Skip to content

[data] split test_all_to_all.py#53865

Merged
can-anyscale merged 2 commits intomasterfrom
can-dsplit
Jun 17, 2025
Merged

[data] split test_all_to_all.py#53865
can-anyscale merged 2 commits intomasterfrom
can-dsplit

Conversation

@can-anyscale
Copy link
Copy Markdown
Contributor

@can-anyscale can-anyscale commented Jun 16, 2025

The test_all_to_all.py is taking a really long time to finish (35 minutes). I'm breaking this into smaller chunks so they run in parallel.

Test:

CI

@can-anyscale can-anyscale added the go add ONLY when ready to merge, run all tests label Jun 16, 2025
@can-anyscale can-anyscale marked this pull request as ready for review June 16, 2025 22:48
Copilot AI review requested due to automatic review settings June 16, 2025 22:48
@can-anyscale can-anyscale requested a review from a team as a code owner June 16, 2025 22:48
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR splits the monolithic test_all_to_all.py into several smaller end-to-end tests to speed up runtime by running tests in parallel. The changes include new test files for unique, repartition, random shuffle, map groups, and aggregation functionalities as well as updates in the BUILD file to reflect these changes.

  • Splitting tests for aggregation, unique, repartition, random shuffle, and group-by operations.
  • Updating the BUILD configuration to assign separate test targets.
  • Refactoring test logic for improved parallelism and resource-specific behavior.

Reviewed Changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
python/ray/data/tests/test_unique_e2e.py New tests for unique operation and null handling.
python/ray/data/tests/test_repartition_e2e.py New tests for repartition logic with and without shuffling.
python/ray/data/tests/test_random_e2e.py New tests for random shuffle operations and order verification.
python/ray/data/tests/test_map_groups_e2e.py New tests for groupby map operations using GPUs and actors.
python/ray/data/tests/test_agg_e2e.py New tests for aggregation logic and error validation.
python/ray/data/BUILD Updated BUILD configuration to reflect split test targets.

Copy link

Copilot AI Jun 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nitpick] Consider adding an explicit assert statement here (e.g., assert ds.schema().names == ["a", "b"]) to validate that the schema is as expected.

Suggested change
ds.schema().names == ["a", "b"]
assert ds.schema().names == ["a", "b"], "Schema does not match the expected structure ['a', 'b']"

Copilot uses AI. Check for mistakes.
Copy link

Copilot AI Jun 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nitpick] When using 'len(keys)' inside check_init, consider explicitly checking the type of 'keys' (e.g. whether it is a list or a string) to make the intent clearer and avoid potential confusion.

Suggested change
def check_init(k):
def check_init(k):
if not isinstance(keys, (list, str)):
raise TypeError(f"'keys' must be a list or a string, but got {type(keys).__name__}")

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Member

@bveeramani bveeramani left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ty

Signed-off-by: can <can@anyscale.com>
@can-anyscale can-anyscale enabled auto-merge (squash) June 17, 2025 22:17
@github-actions github-actions bot disabled auto-merge June 17, 2025 22:17
@can-anyscale can-anyscale merged commit efd1c15 into master Jun 17, 2025
6 checks passed
@can-anyscale can-anyscale deleted the can-dsplit branch June 17, 2025 23:23
elliot-barn pushed a commit that referenced this pull request Jun 18, 2025
The test_all_to_all.py is taking a really long time to finish ([35
minutes](https://buildkite.com/ray-project/postmerge/builds/10883/steps/canvas?jid=0197797e-9c25-4030-a1a6-ff89dba44f8e#0197797e-9c25-4030-a1a6-ff89dba44f8e/176-1023)).
I'm breaking this into smaller chunks so they run in parallel.

Test:

CI

Signed-off-by: can <can@anyscale.com>
Signed-off-by: elliot-barn <elliot.barnwell@anyscale.com>
minerharry pushed a commit to minerharry/ray that referenced this pull request Jun 27, 2025
The test_all_to_all.py is taking a really long time to finish ([35
minutes](https://buildkite.com/ray-project/postmerge/builds/10883/steps/canvas?jid=0197797e-9c25-4030-a1a6-ff89dba44f8e#0197797e-9c25-4030-a1a6-ff89dba44f8e/176-1023)).
I'm breaking this into smaller chunks so they run in parallel.

Test:

CI

Signed-off-by: can <can@anyscale.com>
elliot-barn pushed a commit that referenced this pull request Jul 2, 2025
The test_all_to_all.py is taking a really long time to finish ([35
minutes](https://buildkite.com/ray-project/postmerge/builds/10883/steps/canvas?jid=0197797e-9c25-4030-a1a6-ff89dba44f8e#0197797e-9c25-4030-a1a6-ff89dba44f8e/176-1023)).
I'm breaking this into smaller chunks so they run in parallel.

Test:

CI

Signed-off-by: can <can@anyscale.com>
Signed-off-by: elliot-barn <elliot.barnwell@anyscale.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

go add ONLY when ready to merge, run all tests

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants