[Data][Train] Remove top-level ray.data imports to decouple Ray Train from Ray Data #60152

@bveeramani

Description

Remove all top-level imports of ray.data from the ray.train module. Imports needed only for type annotations should be guarded behind if TYPE_CHECKING:. Imports needed at runtime should be moved inline (lazy imports within functions/methods).

Background

Ray Train currently has direct top-level imports of ray.data scattered throughout the codebase. These imports create a hard coupling between the two modules: importing ray.train transitively imports ray.data, even when no Ray Data functionality is used.

The TYPE_CHECKING pattern allows type annotations to reference external types without runtime imports:

from typing import TYPE_CHECKING

if TYPE_CHECKING:
    from ray.data import Dataset

def my_function(ds: "Dataset") -> None:  # Note: string annotation
    ...

For runtime usage, lazy/inline imports defer the import until the code path is actually executed:

def my_function():
    from ray.data import Dataset  # Only imported when function is called
    ...
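
The two patterns are often combined in one module: the guarded import serves the annotations, and the inline import serves the code path that actually constructs or calls into the other package. The sketch below is a minimal illustration under stated assumptions, not the Ray Train implementation. `fake_data` is a hypothetical stand-in module written to a temporary directory so the example runs without Ray installed, and `count_rows` and `make_dataset` are made-up names.

```python
import sys
import tempfile
from pathlib import Path
from typing import TYPE_CHECKING

# Hypothetical stand-in for ray.data: a tiny module written to a temp
# directory so the example runs without Ray installed.
_tmp_dir = tempfile.mkdtemp()
Path(_tmp_dir, "fake_data.py").write_text(
    "class Dataset:\n"
    "    def __init__(self, rows):\n"
    "        self.rows = list(rows)\n"
)
sys.path.insert(0, _tmp_dir)

if TYPE_CHECKING:
    # Seen only by static type checkers; never executed at runtime.
    from fake_data import Dataset


def count_rows(ds: "Dataset") -> int:
    # Annotation only: the quoted name needs no runtime import.
    return len(ds.rows)


def make_dataset(rows) -> "Dataset":
    # Runtime usage: the import runs on the first call, not at module load.
    from fake_data import Dataset

    return Dataset(rows)


loaded_before = "fake_data" in sys.modules  # module not loaded yet
ds = make_dataset([1, 2, 3])
loaded_after = "fake_data" in sys.modules   # loaded by the first call
```

After the first call, Python caches the module in `sys.modules`, so repeated calls to `make_dataset` pay only a dictionary lookup for the inline import.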

Motivation

We want to stop running all Ray Train tests on Ray Data premerge. To ensure Ray Train tests (excluding Data integration tests) don't actually depend on Ray Data, we need to be able to run them without the ray.data source code present. Removing top-level imports makes this dependency explicit and enforceable by the system. Ray Core did something similar to decouple its modules.
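
One way to check the decoupling mechanically is to import a module in a fresh interpreter and inspect `sys.modules` afterwards. The helper below is a hypothetical sketch, not an existing Ray utility; the name `imports_transitively` is an assumption made for illustration. It only detects import-time loading, which is a weaker check than running the tests with the ray.data source code absent.

```python
import subprocess
import sys


def imports_transitively(module: str, forbidden: str) -> bool:
    """Return True if importing `module` in a fresh interpreter also loads
    `forbidden` or any of its submodules."""
    probe = (
        "import importlib, sys\n"
        f"importlib.import_module({module!r})\n"
        f"prefix = {forbidden!r} + '.'\n"
        f"loaded = any(m == {forbidden!r} or m.startswith(prefix)\n"
        "             for m in list(sys.modules))\n"
        "print('LOADED' if loaded else 'CLEAN')\n"
    )
    # check=True raises CalledProcessError if the import itself fails.
    result = subprocess.run(
        [sys.executable, "-c", probe],
        capture_output=True,
        text=True,
        check=True,
    )
    # Read only the last line, in case the imported module prints on import.
    return result.stdout.strip().splitlines()[-1] == "LOADED"
```

With Ray installed, the intended call would be `imports_transitively("ray.train", "ray.data")`, which should return False once the top-level imports are gone.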

Implementation Boundaries & Constraints

Files requiring changes (source files only, excludes tests):

  • python/ray/train/**/*.py

Key considerations:

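  • When moving an import under TYPE_CHECKING, quote every annotation that uses it so it becomes a forward reference: def foo(ds: "Dataset") instead of def foo(ds: Dataset). An unquoted annotation is evaluated when the function is defined and raises NameError, because the name is never bound at runtime.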
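
To locate the imports that need to move, the module-level statements of each file can be scanned with the standard `ast` module. The function below is a rough sketch under stated assumptions, not a tool from the Ray repository; `find_top_level_imports` is a hypothetical name. It inspects only direct children of the module body, so imports inside `if TYPE_CHECKING:` blocks and function bodies are ignored, and so are imports nested in other module-level blocks such as `try:`.

```python
import ast


def find_top_level_imports(source: str, package: str = "ray.data") -> list[int]:
    """Return line numbers of module-level imports of `package`."""

    def matches(name: str) -> bool:
        return name == package or name.startswith(package + ".")

    hits = []
    for node in ast.parse(source).body:  # direct module-level statements only
        if isinstance(node, ast.Import):
            # e.g. `import ray.data` or `import ray.data.dataset`
            if any(matches(alias.name) for alias in node.names):
                hits.append(node.lineno)
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            # e.g. `from ray.data import Dataset` or `from ray import data`
            if matches(node.module) or any(
                matches(f"{node.module}.{alias.name}") for alias in node.names
            ):
                hits.append(node.lineno)
    return hits


sample = '''\
import ray.data
from typing import TYPE_CHECKING
from ray import data as rd

if TYPE_CHECKING:
    from ray.data import Dataset

def f():
    from ray.data import Dataset
'''

print(find_top_level_imports(sample))  # [1, 3]
```

Running this over each file matched by python/ray/train/**/*.py (for example with `pathlib.Path.rglob("*.py")`) would give a list of lines to review.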

Labels

  • data: Ray Data-related issues
  • tech-debt: The issue that's due to tech debt
  • train: Ray Train Related Issue
