[Data] Remove deprecated read_parquet_bulk #58969

@bveeramani

Description

What should be done?

Remove the deprecated `read_parquet_bulk` API and its associated `ParquetBulkDatasource` class. The API was deprecated with a warning indicating removal after May 2025, and that deadline has passed.

Why?

The `read_parquet_bulk` function was deprecated in favor of `read_parquet`, which now handles all the same use cases. Keeping deprecated code around creates a maintenance burden and confuses users.

The deprecation warning in the code states:

read_parquet_bulk is deprecated and will be removed after May 2025. Use read_parquet instead.

Files to modify

 1. **`python/ray/data/read_api.py`**
    - Remove the `read_parquet_bulk` function (lines ~1227-1373)
    - Remove any related imports (e.g., `ParquetBulkDatasource`)

 2. **`python/ray/data/__init__.py`**
    - Remove `read_parquet_bulk` from the import statement (line 69)
    - Remove `read_parquet_bulk` from the `__all__` list (line 178)

 3. **`python/ray/data/_internal/datasource/parquet_bulk_datasource.py`**
    - Delete this entire file

 4. **`python/ray/data/tests/test_parquet.py`**
    - Remove tests that use `read_parquet_bulk` (search for `read_parquet_bulk` in the file)

 5. **`doc/source/data/api/input_output.rst`**
    - Remove `read_parquet_bulk` from the Parquet section (line 35)
    
 6. **`python/ray/data/_internal/datasource/parquet_datasource.py`**
    - Update the docstring that mentions `ParquetBulkDatasource` (line ~284)

 7. Any other references to `read_parquet_bulk` or `ParquetBulkDatasource` that this list missed.
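
One way to catch leftover references is a small scan of the checkout. The following is a minimal sketch, not part of the Ray tooling: the helper name `find_references` and the set of file suffixes scanned are assumptions chosen for illustration, and only the two identifiers come from this issue.

```python
import re
from pathlib import Path

# The two identifiers named in this issue.
PATTERN = re.compile(r"read_parquet_bulk|ParquetBulkDatasource")
# Assumption: only source and documentation files are worth scanning.
SUFFIXES = {".py", ".rst", ".md"}


def find_references(root):
    """Return (path, line_number, line) for each line under root that mentions a removed name."""
    hits = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix not in SUFFIXES:
            continue
        text = path.read_text(encoding="utf-8", errors="ignore")
        for lineno, line in enumerate(text.splitlines(), start=1):
            if PATTERN.search(line):
                hits.append((str(path), lineno, line.strip()))
    return hits
```

Calling `find_references("python/ray/data")` and `find_references("doc/source")` from the repository root should return an empty list once the removal is complete.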

Verification

After making changes:
1. Run pre-commit to verify linting passes
2. Ensure the module still imports correctly: `python -c "import ray.data"`
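
The import check could be extended to confirm that the symbol is actually gone. This is a hedged sketch: `is_exported` is a hypothetical helper, not an existing Ray or pre-commit utility, and it relies only on the standard library.

```python
import importlib


def is_exported(module_name, symbol):
    """Return True if the module still exposes symbol as an attribute or lists it in __all__."""
    module = importlib.import_module(module_name)
    in_all = symbol in getattr(module, "__all__", ())
    return hasattr(module, symbol) or in_all
```

After the change, `is_exported("ray.data", "read_parquet_bulk")` should be `False`, while `is_exported("ray.data", "read_parquet")` should remain `True`.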

Metadata

Assignees

No one assigned

    Labels

    data (Ray Data-related issues), enhancement (Request for new feature and/or capability)
