
[data] Remove dead FastFileMetadataProvider code #59027

Merged
bveeramani merged 3 commits into ray-project:master from
rushikeshadhav:remove-fast-file-metadata-provider
Dec 28, 2025

Conversation

@rushikeshadhav
Contributor

Description

After removing the deprecated read_parquet_bulk API, FastFileMetadataProvider became dead code with no remaining usage in the codebase.

This commit removes:

  • FastFileMetadataProvider class implementation
  • All imports and exports of FastFileMetadataProvider
  • Tests that specifically tested FastFileMetadataProvider
  • Documentation references to FastFileMetadataProvider
  • Code comments mentioning FastFileMetadataProvider
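One way to check that nothing in the tree still mentions the class is a plain text search over source and documentation files. The following is a minimal sketch using only the standard library; the helper name `find_references` and the default extension list are assumptions for illustration, not part of this PR or of Ray.

```python
import os


def find_references(root, symbol, extensions=(".py", ".rst", ".md")):
    """Return (path, line_number, line) for every line under root that mentions symbol."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            # Only scan file types likely to hold code, docs, or comments.
            if not name.endswith(extensions):
                continue
            path = os.path.join(dirpath, name)
            with open(path, encoding="utf-8", errors="replace") as f:
                for lineno, line in enumerate(f, start=1):
                    if symbol in line:
                        hits.append((path, lineno, line.strip()))
    return hits
```

Calling `find_references("python/ray/data", "FastFileMetadataProvider")` from a checkout should return an empty list once the removal is complete, assuming the search root covers every directory where the class was referenced.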

Related issues

Fixes #59010

@rushikeshadhav rushikeshadhav requested a review from a team as a code owner November 27, 2025 05:48
Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request does a great job of removing the dead code for FastFileMetadataProvider. The changes are clean and cover all necessary areas including implementation, documentation, and tests. I have one suggestion to improve a test that becomes a bit confusing after the code removal.

@ray-gardener bot added the docs, data, and community-contribution labels Nov 27, 2025
@owenowenisme
Member

Thanks for your contribution!
Left some comments that need to be addressed, thanks!

@omatthew98 added the @external-author-action-required label Dec 3, 2025
@bveeramani
Member

Hey @rushikeshadhav, just checking in on this when you get a chance

@rushikeshadhav rushikeshadhav force-pushed the remove-fast-file-metadata-provider branch from d7013d6 to 079fdc7 Compare December 9, 2025 17:11

@cursor cursor bot left a comment


Bug: Test execution code removed instead of replaced with alternative (Bugbot Rules)

The test_read_binary_meta_provider test function has its main positive test case removed entirely. The code that reads binary files with FastFileMetadataProvider and asserts the result was deleted, leaving only the setup code and the negative pytest.raises test. Per reviewer comment, this functionality needs to be preserved using DefaultFileMetadataProvider() instead of simply deleting the test execution.

python/ray/data/tests/test_binary.py#L88-L89

snappy.stream_compress(bytes, f)



@rushikeshadhav
Contributor Author

@owenowenisme, I have made the required changes. Please check.

@rushikeshadhav
Contributor Author

> Hey @rushikeshadhav, just checking in on this when you get a chance

Thanks for the heads up, I have resolved the comments

@github-actions

This pull request has been automatically marked as stale because it has not had
any activity for 14 days. It will be closed in another 14 days if no further activity occurs.
Thank you for your contributions.

You can always ask for help on our discussion forum or Ray's public slack channel.

If you'd like to keep this open, just leave any comment, and the stale label will be removed.

@github-actions bot added the stale label Dec 24, 2025
After removing the deprecated read_parquet_bulk API, FastFileMetadataProvider
became dead code with no remaining usage in the codebase.

This commit removes:
- FastFileMetadataProvider class implementation
- All imports and exports of FastFileMetadataProvider
- Tests that specifically tested FastFileMetadataProvider
- Documentation references to FastFileMetadataProvider
- Code comments mentioning FastFileMetadataProvider

The class was previously used as a faster alternative to DefaultFileMetadataProvider
that skipped directory expansion and file size collection, but it's no longer needed
after the bulk API removal.
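The trade-off described above can be illustrated with two small path-listing helpers. This is a sketch for local filesystems only, under the assumption that "directory expansion" means walking a directory into its files and "file size collection" means one size lookup per file. The function names are hypothetical and this is not Ray's implementation of either provider.

```python
import os


def list_with_metadata(paths):
    """Expand directories into files and record each file's size (thorough but slower)."""
    results = []
    for path in paths:
        if os.path.isdir(path):
            for dirpath, _dirnames, filenames in os.walk(path):
                for name in sorted(filenames):
                    file_path = os.path.join(dirpath, name)
                    results.append((file_path, os.path.getsize(file_path)))
        else:
            results.append((path, os.path.getsize(path)))
    return results


def list_without_metadata(paths):
    """Pass paths through untouched: no directory expansion and no size lookup."""
    return [(path, None) for path in paths]
```

The second helper does no filesystem calls at all, which is where the speedup in such a shortcut would come from; the cost is that callers get neither per-file paths for directories nor sizes to plan work with.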

Signed-off-by: rushikesh.adhav <adhavrushikesh6@gmail.com>
Signed-off-by: Rushikesh Adhav <adhavrushikesh6@gmail.com>
Signed-off-by: Rushikesh Adhav <adhavrushikesh6@gmail.com>
@rushikeshadhav rushikeshadhav force-pushed the remove-fast-file-metadata-provider branch from 6f79618 to 63e2a8f Compare December 27, 2025 08:32
@owenowenisme added the go label and removed the stale label Dec 27, 2025
Member

@owenowenisme owenowenisme left a comment


LGTM

@bveeramani bveeramani merged commit 639451f into ray-project:master Dec 28, 2025
7 checks passed
AYou0207 pushed a commit to AYou0207/ray that referenced this pull request Jan 13, 2026
Signed-off-by: jasonwrwang <jasonwrwang@tencent.com>
bveeramani added a commit that referenced this pull request Jan 19, 2026
…es (#60084) (#60091)

Categorize APIs into Public APIs and Developer APIs, and sort them
alphabetically by service name.

Changes:
- Reorganized loading_data.rst and saving_data.rst with Public APIs
first, then Developer APIs
- Sorted all APIs alphabetically by service name within each section
- Sections that originally had APIs for both Public and Developer APIs
were divided to respective sections
- Removed datasource.FastFileMetadataProvider API that has been removed
([reference](#59027))

Fixes #60084

Signed-off-by: mgchoi239 <mg.choi.239@gmail.com>

---------

Signed-off-by: mgchoi239 <mg.choi.239@gmail.com>
Signed-off-by: Balaji Veeramani <bveeramani@berkeley.edu>
Co-authored-by: mgchoi239 <mg.choi.239@gmail.com>
Co-authored-by: Balaji Veeramani <bveeramani@berkeley.edu>
jinbum-kim pushed a commit to jinbum-kim/ray that referenced this pull request Jan 29, 2026
Signed-off-by: jinbum-kim <jinbum9958@gmail.com>
lee1258561 pushed a commit to pinterest/ray that referenced this pull request Feb 3, 2026
ryanaoleary pushed a commit to ryanaoleary/ray that referenced this pull request Feb 3, 2026
peterxcli pushed a commit to peterxcli/ray that referenced this pull request Feb 25, 2026
Signed-off-by: peterxcli <peterxcli@gmail.com>
peterxcli pushed a commit to peterxcli/ray that referenced this pull request Feb 25, 2026
Signed-off-by: peterxcli <peterxcli@gmail.com>

Labels

community-contribution, data, docs, @external-author-action-required, go

Development

Successfully merging this pull request may close these issues.

[Data] Remove dead FastFileMetadataProvider

4 participants