Skip to content

Conversation

@corwinjoy
Copy link
Contributor

Which issue does this PR close?

Closes apache/arrow-rs-object-store#24.

Rationale for this change

Performance enhancement needed for network file system as described in the PR.

What changes are included in this PR?

Add an implementation LocalFileSystem::list_with_offset so that it can intelligently scan files and drastically reduce the number of statx system calls. Basically, we add a prefilter to pull only what is needed in-place.

Are there any user-facing changes?

None.

@corwinjoy
Copy link
Contributor Author

The initial work for this PR was done by @dplasto, who reported the issue and provided an initial implementation. Further suggestions were provided by @adamreeve .

@tustvold
Copy link
Contributor

It would appear this is legitimately failing CI

@alamb alamb marked this pull request as draft January 25, 2025 17:13
@alamb
Copy link
Contributor

alamb commented Jan 25, 2025

Thanks @corwinjoy and @tustvold -- marking this as a draft as we work through the test failures

@corwinjoy
Copy link
Contributor Author

corwinjoy commented Jan 25, 2025 via email

@etseidl
Copy link
Contributor

etseidl commented Jan 26, 2025

I'm not sure what is happening with the CI. It seems to be failing formatting. But, I ran 'cargo format' and clippy as specified in Contributing.md so I am not sure what is wrong.

Are you running from the arrow-rs directory or the object_store directory. When I run cargo fmt --all -- --check from the object_store directory I get similar errors to CI.

@corwinjoy
Copy link
Contributor Author

corwinjoy commented Jan 26, 2025

Ah, thanks for the clarification @etseidl! From the guidelines I didn't realize that I had to run these commands in the object-store rather than the arrow-rs directory. Doing that, I do get suggested changes which I have applied. Hopefully this will clear the CI checks.

@alamb
Copy link
Contributor

alamb commented Jan 27, 2025

Ah, thanks for the clarification @etseidl! From the guidelines I didn't realize that I had to run these commands in the object-store rather than the arrow-rs directory. Doing that, I do get suggested changes which I have applied. Hopefully this will clear the CI checks.

The emulator tests appear to be failing: https://github.com/apache/arrow-rs/actions/runs/12970828706/job/36180799417?pr=7019

---- local::tests::test_path_with_offset stdout ----
thread 'local::tests::test_path_with_offset' panicked at src/local.rs:1440:66:
called `Result::unwrap()` on an `Err` value: Generic { store: "LocalFileSystem", source: UnableToCanonicalize { path: "../testing/data/arrow-ipc-file", source: Os { code: 2, kind: NotFound, message: "No such file or directory" } } }
stack backtrace:

…or testing rather than pointing to existing dir.
@corwinjoy
Copy link
Contributor Author

OK. I have changed this test to use a temporary test and files which should make it more portable. We will see what the emulator says about this.

@alamb
Copy link
Contributor

alamb commented Jan 29, 2025

OK. I have changed this test to use a temporary test and files which should make it more portable. We will see what the emulator says about this.

Looks like the tests are good

@alamb
Copy link
Contributor

alamb commented Jan 29, 2025

This PR looks ready for review to me. @corwinjoy shall we mark it as ready for review?

@corwinjoy
Copy link
Contributor Author

@alamb Yes please. Mark it as ready for review.

@alamb alamb marked this pull request as ready for review January 29, 2025 21:53
Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @corwinjoy -- I think this code is better than what is on main.

I am somewhat concerned that the listing with offset is going to result in non deterministic results but I don't think this PR makes that any better/worse than what is on main

Thanks again

None => config.root.to_file_path().unwrap(),
};

let walkdir = WalkDir::new(root_path)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did some digging and I don't think WalkDir guarantees anything about the sort order of the directory entries

thus if the OS returns a different order on the next call to list_with_offset this may end up with a different result

However I think the existing default impl of list_with_offset (which just calls list() and takes the offset) has the same problem so I don't think it needs to be fixed in this PR

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It does look to me like you can explicitly set an order via https://docs.rs/walkdir/latest/walkdir/struct.WalkDir.html#method.sort_by

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alamb I agree with all this and thanks for the helpful comments! I think your concern about not having a deterministic order is a good one and agree this is the existing behavior. I also think we should change this, but propose we do that in a follow-up PR since it does change the behavior and could (potentially) break an assumption somewhere. I think we can just use https://docs.rs/walkdir/latest/walkdir/struct.WalkDir.html#method.sort_by_file_name
which should be fine and I believe matches what the offset filter expects.

.map(|x| String::from(x.filename().unwrap()))
.collect();

// Check result with direct filesystem read
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this depends on the file system returning the same order each time. Again this is probably ok but it seems like it could result in a non deterministic test failure over time

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, it does not depend on this because later, in line 1485 we sort both result sets before comparing. So the initial sort order that gets returned does not matter.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking that if the underlying directory list was different, then the values after skipping the first 10 (for example) might be different, even if we sorted the values that came out.

But since this doesn't seem to happen in practice I am not going to worry too much about it

Copy link
Contributor

@tustvold tustvold Feb 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've not had time to review this in detail but a couple of points that may or may not be relevant:

Or to phrase it either way, we shouldn't need to sort, nor should we rely on data being sorted

For context on why we don't guarantee any ordering, it is because many stores, e.g. S3 Express, don't

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense -- thank you

To be clear I don't think this PR changes anything about the object_store operation in the face of inconsistent ordering between calls (but I was speculating on what would happen if the ordering returned by the underlying systems was different between calls)

@alamb
Copy link
Contributor

alamb commented Feb 7, 2025

I'll plan to merge this PR in a day or two unless there are any more comments or anyone would like longer to review

@alamb alamb merged commit 5bf4a6c into apache:main Feb 8, 2025
14 checks passed
@alamb
Copy link
Contributor

alamb commented Feb 8, 2025

Thanks again @corwinjoy

phillipleblanc pushed a commit to spiceai/arrow-rs that referenced this pull request Feb 10, 2025
…he#7019)

* Initial change from Daniel.

* Upgrade unit test to be more generic.

* Add comments on why we have filter

* Cleanup unit tests.

* Update object_store/src/local.rs

Co-authored-by: Adam Reeve <adreeve@gmail.com>

* Add changes suggested by Adam.

* Cleanup match error.

* Apply formatting changes suggested by cargo +stable fmt --all.

* Apply cosmetic changes suggested by clippy.

* Upgrade test_path_with_offset to create temporary directory + files for testing rather than pointing to existing dir.

---------

Co-authored-by: Adam Reeve <adreeve@gmail.com>
alamb added a commit that referenced this pull request Feb 12, 2025
…e32 (#7074)

* Support converting large dates (i.e. +10999-12-31) from string to Date32

* Fix lint

* Update arrow-cast/src/parse.rs

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>

* fix: issue introduced in #6833 -  less than equal check for scale in decimal conversion (#7070)

* fix <= check for scale in decimal conversion

* Update arrow-cast/src/cast/mod.rs

name change

Co-authored-by: Arttu <Blizzara@users.noreply.github.com>

* remove incorrect comment

---------

Co-authored-by: Arttu <Blizzara@users.noreply.github.com>

* minor: re-export `OffsetBufferBuilder` in `arrow` crate (#7077)

* Add another decimal cast edge test case (#7078)

* Add another decimal cast edge test case

Before 1019f5b this test would fail, as
the cast produced 1. 0 is an edge case worth explicitly testing for.

* typo/fmt

Co-authored-by: Felipe Oliveira Carvalho <felipekde@gmail.com>

---------

Co-authored-by: Felipe Oliveira Carvalho <felipekde@gmail.com>

* Support both 0x01 and 0x02 as type for list of booleans in thrift metadata (#7052)

* Support both 0x01 and 0x02 as type for list of booleans

* Also support 0 for false inside boolean collections

* Use hex notation in tests

* Fix LocalFileSystem with range request that ends beyond end of file (#6751)

* Fix LocalFileSystem with range request that ends beyond end of file

* fix windows

* add comment

* Seek error

* fix seek check

* remove windows flag

* Get file length from file metadata

* Introduce `UnsafeFlag` to manage disabling `ArrayData` validation (#7027)

* Introduce UnsafeFlag to manage disabling validation

* fix docs

* Refactor arrow-ipc: Rename `ArrayReader` to `RecodeBatchDecoder` (#7028)

* Rename `ArrayReader` to `RecordBatchDecoder`

* Remove alias for `self`

* Minor: Update release schedule (#7086)

* Minor: Update release schedule

* realism

* Refactor some decimal-related code and tests (#7062)

* Refactor some decimal-related code and tests in preparation for adding Decimal32 and Decimal64 support

* Fixed symbol

* Apply PR feedback

* Fixed format problem

* Fixed logical merge conflicts

* PR feedback

* Refactor arrow-ipc: Move `create_*_array` methods into `RecordBatchDecoder` (#7029)

* Move `create_primitive_array` into RecordBatchReader

* Move `create_list-array` into RecordBatchReader

* Move `create_dictionay_array` into RecordBatchReader

* Print Parquet BasicTypeInfo id when present (#7094)

* Print Parquet BasicTypeInfo id when present

* Improve print_schema documentation

* tiny cleanup

* Add a custom implementation `LocalFileSystem::list_with_offset`  (#7019)

* Initial change from Daniel.

* Upgrade unit test to be more generic.

* Add comments on why we have filter

* Cleanup unit tests.

* Update object_store/src/local.rs

Co-authored-by: Adam Reeve <adreeve@gmail.com>

* Add changes suggested by Adam.

* Cleanup match error.

* Apply formatting changes suggested by cargo +stable fmt --all.

* Apply cosmetic changes suggested by clippy.

* Upgrade test_path_with_offset to create temporary directory + files for testing rather than pointing to existing dir.

---------

Co-authored-by: Adam Reeve <adreeve@gmail.com>

* fix: first none/empty list in `ListArray` panics in `cast_with_options` (#7065)

* fix: first none in `ListArray` panics in `cast_with_options`

* simplify

* fix

* Update arrow-cast/src/cast/list.rs

Co-authored-by: Jeffrey Vo <jeffrey.vo.australia@gmail.com>

---------

Co-authored-by: Jeffrey Vo <jeffrey.vo.australia@gmail.com>

* Benchmarks for Arrow IPC writer (#7090)

* Add benchmarks for Arrow IPC writer

* Add benchmarks for Arrow IPC writer

* reuse target buffer

* rename, etc

* Add compression type

* update

---------

Co-authored-by: Andy Grove <agrove@apache.org>

* Minor: Clarify documentation on `NullBufferBuilder::allocated_size` (#7089)

* Minor: Clarify documentaiton on NullBufferBuilder::allocated_size

* add note about why allocations are 64 bytes

* Add more tests for edge cases

* Add negative test case for incorrectly formatted large dates

---------

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
Co-authored-by: Himadri Pal <mehimu@gmail.com>
Co-authored-by: Arttu <Blizzara@users.noreply.github.com>
Co-authored-by: Piotr Findeisen <piotr.findeisen@gmail.com>
Co-authored-by: Felipe Oliveira Carvalho <felipekde@gmail.com>
Co-authored-by: Jörn Horstmann <git@jhorstmann.net>
Co-authored-by: Kyle Barron <kyle@developmentseed.org>
Co-authored-by: Curt Hagenlocher <curt@hagenlocher.org>
Co-authored-by: Devin Smith <devinsmith@deephaven.io>
Co-authored-by: Corwin Joy <corwin.joy@gmail.com>
Co-authored-by: Adam Reeve <adreeve@gmail.com>
Co-authored-by: irenjj <renj.jiang@gmail.com>
Co-authored-by: Jeffrey Vo <jeffrey.vo.australia@gmail.com>
Co-authored-by: Andy Grove <agrove@apache.org>
alamb pushed a commit to alamb/arrow-rs that referenced this pull request Mar 20, 2025
…he#7019)

* Initial change from Daniel.

* Upgrade unit test to be more generic.

* Add comments on why we have filter

* Cleanup unit tests.

* Update object_store/src/local.rs

Co-authored-by: Adam Reeve <adreeve@gmail.com>

* Add changes suggested by Adam.

* Cleanup match error.

* Apply formatting changes suggested by cargo +stable fmt --all.

* Apply cosmetic changes suggested by clippy.

* Upgrade test_path_with_offset to create temporary directory + files for testing rather than pointing to existing dir.

---------

Co-authored-by: Adam Reeve <adreeve@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

LocalFileSystem::list_with_offset is very slow over network file system

4 participants