GH-38618: [C++] S3FileSystem: fix regression in deleting explicitly created sub-directories#38845
Merged
jorisvandenbossche merged 7 commits intoapache:mainfrom Dec 5, 2023
Merged
Conversation
…itly created sub-directories
|
|
felipecrv
requested changes
Nov 22, 2023
Contributor
felipecrv
left a comment
There was a problem hiding this comment.
Thanks for investigating and fixing this!
cpp/src/arrow/filesystem/s3fs.cc
Outdated
| if (file_info.IsDirectory()) { | ||
| file_path = file_path + kSep; | ||
| } | ||
| file_paths.push_back(file_path); |
Contributor
There was a problem hiding this comment.
Now that file_path lives in a variable, you need to explicitly std::move it to avoid a copy when pushing:
file_paths.push_back(std::move(file_path));
felipecrv
approved these changes
Nov 23, 2023
felipecrv
approved these changes
Nov 24, 2023
raulcd
approved these changes
Dec 4, 2023
Member
raulcd
left a comment
There was a problem hiding this comment.
LGTM. I would like to merge this to be added to 14.0.2
pitrou
requested changes
Dec 4, 2023
pitrou
reviewed
Dec 5, 2023
Co-authored-by: Antoine Pitrou <pitrou@free.fr>
pitrou
reviewed
Dec 5, 2023
pitrou
approved these changes
Dec 5, 2023
felipecrv
approved these changes
Dec 5, 2023
|
After merging your PR, Conbench analyzed the 6 benchmarking runs that have been run so far on merge-commit cf80bd1. There were no benchmark performance regressions. 🎉 The full Conbench report has more details. It also includes information about 5 possible false positives for unstable benchmarks that are known to sometimes produce them. |
raulcd
pushed a commit
that referenced
this pull request
Dec 6, 2023
…reated sub-directories (#38845) ### Rationale for this change See #38618 (comment) and below for the analysis. When deleting the dir contents, we use a GetFileInfo with recursive FileSelector to list all objects to delete, but when doing that the file paths for directories don't end in a trailing `/`, so for deleting explicitly created directories we need to add the `kSep` here as well to properly delete the object. ### Are these changes tested? I tested them manually with an actual S3 bucket. The problem is that MinIO doesn't have the same problem, and so it's not actually tested with the test I added using our MinIO testing setup. ### Are there any user-facing changes? Fixes the regression * Closes: #38618 Lead-authored-by: Joris Van den Bossche <jorisvandenbossche@gmail.com> Co-authored-by: Antoine Pitrou <pitrou@free.fr> Signed-off-by: Joris Van den Bossche <jorisvandenbossche@gmail.com>
dgreiss
pushed a commit
to dgreiss/arrow
that referenced
this pull request
Feb 19, 2024
…itly created sub-directories (apache#38845) ### Rationale for this change See apache#38618 (comment) and below for the analysis. When deleting the dir contents, we use a GetFileInfo with recursive FileSelector to list all objects to delete, but when doing that the file paths for directories don't end in a trailing `/`, so for deleting explicitly created directories we need to add the `kSep` here as well to properly delete the object. ### Are these changes tested? I tested them manually with an actual S3 bucket. The problem is that MinIO doesn't have the same problem, and so it's not actually tested with the test I added using our MinIO testing setup. ### Are there any user-facing changes? Fixes the regression * Closes: apache#38618 Lead-authored-by: Joris Van den Bossche <jorisvandenbossche@gmail.com> Co-authored-by: Antoine Pitrou <pitrou@free.fr> Signed-off-by: Joris Van den Bossche <jorisvandenbossche@gmail.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Rationale for this change
See #38618 (comment) and below for the analysis. When deleting the dir contents, we use a GetFileInfo with recursive FileSelector to list all objects to delete, but when doing that the file paths for directories don't end in a trailing
/, so for deleting explicitly created directories we need to add thekSephere as well to properly delete the object.Are these changes tested?
I tested them manually with an actual S3 bucket. The problem is that MinIO doesn't have the same problem, and so it's not actually tested with the test I added using our MinIO testing setup.
Are there any user-facing changes?
Fixes the regression