@coryan coryan commented Dec 2, 2021

This PR implements `GcsFileSystem::CreateDir`. I also reviewed the class
documentation, since the implementation, particularly how directories are
emulated, is now more than just notes and ideas.
@coryan coryan marked this pull request as ready for review December 2, 2021 22:21
coryan commented Dec 4, 2021

The failure on "R / AMD64 Windows R RTools 40" seems unrelated. Please take a look.

@pitrou pitrou changed the title from "ARROW-14917: [C++] GcsFileSystem can create directories" to "ARROW-14917: [C++] Implement GcsFileSystem::CreateDir" Dec 7, 2021
@pitrou pitrou left a comment

Thank you very much. A couple of questions and suggestions below.

/// directory.
/// - The class creates marker objects for a directory, using a trailing slash in the
/// marker names. For debugging purposes, the metadata and contents of these marker
/// objects indicate that they are markers created by this class. The class does
Member

Isn't there any de facto standard for directory metadata markers?

Contributor Author

Kind of? The UI for Google Cloud creates empty objects ending with `/`. The command-line utility does no such thing (e.g. `gsutil cp -r deep-directory/ gs://my-bucket/foo` would not create these markers). The client libraries do not create them either. Whether that amounts to a de facto standard, I cannot say.

Frankly the use of "folder emulation" causes more harm than good. It creates the impression that some things should work (e.g. directory renames, directory permissions, efficient non-recursive listing) when they don't. And the markers are sort of useless. If you want to list objects non-recursively, the API has native support for including any matching prefixes in the results, without the need for these markers 🤷
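
To illustrate the point about non-recursive listing, here is a minimal sketch that models a prefix-and-delimiter listing over a flat set of object names. It is an in-memory model only, not the GCS API or the client library: `ListResult` and `ListOneLevel` are hypothetical names made up for this example.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical result type: objects found directly under the prefix, plus the
// directory-like prefixes found one level below it.
struct ListResult {
  std::vector<std::string> objects;
  std::set<std::string> prefixes;
};

// Models a non-recursive listing over a flat namespace of object names.
// No marker objects are required for the sub-prefixes to show up.
ListResult ListOneLevel(const std::set<std::string>& names,
                        const std::string& prefix,
                        const std::string& delimiter = "/") {
  assert(!delimiter.empty());
  ListResult result;
  for (const auto& name : names) {
    // Skip names that do not start with the requested prefix.
    if (name.compare(0, prefix.size(), prefix) != 0) continue;
    // Look for the delimiter in the part of the name after the prefix.
    const auto pos = name.find(delimiter, prefix.size());
    if (pos == std::string::npos) {
      result.objects.push_back(name);  // directly under the prefix
    } else {
      // Collapse everything below the first delimiter into one prefix entry.
      result.prefixes.insert(name.substr(0, pos + delimiter.size()));
    }
  }
  return result;
}
```

With the names `foo/a.txt`, `foo/bar/b.txt` and `foo/bar/baz/c.txt`, listing under `foo/` yields the object `foo/a.txt` and the prefix `foo/bar/`, even though no `foo/bar/` marker exists.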

Member

Hmm, sorry, my question was about the custom metadata you added to these markers.

Contributor Author

Ahhhhh. No, the UI does not create any metadata, but it ignores any metadata that is present too. I think the metadata is harmless and useful for debugging.
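
For readers following the thread, the marker idea can be sketched roughly as below. This is an in-memory illustration under stated assumptions, not the code merged in this PR: `FakeBucket`, `CreateDirMarkers`, and the `created-by` metadata key and value are all hypothetical, and the real implementation may handle parent directories differently.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical in-memory stand-in for a bucket: object name -> custom metadata.
using Metadata = std::map<std::string, std::string>;
using FakeBucket = std::map<std::string, Metadata>;

// Adds a marker entry (a name with a trailing slash) for `dir` and, when
// `recursive` is true, for each of its parents. Existing entries are left
// untouched. Returns the number of markers added.
int CreateDirMarkers(FakeBucket& bucket, std::string dir, bool recursive) {
  assert(!dir.empty());
  if (dir.back() != '/') dir.push_back('/');
  int created = 0;
  // Non-recursive: start the search at the final slash, so only `dir` itself
  // gets a marker. Recursive: visit every slash from the beginning.
  const std::string::size_type start = recursive ? 0 : dir.size() - 1;
  for (auto pos = dir.find('/', start); pos != std::string::npos;
       pos = dir.find('/', pos + 1)) {
    const std::string marker = dir.substr(0, pos + 1);
    // The metadata key and value are made up for illustration; they only tag
    // the entry as a marker to help with debugging.
    const bool inserted =
        bucket.emplace(marker, Metadata{{"created-by", "example-sketch"}}).second;
    if (inserted) ++created;
  }
  return created;
}
```

Calling `CreateDirMarkers(bucket, "a/b/c", true)` on an empty bucket adds `a/`, `a/b/` and `a/b/c/`; calling it again adds nothing.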

coryan commented Dec 7, 2021

Please take another look.

@pitrou pitrou closed this in 01186fc Dec 7, 2021
ursabot commented Dec 7, 2021

Benchmark runs are scheduled for baseline = 77722d9 and contender = 01186fc. 01186fc is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
Conbench compare runs links:
[Finished ⬇️0.0% ⬆️0.0%] ec2-t3-xlarge-us-east-2
[Scheduled] ursa-i9-9960x
[Scheduled] ursa-thinkcentre-m75q
Supported benchmarks:
ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python. Runs only benchmarks with cloud = True
ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java

@coryan coryan deleted the ARROW-14917-gcsfs-create-dir branch December 7, 2021 19:58
ursabot commented Dec 8, 2021

Benchmark runs are scheduled for baseline = 77722d9 and contender = 01186fc. 01186fc is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
Conbench compare runs links:
[Finished ⬇️0.0% ⬆️0.0%] ec2-t3-xlarge-us-east-2
[Failed ⬇️1.48% ⬆️0.0%] ursa-i9-9960x
[Finished ⬇️0.22% ⬆️0.09%] ursa-thinkcentre-m75q
Supported benchmarks:
ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python. Runs only benchmarks with cloud = True
ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java
