ARROW-18436: [C++] Ensure correct (un)escaping of special characters in URI paths#14974

Merged
pitrou merged 4 commits into apache:master from pitrou:ARROW-18436-uri-path-special-chars
Jan 3, 2023

Member

@pitrou pitrou commented Dec 15, 2022

No description provided.

Member Author

pitrou commented Dec 15, 2022

@westonpace @vibhatha This may affect Substrait, though it should be for the better.

Member Author

pitrou commented Dec 15, 2022

@jorisvandenbossche @AlenkaF This sanitizes the filesystem_from_uri behavior (see unit tests).
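
For readers unfamiliar with what "unescaping" a URI path involves, here is a minimal sketch of percent-decoding. It is not the code from this PR: `HexValue` and `PercentDecode` are hypothetical names chosen for illustration, and the sketch assumes only that `%XX` sequences in the path stand for the byte with hexadecimal value `XX`.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>

// Hypothetical helper: value of one hex digit, or -1 if `c` is not a hex digit.
int HexValue(char c) {
  if (c >= '0' && c <= '9') return c - '0';
  if (c >= 'a' && c <= 'f') return c - 'a' + 10;
  if (c >= 'A' && c <= 'F') return c - 'A' + 10;
  return -1;
}

// Hypothetical helper: decode "%XX" sequences in a URI path.
// Returns std::nullopt if an escape is truncated or contains non-hex digits.
std::optional<std::string> PercentDecode(std::string_view s) {
  std::string out;
  out.reserve(s.size());
  for (std::size_t i = 0; i < s.size(); ++i) {
    if (s[i] != '%') {
      out.push_back(s[i]);  // ordinary character, copied unchanged
      continue;
    }
    if (i + 2 >= s.size()) return std::nullopt;  // truncated escape
    const int hi = HexValue(s[i + 1]);
    const int lo = HexValue(s[i + 2]);
    if (hi < 0 || lo < 0) return std::nullopt;  // malformed escape
    out.push_back(static_cast<char>(hi * 16 + lo));
    i += 2;  // skip the two hex digits just consumed
  }
  return out;
}
```

With this sketch, a path such as `/tmp/my%20file%3F.txt` decodes to `/tmp/my file?.txt`, while a truncated escape such as `/bad%2` is rejected instead of being passed through silently.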

@pitrou pitrou requested a review from lidavidm December 15, 2022 16:05
@vibhatha
Contributor

> @westonpace @vibhatha This may affect Substrait, though it should be for the better.

Looks good to me. @westonpace, do you see any issues?

Member

@westonpace westonpace left a comment

I agree it's for the better, thanks. A few small thoughts.

Member

Minor nit: I think we've been preferring to pass string_view by value in Substrait, per https://quuxplusone.github.io/blog/2021/11/09/pass-string-view-by-value/ and @bkietz's advice.
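
A minimal sketch of the convention mentioned above, assuming C++17. `HasScheme` is a hypothetical function used only for illustration and does not appear in this PR; the point is the parameter type `std::string_view` rather than `const std::string_view&`.

```cpp
#include <cassert>
#include <string>
#include <string_view>

// Hypothetical helper, not part of Arrow. Both parameters are taken by value:
// a std::string_view is only a pointer plus a length, so copying it is cheap
// and avoids the extra indirection of a const reference.
bool HasScheme(std::string_view uri, std::string_view scheme) {
  assert(!scheme.empty());
  // The URI must be long enough to hold "<scheme>://".
  if (uri.size() < scheme.size() + 3) return false;
  return uri.substr(0, scheme.size()) == scheme &&
         uri.substr(scheme.size(), 3) == "://";
}
```

Callers can pass string literals, `std::string` objects, or other views without any change at the call site, for example `HasScheme(std::string("s3://bucket/key"), "s3")`.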

Member Author

Ah, good point, thanks.

Member Author

I opened #15171 for other instances of this.

Member

Minor nit: Maybe a comment explaining this math? Probably not needed if this is a normal Windows thing.

Member Author

This is simply as per the uriWindowsFilenameToUriStringA doc:
https://uriparser.github.io/doc/api/latest/Uri_8h.html#a422dc4a2b979ad380a4dfe007e3de845
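
The "math" in question can be sketched as follows, assuming the linked uriparser documentation states that the destination buffer must hold `8 + 3 * len(filename) + 1` characters for an absolute filename and `3 * len(filename) + 1` for a relative one. `WindowsFilenameToUriBufferSize` is a hypothetical helper written for illustration; it is not the code from this PR and does not call uriparser.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Hypothetical helper: worst-case size of the buffer to hand to
// uriWindowsFilenameToUriStringA, under the bound assumed in the lead-in.
//   8       room for the "file:///" prefix (absolute filenames only)
//   3 * n   each input character may expand to a "%XX" escape
//   1       terminating NUL character
constexpr std::size_t WindowsFilenameToUriBufferSize(std::string_view filename,
                                                     bool is_absolute) {
  const std::size_t prefix = is_absolute ? 8 : 0;
  return prefix + 3 * filename.size() + 1;
}
```

For the six-character absolute filename `C:\a b`, this gives 8 + 18 + 1 = 27 characters.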

@pitrou pitrou force-pushed the ARROW-18436-uri-path-special-chars branch from 4f620c5 to 91caabb Compare January 3, 2023 16:23
@pitrou pitrou merged commit ceec795 into apache:master Jan 3, 2023
@pitrou pitrou deleted the ARROW-18436-uri-path-special-chars branch January 3, 2023 17:24

ursabot commented Jan 4, 2023

Benchmark runs are scheduled for baseline = 793e5f6 and contender = ceec795. ceec795 is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
Conbench compare runs links:
[Finished ⬇️0.0% ⬆️0.0%] ec2-t3-xlarge-us-east-2
[Failed ⬇️0.56% ⬆️0.0%] test-mac-arm
[Finished ⬇️4.08% ⬆️0.0%] ursa-i9-9960x
[Failed ⬇️0.0% ⬆️0.0%] ursa-thinkcentre-m75q
Buildkite builds:
[Finished] ceec7950 ec2-t3-xlarge-us-east-2
[Failed] ceec7950 test-mac-arm
[Finished] ceec7950 ursa-i9-9960x
[Failed] ceec7950 ursa-thinkcentre-m75q
[Finished] 793e5f62 ec2-t3-xlarge-us-east-2
[Failed] 793e5f62 test-mac-arm
[Finished] 793e5f62 ursa-i9-9960x
[Failed] 793e5f62 ursa-thinkcentre-m75q
Supported benchmarks:
ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
test-mac-arm: Supported benchmark langs: C++, Python, R
ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java


ursabot commented Jan 4, 2023

['Python', 'R'] benchmarks have high level of regressions.
ursa-i9-9960x

EpsilonPrime pushed a commit to EpsilonPrime/arrow that referenced this pull request Jan 5, 2023
…in URI paths (apache#14974)

Authored-by: Antoine Pitrou <antoine@python.org>
Signed-off-by: Antoine Pitrou <antoine@python.org>

5 participants