Skip to content

GH-34248: [Python] Expose the order_by node#34654

Merged
jorisvandenbossche merged 2 commits intoapache:mainfrom
jorisvandenbossche:feature/GH-34248--add-order-by-node
Mar 22, 2023
Merged

GH-34248: [Python] Expose the order_by node#34654
jorisvandenbossche merged 2 commits intoapache:mainfrom
jorisvandenbossche:feature/GH-34248--add-order-by-node

Conversation

@jorisvandenbossche
Copy link
Copy Markdown
Member

@jorisvandenbossche jorisvandenbossche commented Mar 21, 2023

Adds Python bindings for the OrderByNode added in #34249

@jorisvandenbossche jorisvandenbossche force-pushed the feature/GH-34248--add-order-by-node branch from aa92b52 to 8672fc5 Compare March 21, 2023 09:12
@github-actions
Copy link
Copy Markdown

Copy link
Copy Markdown
Member

@AlenkaF AlenkaF left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is very cool =)
I have tried it locally, works great. Can't think of anything to add to the tests.

I am a bit unsure about the reason for the change in CSortKey class though. This class was only used in Select*, Rank* and SortOptions where, I guess, a field reference could have only been a string column name.

In OrderByNodeOptions it can also be an integer, or an expression. Then, I think, this should also be added in the docstrings for other classes that use sort_keys and add this cases to the tests? (maybe not as a part of this PR)

@jorisvandenbossche
Copy link
Copy Markdown
Member Author

I am a bit unsure about the reason for the change in CSortKey class though. This class was only used in Select*, Rank* and SortOptions where, I guess, a field reference could have only been a string column name.

Yeah, this was actually changed a while ago in C++, but the cython bindings exposing it as CSortKey(c_string name, ..) still worked fine because there is an implicit conversion from string to FieldRef possible.
For this PR however, I needed to create a SortKey from a FieldRef (to be consistent with the rest of the node options), and so I needed to update the constructor to CSortKey(CFieldRef ..), which also meant updating it in the other places.

Then, I think, this should also be added in the docstrings for other classes that use sort_keys and add this cases to the tests? (maybe not as a part of this PR)

Good point. Updated the docstrings and tests

@jorisvandenbossche jorisvandenbossche merged commit 532b9a5 into apache:main Mar 22, 2023
@jorisvandenbossche jorisvandenbossche deleted the feature/GH-34248--add-order-by-node branch March 22, 2023 13:57
@ursabot
Copy link
Copy Markdown

ursabot commented Mar 22, 2023

Benchmark runs are scheduled for baseline = 9288275 and contender = 532b9a5. 532b9a5 is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
Conbench compare runs links:
[Finished ⬇️0.0% ⬆️0.0%] ec2-t3-xlarge-us-east-2
[Finished ⬇️0.3% ⬆️0.0%] test-mac-arm
[Finished ⬇️0.0% ⬆️0.0%] ursa-i9-9960x
[Finished ⬇️0.13% ⬆️0.0%] ursa-thinkcentre-m75q
Buildkite builds:
[Finished] 532b9a57 ec2-t3-xlarge-us-east-2
[Finished] 532b9a57 test-mac-arm
[Finished] 532b9a57 ursa-i9-9960x
[Finished] 532b9a57 ursa-thinkcentre-m75q
[Finished] 92882751 ec2-t3-xlarge-us-east-2
[Finished] 92882751 test-mac-arm
[Finished] 92882751 ursa-i9-9960x
[Finished] 92882751 ursa-thinkcentre-m75q
Supported benchmarks:
ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
test-mac-arm: Supported benchmark langs: C++, Python, R
ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java

@ursabot
Copy link
Copy Markdown

ursabot commented Mar 22, 2023

['Python', 'R'] benchmarks have high level of regressions.
test-mac-arm

ArgusLi pushed a commit to Bit-Quill/arrow that referenced this pull request May 15, 2023
Adds Python bindings for the OrderByNode added in apache#34249 
* Closes: apache#34248

Authored-by: Joris Van den Bossche <jorisvandenbossche@gmail.com>
Signed-off-by: Joris Van den Bossche <jorisvandenbossche@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[C++] Add an order_by node which can reassign an ordering mid-plan

3 participants