Skip to content

GH-31880: [Python] Table.filter with expression now preserves order with use_threads=True#34766

Merged
jorisvandenbossche merged 2 commits intoapache:mainfrom
jorisvandenbossche:gh-31880
Apr 4, 2023
Merged

GH-31880: [Python] Table.filter with expression now preserves order with use_threads=True#34766
jorisvandenbossche merged 2 commits intoapache:mainfrom
jorisvandenbossche:gh-31880

Conversation

@jorisvandenbossche
Copy link
Copy Markdown
Member

@jorisvandenbossche jorisvandenbossche commented Mar 29, 2023

Rationale for this change

Thanks to #34137, the ExecPlan now has a concept of ordering. When the source node is a Table, the order of the batches in the table is used as the implicit order. And when executing a plan and producing a resulting Table, the default for QueryOptions' sequence_output is to honor an order if there is one.

Given that the Table.filter method only consists of a table source node (which adds implicit order) and a filter node (which preserves any ordering), the output will now always be ordered by default, also with the default of use_threads=True

Are these changes tested?

The existing test test_exec_plan.py::test_filter_table_ordering still passes.

@github-actions
Copy link
Copy Markdown

@jorisvandenbossche jorisvandenbossche merged commit c074e33 into apache:main Apr 4, 2023
@jorisvandenbossche jorisvandenbossche deleted the gh-31880 branch April 4, 2023 12:32
@ursabot
Copy link
Copy Markdown

ursabot commented Apr 4, 2023

Benchmark runs are scheduled for baseline = d6d824d and contender = c074e33. c074e33 is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
Conbench compare runs links:
[Finished ⬇️0.0% ⬆️0.0%] ec2-t3-xlarge-us-east-2
[Failed ⬇️0.56% ⬆️0.0%] test-mac-arm
[Finished ⬇️0.0% ⬆️0.0%] ursa-i9-9960x
[Failed ⬇️0.0% ⬆️0.0%] ursa-thinkcentre-m75q
Buildkite builds:
[Finished] c074e334 ec2-t3-xlarge-us-east-2
[Failed] c074e334 test-mac-arm
[Finished] c074e334 ursa-i9-9960x
[Failed] c074e334 ursa-thinkcentre-m75q
[Finished] d6d824d1 ec2-t3-xlarge-us-east-2
[Failed] d6d824d1 test-mac-arm
[Finished] d6d824d1 ursa-i9-9960x
[Failed] d6d824d1 ursa-thinkcentre-m75q
Supported benchmarks:
ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
test-mac-arm: Supported benchmark langs: C++, Python, R
ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java

@ursabot
Copy link
Copy Markdown

ursabot commented Apr 4, 2023

['Python', 'R'] benchmarks have high level of regressions.
test-mac-arm

ArgusLi pushed a commit to Bit-Quill/arrow that referenced this pull request May 15, 2023
…rder with use_threads=True (apache#34766)

### Rationale for this change

Thanks to apache#34137, the ExecPlan now has a concept of ordering. When the source node is a Table, the order of the batches in the table is used as the implicit order. And when executing a plan and producing a resulting Table, the default for QueryOptions' `sequence_output` is to honor an order if there is one. 

Given that the `Table.filter` method only consists of a table source node (which adds implicit order) and a filter node (which preserves any ordering), the output will now always be ordered by default, also with the default of `use_threads=True`

### Are these changes tested?

The existing test `test_exec_plan.py::test_filter_table_ordering` still passes.

* Closes: apache#31880

Authored-by: Joris Van den Bossche <jorisvandenbossche@gmail.com>
Signed-off-by: Joris Van den Bossche <jorisvandenbossche@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Python] Ensure _exec_plan.execplan preserves order of inputs

2 participants