Skip to content

Conversation

@alamb
Copy link
Contributor

@alamb alamb commented Jul 24, 2025

Which issue does this PR close?

Rationale for this change

I was trying to write some tests for #16881 from @berkaysynnada but I hit #16899. So let's add some tests to show what is happening

What changes are included in this PR?

Add a new partial_sorts.slt test showing when PartialSort should be being used

Are these changes tested?

Only tests

Are there any user-facing changes?

No

@berkaysynnada
Copy link
Contributor

Do you want to see some PartialSortExec in these new tests?

01)Sort: data.a ASC NULLS LAST, data.b ASC NULLS LAST
02)--TableScan: data projection=[a, b]
physical_plan
01)SortExec: expr=[a@0 ASC NULLS LAST, b@1 ASC NULLS LAST], preserve_partitioning=[false]
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you want to see some PartialSortExec in these new tests?

@berkaysynnada Yes, I expect to see PartialSortExec here specifically

5

statement ok
CREATE EXTERNAL TABLE data (
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

then you need to create the table as "CREATE UNBOUNDED EXTERNAL TABLE"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason not to use PartialSortExec for normal tables? It seems like it is more efficient for bounded tables too (it doesn't have to buffer the entire input, for example)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems @EeshanBembi has already created a PR to use PartialSortExec in this case

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a reason not to use PartialSortExec for normal tables? It seems like it is more efficient for bounded tables too (it doesn't have to buffer the entire input, for example)

I'm not sure about the efficiency. If there is no spilling work etc., collecting all batches and sorting them in one shot could be better, but needs a benchmark to be sure

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes sense -- thank you for the clarification

@alamb
Copy link
Contributor Author

alamb commented Aug 10, 2025

run extended tests

@github-actions
Copy link

Thank you for your contribution. Unfortunately, this pull request is stale because it has been open 60 days with no activity. Please remove the stale label or comment or this will be closed in 7 days.

@github-actions github-actions bot added the Stale PR has not had any activity for some time label Oct 29, 2025
@github-actions github-actions bot closed this Nov 6, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

sqllogictest SQL Logic Tests (.slt) Stale PR has not had any activity for some time

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants