fix: CometTakeOrderedAndProjectExec native scan node should use child operator's output #896

Merged
viirya merged 1 commit into apache:main from viirya:fix_takeordered
Aug 30, 2024

viirya commented Aug 30, 2024

Which issue does this PR close?

Closes #.

Rationale for this change

What changes are included in this PR?

This bug was found while debugging a CI failure in #893. In CometTakeOrderedAndProjectExec, we create an internal native plan to execute native limit + sort + project. The pseudo scan node created there incorrectly used CometTakeOrderedAndProjectExec's output attributes, but it should use the child node's output.

Currently this doesn't cause any error, although the scan node ends up with an incorrect schema. Sort and limit simply pass their input through, so the incorrect schema causes no error there. For project, the attributes are already bound in Spark and the input data is correct, so it produces no error either.

But in #893, we need to get the number of columns from the schema of the scan node. If the schema is incorrect, the scan node will create the wrong number of array/schema structures, which causes an error later.
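To make the mismatch concrete, here is a minimal, self-contained sketch. It is not Comet's code: `Field`, `ScanNode`, `TakeOrderedAndProject`, and `example_operator` are hypothetical stand-ins, and the column names are invented. It only illustrates why a scan node that derives its column count from its schema needs the child's output rather than the projected output of the operator itself.

```rust
/// Hypothetical stand-in for one column attribute in a plan's output.
#[derive(Debug, Clone, PartialEq)]
struct Field {
    name: String,
    data_type: String,
}

fn field(name: &str, data_type: &str) -> Field {
    Field { name: name.to_string(), data_type: data_type.to_string() }
}

/// Hypothetical scan node: it sizes its per-column structures from its schema.
struct ScanNode {
    schema: Vec<Field>,
}

impl ScanNode {
    fn num_columns(&self) -> usize {
        self.schema.len()
    }
}

/// Hypothetical model of a limit + sort + project operator.
/// `child_output` is what actually arrives as input;
/// `output` is what remains after the projection.
struct TakeOrderedAndProject {
    child_output: Vec<Field>,
    output: Vec<Field>,
}

impl TakeOrderedAndProject {
    /// The problematic choice: the scan schema mirrors the operator's own output.
    fn scan_from_own_output(&self) -> ScanNode {
        ScanNode { schema: self.output.clone() }
    }

    /// The intended choice: the scan schema mirrors the child's output,
    /// which is the shape of the data the scan really receives.
    fn scan_from_child_output(&self) -> ScanNode {
        ScanNode { schema: self.child_output.clone() }
    }
}

/// Invented example: the child produces three columns, the projection keeps one.
fn example_operator() -> TakeOrderedAndProject {
    TakeOrderedAndProject {
        child_output: vec![field("a", "int"), field("b", "string"), field("c", "double")],
        output: vec![field("a", "int")],
    }
}
```

In this sketch, `example_operator().scan_from_own_output().num_columns()` is 1 while the incoming data has 3 columns, so any code that allocates one structure per schema column would allocate too few. `scan_from_child_output()` reports 3, matching the input.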

How are these changes tested?

viirya requested review from andygrove and huaxingao August 30, 2024 17:26

viirya commented Aug 30, 2024

Thanks @andygrove

viirya deleted the fix_takeordered branch August 30, 2024 18:34
coderfender pushed a commit to coderfender/datafusion-comet that referenced this pull request Dec 13, 2025