Skip to content

Conversation

@thorfour
Copy link
Contributor

@thorfour thorfour commented May 21, 2025

Which issue does this PR close?

Rationale for this change

This prevents a panic that can happen with empty struct arrays.

What changes are included in this PR?

Small unit test that exacerbates the panic by attempting to filter a struct array with no columns.
Fix was to call StructArray:new_unchecked_with_length since the predicate of the filter contains the number rows the array requires.

Are there any user-facing changes?

No

@github-actions github-actions bot added the arrow Changes to the arrow crate label May 21, 2025
@thorfour thorfour force-pushed the filter-empty-record branch from 9c9c598 to 7587b6d Compare May 21, 2025 20:29
thorfour added 2 commits May 21, 2025 15:31
exists from the predicate. This prevents panics in situations where a
stucts array has empty columns.
@thorfour thorfour force-pushed the filter-empty-record branch from 7587b6d to bbad99f Compare May 21, 2025 20:31
@brancz
Copy link
Contributor

brancz commented May 21, 2025

cc @alamb

Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @thorfour and @brancz -- this makes sense to me

}

#[test]
fn test_filter_empty_struct() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I verified that this test fails without the code in this PR:

thread 'filter::tests::test_filter_empty_struct' panicked at arrow-array/src/array/struct_array.rs:224:51:
cannot use StructArray::new_unchecked if there are no fields, length is unknown
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

I think this is additional fallout from a new panic we introduced in this PR from @westonpace

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's correct however without that panic the bug was much more subtle as it created an invalid record batch that would panic when slicing elsewhere.

@alamb alamb changed the title Filter empty record Fix filtering empty StructArrays May 21, 2025
@alamb alamb changed the title Fix filtering empty StructArrays Fix filter_record_batch panics with empty struct array. #7538 May 21, 2025
@alamb alamb changed the title Fix filter_record_batch panics with empty struct array. #7538 Fix filter_record_batch panics with empty struct array May 21, 2025
@alamb alamb merged commit e9df239 into apache:main May 22, 2025
29 of 30 checks passed
@alamb
Copy link
Contributor

alamb commented May 22, 2025

Thanks again @thorfour

@thorfour thorfour deleted the filter-empty-record branch May 22, 2025 18:35
thorfour added a commit to polarsignals/arrow-rs that referenced this pull request May 27, 2025
* unit test: filter empty struct

* When filtering struct Array's always build with a length since one
exists from the predicate. This prevents panics in situations where a
stucts array has empty columns.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

arrow Changes to the arrow crate

Projects

None yet

Development

Successfully merging this pull request may close these issues.

filter_record_batch panics with empty struct array.

3 participants