[C++][Acero] More tests for row segmenter #44167

@zanmato1984

Describe the enhancement requested

The current row segmenter tests do not cover fixed size binary or dictionary key types:

TEST(RowSegmenter, ConstantArrayBatch) {
  TestRowSegmenterConstantBatch([](size_t i) { return ArgShape::ARRAY; },
                                MakeRowSegmenter);
}
TEST(RowSegmenter, ConstantScalarBatch) {
  TestRowSegmenterConstantBatch([](size_t i) { return ArgShape::SCALAR; },
                                MakeRowSegmenter);
}
TEST(RowSegmenter, ConstantMixedBatch) {
  TestRowSegmenterConstantBatch(
      [](size_t i) { return i % 2 == 0 ? ArgShape::SCALAR : ArgShape::ARRAY; },
      MakeRowSegmenter);
}
TEST(RowSegmenter, ConstantArrayBatchWithAnyKeysSegmenter) {
  TestRowSegmenterConstantBatch([](size_t i) { return ArgShape::ARRAY; },
                                MakeGenericSegmenter);
}
TEST(RowSegmenter, ConstantScalarBatchWithAnyKeysSegmenter) {
  TestRowSegmenterConstantBatch([](size_t i) { return ArgShape::SCALAR; },
                                MakeGenericSegmenter);
}
TEST(RowSegmenter, ConstantMixedBatchWithAnyKeysSegmenter) {
  TestRowSegmenterConstantBatch(
      [](size_t i) { return i % 2 == 0 ? ArgShape::SCALAR : ArgShape::ARRAY; },
      MakeGenericSegmenter);
}
TEST(RowSegmenter, RowConstantBatch) {
  constexpr size_t n = 3;
  std::vector<TypeHolder> types = {int32(), int32(), int32()};
  auto full_batch = ExecBatchFromJSON(types, "[[1, 1, 1], [2, 2, 2], [3, 3, 3]]");
  std::vector<Segment> expected_segments_for_size_0 = {{0, 3, true, true}};
  std::vector<Segment> expected_segments = {
      {0, 1, false, true}, {1, 1, false, false}, {2, 1, true, false}};
  auto test_by_size = [&](size_t size) -> Status {
    SCOPED_TRACE("constant-batch with " + ToChars(size) + " key(s)");
    std::vector<Datum> values(full_batch.values.begin(),
                              full_batch.values.begin() + size);
    ExecBatch batch(values, full_batch.length);
    std::vector<TypeHolder> key_types(types.begin(), types.begin() + size);
    ARROW_ASSIGN_OR_RAISE(auto segmenter, MakeRowSegmenter(key_types));
    TestSegments(segmenter, ExecSpan(batch),
                 size == 0 ? expected_segments_for_size_0 : expected_segments);
    return Status::OK();
  };
  for (size_t i = 0; i <= n; i++) {
    ASSERT_OK(test_by_size(i));
  }
}

Yet both types are clearly in the supported type list:

if (type != NULLPTR && is_fixed_width(*type)) {

(Both fixed size binary and dictionary are "fixed width".)
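
To illustrate why both types fit the same fixed-width path, here is a minimal, self-contained sketch. It is not Arrow's implementation: `SketchSegment` and `SegmentFixedWidthRows` are hypothetical names, and the sketch only assumes that every key row occupies the same non-zero number of bytes (for example, the byte width of a fixed size binary value, or the index width of a dictionary array).

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical stand-in for a segment; this is NOT Arrow's Segment struct.
struct SketchSegment {
  std::size_t offset;  // index of the first row in the run
  std::size_t length;  // number of consecutive rows with identical key bytes
};

// Splits `num_rows` rows of `width` bytes each into runs of consecutive,
// byte-identical rows. Assumes `width > 0` and that `data` holds
// `num_rows * width` bytes.
std::vector<SketchSegment> SegmentFixedWidthRows(const std::uint8_t* data,
                                                 std::size_t num_rows,
                                                 std::size_t width) {
  std::vector<SketchSegment> segments;
  std::size_t start = 0;
  for (std::size_t i = 1; i <= num_rows; ++i) {
    // A run ends at the end of the input or when row i differs from the
    // first row of the current run.
    const bool boundary =
        i == num_rows ||
        std::memcmp(data + start * width, data + i * width, width) != 0;
    if (boundary) {
      segments.push_back({start, i - start});
      start = i;
    }
  }
  return segments;
}
```

Under these assumptions, a buffer of 2-byte fixed size binary values and a buffer of 1-byte dictionary indices go through the same comparison loop. Only `width` changes, which is the reason a test for each type would exercise the same fixed-width branch.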

Component(s)

C++
