Skip to content

Conversation

@zagto
Copy link
Contributor

@zagto zagto commented May 12, 2022

No description provided.

@github-actions
Copy link

@github-actions
Copy link

⚠️ Ticket has not been started in JIRA, please click 'Start Progress'.

@zagto zagto marked this pull request as draft May 12, 2022 22:07
@westonpace westonpace self-requested a review May 31, 2022 17:51
Copy link
Member

@westonpace westonpace left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like a straightforward improvement. Is this meant to be in a draft state?

Comment on lines +315 to +321
if (args_[i].kind() == Datum::CHUNKED_ARRAY) {
const ChunkedArray& carr = *args_[i].chunked_array();
batch->values[i] = Datum(carr.chunk(chunk_indexes_[i])->data());
chunk_positions_[i] += iteration_size;
} else {
batch->values[i] = std::move(args_[i]);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if (args_[i].kind() == Datum::CHUNKED_ARRAY) {
const ChunkedArray& carr = *args_[i].chunked_array();
batch->values[i] = Datum(carr.chunk(chunk_indexes_[i])->data());
chunk_positions_[i] += iteration_size;
} else {
batch->values[i] = std::move(args_[i]);
}
if (args_[i].kind() == Datum::CHUNKED_ARRAY) {
chunk_positions_[i] += iteration_size;
}
batch->values[i] = std::move(args_[i]);

I'm not entirely certain this is correct but if iteration_size == length_ I think that means you are guaranteed that any chunked arrays are a single chunk (or at least, there is only one non-zero size chunk) and so you are consuming it all at once. I'm not even sure you need to update chunk_positions_[i].

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you're right about not having to update chunk_positions_[i]. But the suggested change wouldn't work. Moving the chunked array Datum directly into batch->values means it ends up getting passed to the kernel, which usually expects an ARRAY type Datum. The important part why we need a branch here is extracting the ArrayData from the chunked array

@zagto zagto marked this pull request as ready for review June 1, 2022 12:34
@wesm
Copy link
Member

wesm commented Jun 13, 2022

I suggest abandoning this PR since I'm going to remove ExecBatchIterator soon after #13364 is merged

@zagto
Copy link
Contributor Author

zagto commented Jun 13, 2022

I suggest abandoning this PR since I'm going to remove ExecBatchIterator soon after #13364 is merged

Makes sense, I'll close it

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants