Remove all references to UNKNOWN_NULL_COUNT in Python #13345
rapids-bot[bot] merged 8 commits into rapidsai:branch-23.06
Conversation
Co-authored-by: Nghia Truong <7416935+ttnghia@users.noreply.github.com>
shwina
left a comment
This looks good to me, but do you think it makes sense for libcudf to provide a utility for computing the null count given a base mask and an offset? (and could it potentially do that without having to copy any data?)
Great question! I tried to stick to the simplest possible rewrite of the existing code path, but in fact the copying is not necessary at all. The API you want is the one I already exposed in this PR; I just need to use it. Making that change now.
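To make the question above concrete, here is a minimal host-side sketch of counting nulls from a base mask and an offset without copying or shifting the mask. It assumes an Arrow-style validity bitmask (least-significant bit first, a set bit means the row is valid). The name `null_count` and its signature are hypothetical, chosen for illustration; this is not libcudf's API, which operates on device memory.

```python
def null_count(mask: bytes, offset: int, size: int) -> int:
    """Count null rows in [offset, offset + size) of a validity mask.

    Hypothetical helper for illustration only. Assumes Arrow-style layout:
    row i lives at bit (i % 8) of byte (i // 8), and a set bit means valid.
    The mask is read in place, so no sliced copy is ever made.
    """
    if offset < 0 or size < 0 or offset + size > len(mask) * 8:
        raise ValueError("requested range does not fit in the mask")
    valid = 0
    for row in range(offset, offset + size):
        # Extract the validity bit for this row directly from the base mask.
        valid += (mask[row // 8] >> (row % 8)) & 1
    return size - valid


# Rows 1 and 3 are null; rows 0, 2, and 4 through 9 are valid.
mask = bytes([0b11110101, 0b00000011])
print(null_count(mask, 0, 10))  # whole column
print(null_count(mask, 1, 3))   # a slice starting at row 1
```

Because the offset is applied while indexing, a sliced column can share its parent's mask and still get a concrete null count.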
bdice
left a comment
Two small questions/comments. Resolve as you see fit. :)
/merge
Fixes #13353. Depends on #13345.

In preparation for #11968, this change ensures that columns constructed from CUDF JNI do not have their null counts set to `UNKNOWN_NULL_COUNT` (i.e. `-1`). In cases where the caller invokes JNI functions with `UNKNOWN_NULL_COUNT`, the JNI layer computes the concrete null count from the validity mask, and sets this value in the column.

The current Java API remains unchanged; there should be no impact to user code. The option to specify an optional null count through the Java API will likely be removed at a later date.

Signed-off-by: MithunR <mythrocks@gmail.com>

Authors:
- MithunR (https://github.com/mythrocks)
- Vyas Ramasubramani (https://github.com/vyasr)

Approvers:
- Jason Lowe (https://github.com/jlowe)
- Nghia Truong (https://github.com/ttnghia)

URL: #13355
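The sentinel handling described in this commit message can be sketched roughly as follows. This is an illustrative Python model, not the JNI code itself: `resolve_null_count` is a hypothetical name, and the mask layout (least-significant bit first, set bit means valid) and the treatment of a missing mask as "no nulls" are assumptions.

```python
from typing import Optional

UNKNOWN_NULL_COUNT = -1  # sentinel value named in the description above


def resolve_null_count(requested: int, mask: Optional[bytes], size: int) -> int:
    """Return a concrete null count for a column of `size` rows.

    Hypothetical helper for illustration only. A caller-supplied count is
    trusted as-is; the sentinel triggers a count from the validity mask.
    """
    if requested != UNKNOWN_NULL_COUNT:
        return requested
    if mask is None:
        # Assumption: a column without a validity mask has no nulls.
        return 0
    # Count set (valid) bits for the first `size` rows, then invert.
    valid = sum((mask[i // 8] >> (i % 8)) & 1 for i in range(size))
    return size - valid


# Rows 1 and 3 are null in this 4-row column.
print(resolve_null_count(UNKNOWN_NULL_COUNT, bytes([0b00000101]), 4))
```

The caller-facing signature keeps accepting the sentinel, which mirrors the statement that the Java API is unchanged, while the stored value is always concrete.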
Description
Part of #11968
Checklist