Remove UNKNOWN_NULL_COUNT #13372
Merged
rapids-bot[bot] merged 12 commits into rapidsai:branch-23.06 on May 24, 2023
mythrocks added a commit to mythrocks/spark-rapids-jni that referenced this pull request on May 17, 2023
This is in prep for rapidsai/cudf#11968 and rapidsai/cudf#13372. `libcudf` will soon require that all CUDF columns are created with a known null-count. `UNKNOWN_NULL_COUNT` will no longer be supported, or even available as a code constant. This change replicates part of rapidsai/cudf#13355, as it applies to `row_conversion.cu`. The (single) reference to the unknown-null-count is replaced with a pre-calculated value. Signed-off-by: MithunR <mythrocks@gmail.com>
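The idea of replacing an unknown-null-count sentinel with a pre-calculated value can be illustrated with a minimal host-side sketch. This is not the code from `row_conversion.cu` or any libcudf API: the helper names `popcount32` and `count_nulls` are hypothetical, and the sketch assumes a validity bitmask stored as 32-bit words in which bit `i` is set when row `i` is valid.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: number of set bits in a 32-bit word.
inline int popcount32(std::uint32_t x)
{
  int count = 0;
  while (x != 0) {
    x &= x - 1;  // clear the lowest set bit
    ++count;
  }
  return count;
}

// Hypothetical helper, not a libcudf function: counts null rows up front
// so the result can be handed to whatever constructs the column.
// Assumption: bit i of the mask is 1 when row i is valid, 0 when it is null.
inline std::size_t count_nulls(std::vector<std::uint32_t> const& mask, std::size_t num_rows)
{
  std::size_t valid = 0;
  std::size_t const full_words = num_rows / 32;
  for (std::size_t w = 0; w < full_words; ++w) { valid += popcount32(mask[w]); }

  // Bits past num_rows in the last word are padding and must be ignored.
  std::size_t const tail = num_rows % 32;
  if (tail != 0) {
    std::uint32_t const tail_mask = (std::uint32_t{1} << tail) - 1;
    valid += popcount32(mask[full_words] & tail_mask);
  }
  return num_rows - valid;
}
```

In a real GPU pipeline the count would be produced on the device; the point of the sketch is only that the count is known before the column is constructed, so no sentinel is needed.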
mythrocks added a commit to NVIDIA/spark-rapids-jni that referenced this pull request on May 18, 2023
This is in prep for rapidsai/cudf#11968 and rapidsai/cudf#13372. `libcudf` will soon require that all CUDF columns are created with a known null-count. `UNKNOWN_NULL_COUNT` will no longer be supported, or even available as a code constant. This change replicates part of rapidsai/cudf#13355, as it applies to `row_conversion.cu`. The (single) reference to the unknown-null-count is replaced with a pre-calculated value. Signed-off-by: MithunR <mythrocks@gmail.com>
a2d444e to 85c146a
85c146a to 4fde571
Contributor
Author
There are a couple of Java failures left that I can reproduce locally. I will attempt to debug them myself, but may need to pull in some advice.
davidwendt reviewed May 19, 2023
davidwendt reviewed May 19, 2023
davidwendt reviewed May 19, 2023
Contributor
Author
I'm blocking merging here until we've had a chance to check on the behavior of the Spark plugin with this change, but it is otherwise ready for review.
ttnghia reviewed May 19, 2023
ttnghia reviewed May 19, 2023
ttnghia approved these changes May 19, 2023
mythrocks added a commit to mythrocks/spark-rapids-jni that referenced this pull request on May 19, 2023
This is a followup to NVIDIA#1148. `row_conversion.cu` was modified in rapidsai/cudf#13372 to explicitly calculate null-counts for output columns. This commit replicates the changes in cudf/pull/13372, and adds explicit null-count calculation for the string offsets column. Signed-off-by: MithunR <mythrocks@gmail.com>
karthikeyann approved these changes May 21, 2023
mythrocks added a commit to mythrocks/spark-rapids-jni that referenced this pull request on May 22, 2023
This is a followup to NVIDIA#1148. `row_conversion.cu` was modified in rapidsai/cudf#13372 to explicitly calculate null-counts for output columns. This commit replicates the changes in cudf/pull/13372, and adds explicit null-count calculation for the string offsets column. Signed-off-by: MithunR <mythrocks@gmail.com>
mythrocks added a commit to NVIDIA/spark-rapids-jni that referenced this pull request on May 23, 2023
* Followup for null count fixup in row_conversion.cu. This is a followup to #1148. `row_conversion.cu` was modified in rapidsai/cudf#13372 to explicitly calculate null-counts for output columns. This commit replicates the changes in cudf/pull/13372, and adds explicit null-count calculation for the string offsets column. Signed-off-by: MithunR <mythrocks@gmail.com>
Contributor
Author
/merge
bdice added a commit to bdice/cudf that referenced this pull request on Jun 1, 2023
This reverts commit 56150d9.
3 tasks
rapids-bot[bot] pushed a commit that referenced this pull request on Feb 27, 2026
Adds the `null_count()` member function back to the `cudf::column_device_view` class. This member function was removed in #4799 because the null count was unreliable: it could be set to `UNKNOWN_NULL_COUNT`. `UNKNOWN_NULL_COUNT` was removed in #13372, so this should no longer be an issue.

Although the value is now more reliable, it is just a copy taken at the time the `cudf::column_device_view` was created, so it is not without caveats, though these are easily explained and hopefully understood. The value is much less reliable on `cudf::mutable_column_device_view`, where the null mask is more likely to be modified, rendering the count invalid. Therefore, `null_count()` is added only to the immutable `cudf::column_device_view`.

The growing number of changes that move template parameters like `has_nulls` to runtime parameters reduces unneeded kernel compilations. Here is one example where having this member function would have helped to reduce the function parameters and to show more clearly that the `has_nulls` parameter directly correlates to the state of the `input` parameter: https://github.com/rapidsai/cudf/blob/609b08e2c8075cad12347cede69d195fa5b186f1/cpp/src/rolling/detail/rolling_operators.cuh#L399-L416

Note that the `has_nulls` parameter would not be needed, since `input` could carry this information and show more clearly what the parameter is based on.

Authors:
- David Wendt (https://github.com/davidwendt)

Approvers:
- Tianyu Liu (https://github.com/kingcrimsontianyu)
- Nghia Truong (https://github.com/ttnghia)
- Muhammad Haseeb (https://github.com/mhaseeb123)

URL: #21430
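The point about `has_nulls` following from the input's own state can be sketched on the host as below. This is only an illustration under stated assumptions: `view_sketch` and `sum_valid` are hypothetical names, not the `cudf::column_device_view` class or any cudf function, and the cached count is assumed to be a snapshot taken when the view is built.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for an immutable view; not the cudf class.
struct view_sketch {
  int const* data;
  std::uint32_t const* mask;      // bit i set => row i valid; may be nullptr when there are no nulls
  std::size_t size;
  std::size_t cached_null_count;  // snapshot taken when the view was created

  std::size_t null_count() const { return cached_null_count; }
  bool has_nulls() const { return cached_null_count > 0; }
  bool is_valid(std::size_t i) const { return ((mask[i / 32] >> (i % 32)) & 1u) != 0; }
};

// The null check is a runtime decision derived from the view itself,
// so no separate has_nulls argument or template parameter is needed.
inline long sum_valid(view_sketch const& v)
{
  long total = 0;
  bool const check_mask = v.has_nulls();
  for (std::size_t i = 0; i < v.size; ++i) {
    if (!check_mask || v.is_valid(i)) { total += v.data[i]; }
  }
  return total;
}
```

Because the count is only a snapshot, a view over a mask that is later modified would report a stale value, which is the caveat the commit message raises for the mutable view.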
Description
This is the final PR for removing `UNKNOWN_NULL_COUNT` and the implicit kernel launch in the `null_count` methods of `column` and `column_view`.

Depends on #13355 and #13341.
Closes #11968
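To make the phrase "implicit kernel launch" concrete, here is a minimal host-side sketch contrasting a getter that lazily resolves a sentinel with one that requires the count at construction. None of this is libcudf code: `unknown_sketch`, `lazy_column_sketch`, and `eager_column_sketch` are hypothetical names, and the linear scan stands in for the device work that a real implementation would launch.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical sentinel playing the role of the removed constant.
constexpr std::ptrdiff_t unknown_sketch = -1;

// Before: the count may be unknown, so the getter hides an expensive scan.
class lazy_column_sketch {
 public:
  explicit lazy_column_sketch(std::vector<bool> valid, std::ptrdiff_t null_count = unknown_sketch)
    : valid_(std::move(valid)), null_count_(null_count)
  {
  }

  std::ptrdiff_t null_count() const
  {
    if (null_count_ == unknown_sketch) {
      ++scans_;  // stands in for the implicit kernel launch
      std::ptrdiff_t nulls = 0;
      for (bool v : valid_) {
        if (!v) { ++nulls; }
      }
      null_count_ = nulls;
    }
    return null_count_;
  }

  int scans() const { return scans_; }

 private:
  std::vector<bool> valid_;
  mutable std::ptrdiff_t null_count_;
  mutable int scans_ = 0;
};

// After: the caller must supply a known count, so the getter is trivial.
class eager_column_sketch {
 public:
  eager_column_sketch(std::vector<bool> valid, std::ptrdiff_t null_count)
    : valid_(std::move(valid)), null_count_(null_count)
  {
  }

  std::ptrdiff_t null_count() const { return null_count_; }

 private:
  std::vector<bool> valid_;
  std::ptrdiff_t null_count_;
};
```

The trade-off in the eager form is that every producer of a column has to compute or already know the count, which matches the follow-up commits referenced in the timeline above.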