[C++][R] Option for is_null(NaN) to evaluate to true

(This is the flip side of ARROW-12960.)

Currently the Arrow compute kernel `is_null` always treats `NaN` as a non-missing value, returning `false` at positions of the input datum with value `NaN`.

It would be helpful to be able to control this behavior with an option. The option could be named `nan_is_null` or something similar. It would default to `false`, consistent with current behavior. When set to `true`, it should check if the input datum has a floating point data type, and if so, return `true` at positions where the input is `NaN`. If the input datum has some other type, the option should be silently ignored.
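To make the proposed semantics concrete, here is a minimal standalone sketch in plain C++17. It does not use Arrow at all: `NullOptionsSketch`, `IsNullSketch`, and `ExampleFromIssue` are hypothetical names made up for illustration, and `std::optional<T>` stands in for a nullable array slot. Only the option name `nan_is_null` comes from the suggestion above.

```cpp
#include <cassert>
#include <cmath>
#include <optional>
#include <type_traits>
#include <vector>

// Hypothetical options struct (not Arrow's API). The field name follows
// the `nan_is_null` suggestion in this issue and defaults to false.
struct NullOptionsSketch {
  bool nan_is_null = false;
};

// Returns true at each position that counts as null.
// std::optional<T> models a nullable slot; std::nullopt is a missing value.
template <typename T>
std::vector<bool> IsNullSketch(const std::vector<std::optional<T>>& values,
                               [[maybe_unused]] NullOptionsSketch options = {}) {
  std::vector<bool> out;
  out.reserve(values.size());
  for (const auto& v : values) {
    bool is_null = !v.has_value();
    // The option only applies to floating point types; for any other
    // type this branch is discarded and the option is silently ignored.
    if constexpr (std::is_floating_point_v<T>) {
      // Short-circuiting ensures *v is only read when a value is present.
      is_null = is_null || (options.nan_is_null && std::isnan(*v));
    }
    out.push_back(is_null);
  }
  return out;
}

// Usage mirroring the R example in this issue: c(3.14, NA, NaN).
inline void ExampleFromIssue() {
  const std::vector<std::optional<double>> doubles = {3.14, std::nullopt,
                                                      std::nan("")};
  NullOptionsSketch treat_nan_as_null;
  treat_nan_as_null.nan_is_null = true;

  // Default: NaN is a non-missing value (current behavior).
  assert((IsNullSketch(doubles) == std::vector<bool>{false, true, false}));
  // Option enabled: NaN is reported as null, matching base R's is.na().
  assert((IsNullSketch(doubles, treat_nan_as_null) ==
          std::vector<bool>{false, true, true}));

  // Non-floating-point input: the option has no effect.
  const std::vector<std::optional<int>> ints = {1, std::nullopt};
  assert((IsNullSketch(ints, treat_nan_as_null) ==
          std::vector<bool>{false, true}));
}
```

Resolving the type check at compile time with `if constexpr` is just a convenience of this sketch. A real kernel would presumably dispatch on the runtime data type of the input datum instead.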

Among other things, this would enable the `arrow` R package to evaluate `is.na()` consistently with the way base R does. In base R, `is.na()` returns `TRUE` on `NaN`. But in the `arrow` R package, it returns `FALSE`:
```r
is.na(c(3.14, NA, NaN))
## [1] FALSE TRUE TRUE

as.vector(is.na(Array$create(c(3.14, NA, NaN))))
## [1] FALSE TRUE FALSE
```
I think solving this with an option in the C++ kernel is the best solution, because I suspect there are other cases in which users might want to treat `NaN` as a missing value. However, it would also be possible to solve this just in the R package, by defining a mapping of `is.na` in the R package that checks if the input `x` has a floating point data type, and if so, evaluates `is.na(x) | is.nan(x)`. If we choose to go that route, we should change this Jira issue summary to "[R] Make is.na(NaN) consistent with base R".

**Reporter**: [Ian Cook](https://issues.apache.org/jira/browse/ARROW-12959) / @ianmcook
**Assignee**: [Christian Cordova](https://issues.apache.org/jira/browse/ARROW-12959) / @Christian8491
#### Related issues:
- [[C++] Add option to is_null kernel to return true on NaN](https://github.com/apache/arrow/issues/29041) (is duplicated by)
- [[C++][R] Option for is_nan(null) to evaluate to false or true](https://github.com/apache/arrow/issues/28681) (is related to)
#### PRs and other links:
- [GitHub Pull Request #10896](https://github.com/apache/arrow/pull/10896)

<sub>**Note**: *This issue was originally created as [ARROW-12959](https://issues.apache.org/jira/browse/ARROW-12959). Please see the [migration documentation](https://github.com/apache/arrow/issues/14542) for further details.*</sub>
