[R] Bind median() and quantile() to exact not approximate median and quantile #29619

@asfimport

Description

ARROW-13772 binds quantile() to tdigest(), which returns approximate quantiles, and binds median() to approximate_median(), which returns an approximate median. Both bindings issue a warning that the result is approximate. Once ARROW-13309 is implemented, modify the bindings to call Arrow functions that return exact quantiles and medians, and remove the warnings.

We should keep the approximate quantile and median bindings but rename them.

When doing this, we should also modify the bindings to accept type and interpolation arguments, as we do in the quantile.ArrowDatum method:

arrow/r/R/compute.R

Lines 156 to 187 in 170a24f

quantile.ArrowDatum <- function(x,
                                probs = seq(0, 1, 0.25),
                                na.rm = FALSE,
                                type = 7,
                                interpolation = c("linear", "lower", "higher", "nearest", "midpoint"),
                                ...) {
  if (inherits(x, "Scalar")) x <- Array$create(x)
  assert_is(probs, c("numeric", "integer"))
  assert_that(length(probs) > 0)
  assert_that(all(probs >= 0 & probs <= 1))
  if (!na.rm && x$null_count > 0) {
    stop("Missing values not allowed if 'na.rm' is FALSE", call. = FALSE)
  }
  if (type != 7) {
    stop(
      "Argument `type` not supported in Arrow. To control the quantile ",
      "interpolation algorithm, set argument `interpolation` to one of: ",
      "\"linear\" (the default), \"lower\", \"higher\", \"nearest\", or ",
      "\"midpoint\".",
      call. = FALSE
    )
  }
  interpolation <- QuantileInterpolation[[toupper(match.arg(interpolation))]]
  out <- call_function("quantile", x, options = list(q = probs, interpolation = interpolation))
  if (length(out) == 0) {
    # When there are no non-missing values in the data, the Arrow quantile
    # function returns an empty Array, but for consistency with the R quantile
    # function, we want an Array of NA_real_ with the same length as probs
    out <- Array$create(rep(NA_real_, length(probs)))
  }
  out
}
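
To make the five interpolation options concrete, here is a minimal base R sketch of what each one selects when a quantile falls between two sorted data points. The helper name `interp_quantile` is hypothetical and not part of the arrow package. It is also an assumption that "nearest" breaks ties upward, and Arrow's own tie-breaking may differ. The "linear" branch is intended to match base R's `type = 7`, which is why the method above rejects other `type` values.

```r
# Hypothetical helper (base R only, not part of arrow): illustrates the
# meaning of the five interpolation options for an exact quantile.
interp_quantile <- function(x, probs,
                            interpolation = c("linear", "lower", "higher",
                                              "nearest", "midpoint")) {
  interpolation <- match.arg(interpolation)
  # Unlike quantile.ArrowDatum, this sketch always drops missing values
  x <- sort(x[!is.na(x)])
  n <- length(x)
  # Mirror the empty-result handling above: one NA per requested probability
  if (n == 0) return(rep(NA_real_, length(probs)))
  # Fractional 0-based position of each quantile in the sorted data
  pos <- probs * (n - 1)
  lo <- floor(pos)
  hi <- ceiling(pos)
  frac <- pos - lo
  x_lo <- x[lo + 1]
  x_hi <- x[hi + 1]
  switch(interpolation,
    linear = x_lo + (x_hi - x_lo) * frac,
    lower = x_lo,
    higher = x_hi,
    # Assumption: ties (frac == 0.5) go to the higher neighbour
    nearest = ifelse(frac < 0.5, x_lo, x_hi),
    midpoint = (x_lo + x_hi) / 2
  )
}

interp_quantile(c(3, 1, 4, 2), 0.5, "linear")  # 2.5, same as stats::quantile()
interp_quantile(c(3, 1, 4, 2), 0.5, "lower")   # 2
interp_quantile(c(3, 1, 4, 2), 0.5, "higher")  # 3
```

A dplyr binding that accepts `interpolation` would presumably just forward the chosen option to the Arrow kernel, as the method above does, rather than compute anything in R.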

Reporter: Ian Cook / @ianmcook

Note: This issue was originally created as ARROW-14021. Please see the migration documentation for further details.
