An inconsistency between the documentation of dask.array.percentile and code implementation #11336

@ParsifalXu

Describe the issue:

The documentation of dask.array.percentile describes the method parameter as follows:

method : {‘linear’, ‘lower’, ‘higher’, ‘midpoint’, ‘nearest’}, optional
    The interpolation method to use when the desired percentile lies between two data points i < j. Only valid for internal_method='dask'.

However, the corresponding part of the source code is:

if (
    internal_method == "tdigest"
    and method == "linear"
    and (np.issubdtype(dtype, np.floating) or np.issubdtype(dtype, np.integer))
):
    from dask.utils import import_required
    import_required(
        "crick", "crick is a required dependency for using the t-digest method."
    )
    name = "percentile_tdigest_chunk-" + token
    dsk = {
        (name, i): (_tdigest_chunk, key) for i, key in enumerate(a.__dask_keys__())
    }
    name2 = "percentile_tdigest-" + token
    dsk2 = {(name2, 0): (_percentiles_from_tdigest, q, sorted(dsk))}

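The documentation states that method is only valid for internal_method='dask', but the code above shows that it also matters for internal_method='tdigest': the t-digest branch is only taken when method == "linear".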
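To make the condition concrete, here is a minimal sketch that restates the `if` test from the excerpt in isolation. The helper `takes_tdigest_branch` and its `dtype_kind` argument are hypothetical names introduced for illustration and are not part of dask; `dtype_kind` stands in for the `np.issubdtype` checks by using NumPy's `dtype.kind` codes.

```python
# Hypothetical helper, not part of dask: it only restates the branch
# condition quoted above so the effect of `method` can be inspected.
def takes_tdigest_branch(internal_method, method, dtype_kind="f"):
    """Return True when the quoted `if` condition would hold.

    dtype_kind is a NumPy dtype.kind code: "f" for floating point,
    "i" and "u" for signed and unsigned integers. This is an assumed
    stand-in for the np.issubdtype checks in the excerpt.
    """
    is_float_or_int = dtype_kind in ("f", "i", "u")
    return (
        internal_method == "tdigest"
        and method == "linear"
        and is_float_or_int
    )


# Show which documented `method` values satisfy the t-digest condition.
for method in ("linear", "lower", "higher", "midpoint", "nearest"):
    print(method, takes_tdigest_branch("tdigest", method))
```

In this sketch only `method="linear"` satisfies the condition when `internal_method="tdigest"`, which is the dependency that the current documentation does not mention.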

Could you check this and update the documentation accordingly?

Labels: array, documentation, needs attention
