Skip to content

Fix percentile inconsistencies#12088

Merged
jacobtomlinson merged 10 commits intodask:mainfrom
Oisin-M:fix/percentile_inconsistency
Oct 8, 2025
Merged

Fix percentile inconsistencies#12088
jacobtomlinson merged 10 commits intodask:mainfrom
Oisin-M:fix/percentile_inconsistency

Conversation

@Oisin-M
Copy link
Copy Markdown
Contributor

@Oisin-M Oisin-M commented Sep 30, 2025

This PR makes some small changes to the percentile function.

  • simplifies logic slightly and adds minor optimisations
  • fixes issue with inconsistent output shapes (does a .reshape(()) if required)
  • corrects some erroneous docstrings

@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Sep 30, 2025

Unit Test Results

See test report for an extended history of previous test failures. This is useful for diagnosing flaky tests.

      9 files  ±  0        9 suites  ±0   3h 12m 38s ⏱️ + 2m 42s
 18 101 tests + 28   16 884 ✅ + 28   1 217 💤 ±0  0 ❌ ±0 
161 983 runs  +252  149 867 ✅ +254  12 116 💤  - 2  0 ❌ ±0 

Results for commit a643500. ± Comparison against base commit 4736a38.

This pull request removes 8 and adds 36 tests. Note that renamed tests count towards both.
dask.array.tests.test_percentiles ‑ test_percentiles_with_scaler_percentile[5-dask]
dask.array.tests.test_percentiles ‑ test_percentiles_with_scaler_percentile[5-tdigest]
dask.array.tests.test_percentiles ‑ test_percentiles_with_scaler_percentile[5.0_0-dask]
dask.array.tests.test_percentiles ‑ test_percentiles_with_scaler_percentile[5.0_0-tdigest]
dask.array.tests.test_percentiles ‑ test_percentiles_with_scaler_percentile[5.0_1-dask]
dask.array.tests.test_percentiles ‑ test_percentiles_with_scaler_percentile[5.0_1-tdigest]
dask.array.tests.test_percentiles ‑ test_percentiles_with_scaler_percentile[q2-dask]
dask.array.tests.test_percentiles ‑ test_percentiles_with_scaler_percentile[q2-tdigest]
dask.array.tests.test_percentiles ‑ test_percentiles_with_scaler_percentile[x0-5-dask]
dask.array.tests.test_percentiles ‑ test_percentiles_with_scaler_percentile[x0-5-tdigest]
dask.array.tests.test_percentiles ‑ test_percentiles_with_scaler_percentile[x0-5.0_0-dask]
dask.array.tests.test_percentiles ‑ test_percentiles_with_scaler_percentile[x0-5.0_0-tdigest]
dask.array.tests.test_percentiles ‑ test_percentiles_with_scaler_percentile[x0-5.0_1-dask]
dask.array.tests.test_percentiles ‑ test_percentiles_with_scaler_percentile[x0-5.0_1-tdigest]
dask.array.tests.test_percentiles ‑ test_percentiles_with_scaler_percentile[x0-50-dask]
dask.array.tests.test_percentiles ‑ test_percentiles_with_scaler_percentile[x0-50-tdigest]
dask.array.tests.test_percentiles ‑ test_percentiles_with_scaler_percentile[x0-95-dask]
dask.array.tests.test_percentiles ‑ test_percentiles_with_scaler_percentile[x0-95-tdigest]
…

♻️ This comment has been updated with latest results.

Copy link
Copy Markdown
Member

@jacobtomlinson jacobtomlinson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall I think this looks great.

Do you think you could add a little more testing? For example in #11989 there is an example of some assertions that were failing. Could you update the tests to demonstrate they now pass?

@Oisin-M
Copy link
Copy Markdown
Contributor Author

Oisin-M commented Oct 7, 2025

Hi @jacobtomlinson, I already updated the test_percentiles_with_scaler_percentile to test for the issue in #11989 but I went and added some further tests now too. We only get approximate matches with numpy's percentiles by algorithmic design so I've added a very lenient relative tolerance to check against it.

Copy link
Copy Markdown
Member

@jacobtomlinson jacobtomlinson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Much appreciated. I guess I was thinking about the explicit shape comparison in the issue, but asserting equality should do the same and more.

@jacobtomlinson jacobtomlinson merged commit a32b51d into dask:main Oct 8, 2025
22 of 24 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

An inconsistency between the documentation of dask.array.percentile and code implementation

2 participants