groupby: Dispatch quantile to flox. #8720
|
@dcherian It works for the very simple test I did. We have fewer One test in xclim's CI, where we perform a grouped quantile, failed when I modified it to use a chunked array: |
|
It should do that if you explicitly pass But great catch! I'll have to add a fallback. I haven't thought about how to infer that |
When you do this, is the time axis a single chunk? |
That's the "custom implementation" part: we manipulate the array and rechunk it so that all members of one group (a day of year plus a window) fall into the same chunk. |
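To illustrate why that rechunking matters, here is a minimal pure-Python sketch. It is not xclim's or flox's code: the chunk layout, the `(label, value)` pairs and the helper names are all hypothetical. It only shows that a quantile cannot be assembled from per-chunk quantiles, so every member of a group has to be gathered into one place first.

```python
from collections import defaultdict


def quantile(values, q):
    """Quantile by linear interpolation between the two nearest order statistics."""
    data = sorted(values)
    pos = q * (len(data) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(data) - 1)
    return data[lo] + (pos - lo) * (data[hi] - data[lo])


def regroup_chunks(chunks):
    """Gather (label, value) pairs so that each group ends up in a single list."""
    by_group = defaultdict(list)
    for chunk in chunks:
        for label, value in chunk:
            by_group[label].append(value)
    return dict(by_group)


def grouped_quantile(chunks, q):
    """Grouped quantile computed after all members of a group are co-located."""
    return {label: quantile(vals, q) for label, vals in regroup_chunks(chunks).items()}


# Hypothetical data: two chunks along "time"; groups "a" and "b" span both chunks.
chunks = [
    [("a", 1.0), ("b", 10.0), ("a", 2.0)],
    [("a", 9.0), ("b", 20.0), ("b", 30.0)],
]

result = grouped_quantile(chunks, 0.5)

# Naive alternative for group "a": average the per-chunk medians.
# This does not equal the true median, which is why regrouping is needed.
per_chunk_a = [quantile([v for lab, v in c if lab == "a"], 0.5) for c in chunks]
naive_a = sum(per_chunk_a) / len(per_chunk_a)
```

With this toy data the regrouped median of group `a` is 2.0, while averaging the per-chunk medians gives 5.25.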
|
Can you link to that section of the code please? |
|
The function is here: https://github.com/Ouranosinc/xclim/blob/acbccb8902fb05318c53d76beb1c97067dfb8f11/xclim/core/calendar.py#L627 |
* main: (42 commits)
  correctly encode/decode _FillValues/missing_values/dtypes for packed data (pydata#8713)
  Expand use of `.oindex` and `.vindex` (pydata#8790)
  Return a dataclass from Grouper.factorize (pydata#8777)
  [skip-ci] Fix upstream-dev env (pydata#8839)
  Add dask-expr for windows envs (pydata#8837)
  [skip-ci] Add dask-expr dependency to doc.yml (pydata#8835)
  Add `dask-expr` to environment-3.12.yml (pydata#8827)
  Make list_chunkmanagers more resilient to broken entrypoints (pydata#8736)
  Do not attempt to broadcast when global option ``arithmetic_broadcast=False`` (pydata#8784)
  try to get the `upstream-dev` CI to complete again (pydata#8823)
  Bump the actions group with 1 update (pydata#8818)
  Update documentation for clarity (pydata#8817)
  DOC: link to zarr.convenience.consolidate_metadata (pydata#8816)
  Refactor Grouper objects (pydata#8776)
  Grouper object design doc (pydata#8510)
  Bump the actions group with 2 updates (pydata#8804)
  tokenize() should ignore difference between None and {} attrs (pydata#8797)
  fix: remove Coordinate from __all__ in xarray/__init__.py (pydata#8791)
  Fix non-nanosecond casting behavior for `expand_dims` (pydata#8782)
  Migrate treenode module. (pydata#8757)
  ...
|
I'll merge tomorrow if there are no comments. |
* main: (26 commits)
  [pre-commit.ci] pre-commit autoupdate (pydata#8900)
  Bump the actions group with 1 update (pydata#8896)
  New empty whatsnew entry (pydata#8899)
  Update reference to 'Weighted quantile estimators' (pydata#8898)
  2024.03.0: Add whats-new (pydata#8891)
  Add typing to test_groupby.py (pydata#8890)
  Avoid in-place multiplication of a large value to an array with small integer dtype (pydata#8867)
  Check for aligned chunks when writing to existing variables (pydata#8459)
  Add dt.date to plottable types (pydata#8873)
  Optimize writes to existing Zarr stores. (pydata#8875)
  Allow multidimensional variable with same name as dim when constructing dataset via coords (pydata#8886)
  Don't allow overwriting indexes with region writes (pydata#8877)
  Migrate datatree.py module into xarray.core. (pydata#8789)
  warn and return bytes undecoded in case of UnicodeDecodeError in h5netcdf-backend (pydata#8874)
  groupby: Dispatch quantile to flox. (pydata#8720)
  Opt out of auto creating index variables (pydata#8711)
  Update docs on view / copies (pydata#8744)
  Handle .oindex and .vindex for the PandasMultiIndexingAdapter and PandasIndexingAdapter (pydata#8869)
  numpy 2.0 copy-keyword and trapz vs trapezoid (pydata#8865)
  upstream-dev CI: Fix interp and cumtrapz (pydata#8861)
  ...
@aulemahal would you be able to test against xclim's test suite? I imagine you're doing a bunch of grouped quantiles.