Address latest pandas-related upstream test failures by spencerkclark · Pull Request #9081 · pydata/xarray

spencerkclark · 2024-06-09T16:36:55Z

This PR addresses the upstream failures described in #8844 (comment) with a few minor changes to ensure that, for the time being, nanosecond-precision times continue to be used in xarray. These failures stem from pandas-dev/pandas#55901, which causes pandas.to_datetime to infer the precision to use based on its input instead of always using nanosecond precision.

See the review comments for an explanation of the changes.

spencerkclark · 2024-06-09T17:49:42Z

xarray/tests/test_variable.py

+        (datetime(2000, 1, 1), has_pandas_3),
+        (np.array([datetime(2000, 1, 1)]), has_pandas_3),


With pandas 3, pd.Series(datetime.datetime(...)) will produce a Series with np.datetime64[us] values instead of np.datetime64[ns] values, so this conversion now warns.
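As an illustration of where the microsecond unit comes from, the sketch below shows the analogous NumPy-level behavior: wrapping a datetime.datetime without naming a dtype yields microsecond precision, while requesting `datetime64[ns]` explicitly pins nanoseconds. This is a standalone example with hypothetical variable names (`inferred`, `explicit`), not code from the PR.

```python
from datetime import datetime

import numpy as np

# Wrapping a datetime.datetime directly lets NumPy pick the unit,
# which is microseconds for Python datetime objects.
inferred = np.datetime64(datetime(2000, 1, 1))

# Naming the dtype explicitly pins nanosecond precision instead.
explicit = np.array([datetime(2000, 1, 1)], dtype="datetime64[ns]")

print(inferred.dtype)
print(explicit.dtype)
```

Both values represent the same instant; only the stored unit differs.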

spencerkclark · 2024-06-09T17:52:11Z

xarray/tests/test_groupby.py

    # create test index
-    dd = times.to_pydatetime()
-    reference_dates = [dd[0], dd[2]]
+    reference_dates = [times[0], times[2]]


As far as I can tell, whether reference_dates started as datetime.datetime objects or np.datetime64[ns] values was not material to this test, so I removed the conversion to datetime.datetime to avoid the conversion warning under pandas 3 (the times would previously get converted back to datetime64[ns] values in the DataArray constructor).

spencerkclark · 2024-06-09T17:55:26Z

xarray/tests/test_dataarray.py

-        roundtripped = DataArray.from_dict(da.to_dict())
+        with warnings.catch_warnings():
+            warnings.filterwarnings("ignore", message="Converting non-nanosecond")
+            roundtripped = DataArray.from_dict(da.to_dict())


da.to_dict() produces datetime.datetime objects, which under pandas 3 lead to a conversion warning in the DataArray constructor.

If we have this pattern in multiple modules, it might be worth adding the code as a special context manager to xarray.tests.__init__. Something like this might work (I didn't check):

    from contextlib import contextmanager
    import warnings


    @contextmanager
    def ignore_warnings(category=None, pattern=None):
        if category is None and pattern is None:
            raise ValueError("need at least one of category and pattern")

        try:
            with warnings.catch_warnings():
                warnings.filterwarnings("ignore", message=pattern, category=category)
                yield
        finally:
            pass

Thanks—I ended up switching to marking these tests with @pytest.mark.filterwarnings("ignore:Converting non-nanosecond"), since that is a pattern we use elsewhere in the tests already.
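For reference, the helper suggested above was posted unchecked. As far as I can tell, warnings.filterwarnings rejects None for both message and category, so the helper would raise if only one of the two arguments were given. The following is a minimal sketch of a variant that forwards only the arguments actually provided. The functions emit_conversion_warning and count_warnings are hypothetical stand-ins added purely to exercise the helper, and none of this is code that was merged.

```python
import warnings
from contextlib import contextmanager


@contextmanager
def ignore_warnings(category=None, pattern=None):
    """Temporarily ignore warnings matching a category and/or message prefix."""
    if category is None and pattern is None:
        raise ValueError("need at least one of category and pattern")

    # Forward only the arguments that were given, so the defaults of
    # warnings.filterwarnings (message="", category=Warning) apply otherwise.
    kwargs = {}
    if pattern is not None:
        kwargs["message"] = pattern
    if category is not None:
        kwargs["category"] = category

    with warnings.catch_warnings():
        warnings.filterwarnings("ignore", **kwargs)
        yield


def emit_conversion_warning():
    # Hypothetical stand-in for code that triggers the conversion warning.
    warnings.warn("Converting non-nanosecond precision datetime values", UserWarning)


def count_warnings(use_helper):
    # Record every warning that escapes, with or without the helper active.
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        if use_helper:
            with ignore_warnings(pattern="Converting non-nanosecond"):
                emit_conversion_warning()
        else:
            emit_conversion_warning()
    return len(caught)
```

The pytest marker that was adopted instead achieves the same filtering per test without a custom helper, which is likely why it was preferred here.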

spencerkclark · 2024-06-09T17:56:59Z

xarray/tests/test_plot.py

-        darray = DataArray(data, dims=["time"])
-        darray.coords["time"] = np.array([datetime(2017, m, 1) for m in month])
+        times = pd.date_range(start="2017-01-01", freq="ME", periods=12)
+        darray = DataArray(data, dims=["time"], coords=[times])


Use of datetime.datetime objects was immaterial to this test, so we use pd.date_range to produce the dates instead to avoid the non-nanosecond conversion warning.

spencerkclark · 2024-06-09T17:58:22Z

xarray/tests/test_variable.py

+            with warnings.catch_warnings():
+                warnings.filterwarnings("ignore", message="Converting non-nanosecond")
+                expected = self.cls("t", dates)


This is needed since dates sometimes consists of datetime.datetime objects, which leads to a conversion warning under pandas 3.

spencerkclark · 2024-06-09T18:00:43Z

xarray/tests/test_backends.py


    def test_roundtrip_numpy_datetime_data(self) -> None:
-        times = pd.to_datetime(["2000-01-01", "2000-01-02", "NaT"])
+        times = pd.to_datetime(["2000-01-01", "2000-01-02", "NaT"], unit="ns")


pandas.to_datetime will infer the precision from the input in pandas 3, so we explicitly specify the desired precision now.
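One way to make a test independent of the inferred precision is to cast the result to the desired unit after parsing. The sketch below is an alternative to the `unit="ns"` change in the diff above, not what the PR does, and the names `strings` and `times` are illustrative only.

```python
import pandas as pd

strings = ["2000-01-01", "2000-01-02", "NaT"]

# The unit inferred by to_datetime may differ between pandas versions,
# so cast the resulting index to nanosecond precision explicitly.
times = pd.to_datetime(strings).astype("datetime64[ns]")

print(times.dtype)
```

The cast leaves the values unchanged and keeps the missing entry as NaT.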

xarray/tests/test_combine.py

keewis

Thanks for the quick fixes! This looks good to me.

I know I have not been doing that either for the numpy>=2 changes, but I wonder if we should add a whats-new entry (internal changes)?

dcherian · 2024-06-10T15:49:37Z

Thanks @spencerkclark

* Address pandas-related upstream test failures
* Address more warnings
* Don't lose coverage for pandas < 3
* Address one more warning
* Fix accidental change from MS to ME
* Use datetime64[ns] arrays
* Switch to @pytest.mark.filterwarnings

* don't remove `netcdf4` from the upstream-dev environment
* also stop removing `h5py` and `hdf5`
* hard-code the precision (I believe this was missed in #9081)
* don't remove `h5py` either
* use on-disk _FillValue as standard expects, use view instead of cast to prevent OverflowError.
* whats-new
* unpin `numpy`
* rework UnsignedCoder
* add test
* Update xarray/coding/variables.py

Co-authored-by: Justus Magin <keewis@users.noreply.github.com>

---------

Co-authored-by: Kai Mühlbauer <kai.muehlbauer@uni-bonn.de>
Co-authored-by: Kai Mühlbauer <kmuehlbauer@wradlib.org>
Co-authored-by: Deepak Cherian <dcherian@users.noreply.github.com>

Address pandas-related upstream test failures

891fd6e

spencerkclark added the run-upstream Run upstream CI label Jun 9, 2024

spencerkclark force-pushed the upstream-failures-2024-06-09 branch 2 times, most recently from ed23adc to ad164b7 Compare June 9, 2024 17:30

Address more warnings

bd875d3

spencerkclark force-pushed the upstream-failures-2024-06-09 branch from ad164b7 to bd875d3 Compare June 9, 2024 17:39

spencerkclark added 2 commits June 9, 2024 13:48

Don't lose coverage for pandas < 3

616c179

Address one more warning

1a3bdf6

spencerkclark commented Jun 9, 2024

Fix accidental change from MS to ME

334d118

keewis mentioned this pull request Jun 10, 2024

Fix upcasting with python builtin numbers and numpy 2 #8946

Merged

4 tasks

spencerkclark added 3 commits June 10, 2024 07:39

Use datetime64[ns] arrays

36a005a

Switch to @pytest.mark.filterwarnings

85c95a1

Merge branch 'main' into upstream-failures-2024-06-09

d181b97

keewis approved these changes Jun 10, 2024

dcherian merged commit ef709df into pydata:main Jun 10, 2024

spencerkclark deleted the upstream-failures-2024-06-09 branch June 11, 2024 01:10

This was referenced Jun 12, 2024

⚠️ Nightly upstream-dev CI failed ⚠️ #9098

Closed

release notes for 2024.06.0 #9092

Merged

keewis added a commit to keewis/xarray that referenced this pull request Jun 19, 2024

hard-code the precision (I believe this was missed in pydata#9081)

429f87d

dcherian merged 8 commits into pydata:main from spencerkclark:upstream-failures-2024-06-09

3 participants
