Address latest pandas-related upstream test failures #9081
dcherian merged 8 commits into pydata:main
ed23adc to ad164b7
ad164b7 to bd875d3
    (datetime(2000, 1, 1), has_pandas_3),
    (np.array([datetime(2000, 1, 1)]), has_pandas_3),
With pandas 3, pd.Series(datetime.datetime(...)) will produce a Series with np.datetime64[us] values instead of np.datetime64[ns] values, so this conversion now warns.
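A minimal sketch of the behaviour described above, assuming only that pandas is installed. The inferred dtype depends on the pandas version, so the sketch inspects it rather than asserting a particular resolution, then casts explicitly to nanoseconds. The variable names are illustrative and do not come from the PR.

```python
from datetime import datetime

import pandas as pd

# Build a Series from a datetime.datetime object. The resulting resolution is
# version-dependent: nanoseconds before pandas 3, microseconds under pandas 3
# according to the review comment above.
series = pd.Series([datetime(2000, 1, 1)])
inferred_dtype = series.dtype

# An explicit cast gives nanosecond precision regardless of the pandas version.
series_ns = series.astype("datetime64[ns]")

print(inferred_dtype, "->", series_ns.dtype)
```

Casting explicitly is one way to keep a test independent of the inferred resolution; the PR itself instead adjusts the expected-warning flag via has_pandas_3.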
      # create test index
    - dd = times.to_pydatetime()
    - reference_dates = [dd[0], dd[2]]
    + reference_dates = [times[0], times[2]]
As far as I can tell, whether reference_dates started as datetime.datetime objects or np.datetime64[ns] values was not material to this test, so I removed the conversion to datetime.datetime to avoid the conversion warning under pandas 3 (the times would previously get converted back to datetime64[ns] values in the DataArray constructor).
xarray/tests/test_dataarray.py
Outdated
    - roundtripped = DataArray.from_dict(da.to_dict())
    + with warnings.catch_warnings():
    +     warnings.filterwarnings("ignore", message="Converting non-nanosecond")
    +     roundtripped = DataArray.from_dict(da.to_dict())
da.to_dict() produces datetime.datetime objects, which under pandas 3 lead to a conversion warning in the DataArray constructor.
If we have this pattern in multiple modules, it might be worth adding the code as a special context manager to xarray.tests.__init__. Something like this might work (I didn't check):

    from contextlib import contextmanager
    import warnings

    @contextmanager
    def ignore_warnings(category=None, pattern=None):
        if category is None and pattern is None:
            raise ValueError("need at least one of category and pattern")
        # Pass only the filters that were given: warnings.filterwarnings
        # accepts None for neither message nor category.
        kwargs = {}
        if pattern is not None:
            kwargs["message"] = pattern
        if category is not None:
            kwargs["category"] = category
        with warnings.catch_warnings():
            warnings.filterwarnings("ignore", **kwargs)
            yield
Thanks. I ended up switching to marking these tests with @pytest.mark.filterwarnings("ignore:Converting non-nanosecond"), since we already use that pattern elsewhere in the tests.
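Both the context manager and the pytest mark rely on the standard library's warning filters, where the message is a regular expression matched against the start of the warning text. The sketch below uses only the standard library to show that prefix matching; `emit_conversion_warning` is a hypothetical stand-in for the xarray constructor, and its message text is illustrative rather than copied from xarray.

```python
import warnings


def emit_conversion_warning():
    # Hypothetical stand-in for code that warns about precision conversion.
    warnings.warn(
        "Converting non-nanosecond precision datetime values to nanosecond precision.",
        UserWarning,
    )


def count_warnings(ignore_message=None):
    """Return how many warnings get through when an optional ignore filter is set."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        if ignore_message is not None:
            # Inserted in front of the "always" filter, so it takes precedence.
            warnings.filterwarnings("ignore", message=ignore_message)
        emit_conversion_warning()
    return len(caught)


unfiltered = count_warnings()
prefix_filtered = count_warnings("Converting non-nanosecond")
# A pattern that only matches the middle of the message does not suppress it.
middle_filtered = count_warnings("non-nanosecond")
```

Because the pattern anchors at the start of the message, a filter such as "ignore:Converting non-nanosecond" needs to begin with the warning's opening words.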
    - darray = DataArray(data, dims=["time"])
    - darray.coords["time"] = np.array([datetime(2017, m, 1) for m in month])
    + times = pd.date_range(start="2017-01-01", freq="ME", periods=12)
    + darray = DataArray(data, dims=["time"], coords=[times])
Use of datetime.datetime objects was immaterial to this test, so we use pd.date_range to produce the dates instead to avoid the non-nanosecond conversion warning.
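A short sketch of the difference between the two ways of building the dates, assuming numpy and pandas are installed. It uses freq="MS" (month start), which matches the first-of-month dates in the original list and is the frequency the commit list says was restored after the accidental switch to "ME". The variable names are illustrative.

```python
from datetime import datetime

import numpy as np
import pandas as pd

# A NumPy array of datetime.datetime objects has object dtype, so a later
# conversion step has to infer a datetime resolution from the objects.
object_times = np.array([datetime(2017, m, 1) for m in range(1, 13)])

# pd.date_range produces a DatetimeIndex with a datetime64 dtype directly.
index_times = pd.date_range(start="2017-01-01", freq="MS", periods=12)

# Both describe the same twelve month-start dates.
same_dates = list(index_times.to_pydatetime()) == list(object_times)

print(object_times.dtype, index_times.dtype, same_dates)
```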
xarray/tests/test_variable.py
Outdated
    with warnings.catch_warnings():
        warnings.filterwarnings("ignore", message="Converting non-nanosecond")
        expected = self.cls("t", dates)
This is needed since dates sometimes consists of datetime.datetime objects, which leads to a conversion warning under pandas 3.
      def test_roundtrip_numpy_datetime_data(self) -> None:
    -     times = pd.to_datetime(["2000-01-01", "2000-01-02", "NaT"])
    +     times = pd.to_datetime(["2000-01-01", "2000-01-02", "NaT"], unit="ns")
pandas.to_datetime will infer the precision from the input in pandas 3, so we explicitly specify the desired precision now.
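One way to check and pin the resolution after parsing is sketched below. This is an alternative illustration, not the change made in the PR: it assumes pandas 2.0 or later, where `DatetimeIndex.as_unit` is available, and the variable names are illustrative.

```python
import pandas as pd

# Parse the same inputs as the test. The resulting resolution may differ
# between pandas versions, so it is inspected rather than assumed.
times = pd.to_datetime(["2000-01-01", "2000-01-02", "NaT"])
inferred_unit = times.dtype

# Convert explicitly to nanosecond resolution (requires pandas >= 2.0).
times_ns = times.as_unit("ns")

print(inferred_unit, "->", times_ns.dtype)
```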
keewis left a comment
Thanks for the quick fixes! This looks good to me.
I know I have not been doing that either for the numpy>=2 changes, but I wonder if we should add a whats-new entry (internal changes)?
Thanks @spencerkclark
* Address pandas-related upstream test failures
* Address more warnings
* Don't lose coverage for pandas < 3
* Address one more warning
* Fix accidental change from MS to ME
* Use datetime64[ns] arrays
* Switch to @pytest.mark.filterwarnings
* don't remove `netcdf4` from the upstream-dev environment
* also stop removing `h5py` and `hdf5`
* hard-code the precision (I believe this was missed in #9081)
* don't remove `h5py` either
* use on-disk _FillValue as the standard expects, use view instead of cast to prevent OverflowError.
* whats-new
* unpin `numpy`
* rework UnsignedCoder
* add test
* Update xarray/coding/variables.py

Co-authored-by: Justus Magin <keewis@users.noreply.github.com>
---------
Co-authored-by: Kai Mühlbauer <kai.muehlbauer@uni-bonn.de>
Co-authored-by: Kai Mühlbauer <kmuehlbauer@wradlib.org>
Co-authored-by: Deepak Cherian <dcherian@users.noreply.github.com>
This PR addresses the upstream failures described in #8844 (comment) with a few minor changes to ensure that, for the time being, nanosecond precision times continue to be used in xarray. These failures stem from pandas-dev/pandas#55901, which causes pandas.to_datetime to infer the precision to use based on its input instead of always using nanosecond precision.
See the review comments for an explanation of the changes.