Ignore DeprecationWarning on np.fix #12248
Merged
TomAugspurger merged 1 commit into dask:main on Jan 23, 2026
Conversation
`np.fix` is deprecated in favor of `truncate` in numpy dev, which caused this CI failure: https://github.com/dask/dask/actions/runs/21265711665/job/61204253570#step:6:37038

This PR ignores that warning. We want to keep testing `fix` until it's removed, but we don't care that it's deprecated.
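A minimal sketch of the approach, in case it helps reviewers: suppress only the deprecation notice while still exercising `np.fix`. The message pattern below is illustrative; the actual PR may filter the warning differently (e.g. via pytest's `filterwarnings`).

```python
import warnings

import numpy as np

# Keep calling np.fix so its behavior stays under test, but silence the
# DeprecationWarning that numpy dev builds emit for it. On released numpy
# versions no warning is raised, so this filter is a harmless no-op.
with warnings.catch_warnings():
    warnings.filterwarnings("ignore", category=DeprecationWarning, message=".*fix.*")
    result = np.fix([-1.7, 2.3])  # rounds toward zero

print(result)  # [-1.  2.]
```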
Contributor
Unit Test Results — see the test report for an extended history of previous test failures; this is useful for diagnosing flaky tests. 0 tests ±0, 0 ✅ ±0, 0s ⏱️ ±0s. Results for commit 5c91b97; comparison against base commit 80a90b2.
The CI failure here is from an unrelated and seemingly flaky test:

_____________ test_roundtrip_partitioned_pyarrow_dataset[pyarrow] ______________
[gw0] linux -- Python 3.11.14 /home/runner/miniconda3/envs/test-environment/bin/python
self = Index([1, 2], dtype='int32'), key = 2
def get_loc(self, key):
"""
Get integer location, slice or boolean mask for requested label.
Parameters
----------
key : label
The key to check its location if it is present in the index.
Returns
-------
int if unique index, slice if monotonic index, else mask
Integer location, slice or boolean mask.
See Also
--------
Index.get_slice_bound : Calculate slice bound that corresponds to
given label.
Index.get_indexer : Computes indexer and mask for new index given
the current index.
Index.get_non_unique : Returns indexer and masks for new index given
the current index.
Index.get_indexer_for : Returns an indexer even when non-unique.
Examples
--------
>>> unique_index = pd.Index(list("abc"))
>>> unique_index.get_loc("b")
1
>>> monotonic_index = pd.Index(list("abbc"))
>>> monotonic_index.get_loc("b")
slice(1, 3, None)
>>> non_monotonic_index = pd.Index(list("abcb"))
>>> non_monotonic_index.get_loc("b")
array([False, True, False, True])
"""
casted_key = self._maybe_cast_indexer(key)
try:
> return self._engine.get_loc(casted_key)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
../../../miniconda3/envs/test-environment/lib/python3.11/site-packages/pandas/core/indexes/base.py:3641:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
pandas/_libs/index.pyx:168: in pandas._libs.index.IndexEngine.get_loc
???
pandas/_libs/index.pyx:197: in pandas._libs.index.IndexEngine.get_loc
???
pandas/_libs/hashtable_class_helper.pxi:4823: in pandas._libs.hashtable.Int32HashTable.get_item
???
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
> ???
E KeyError: 2
pandas/_libs/hashtable_class_helper.pxi:4847: KeyError
The above exception was the direct cause of the following exception:
tmpdir = local('/tmp/pytest-of-runner/pytest-0/popen-gw0/test_roundtrip_partitioned_pya0')
engine = 'pyarrow'
@PYARROW_MARK
def test_roundtrip_partitioned_pyarrow_dataset(tmpdir, engine):
# See: https://github.com/dask/dask/issues/8650
import pyarrow.parquet as pq
from pyarrow.dataset import HivePartitioning, write_dataset
# Sample data
df = pd.DataFrame({"col1": [1, 2], "col2": ["a", "b"]})
# Write partitioned dataset with dask
dask_path = tmpdir.mkdir("foo-dask")
ddf = dd.from_pandas(df, npartitions=2)
ddf.to_parquet(dask_path, engine=engine, partition_on=["col1"], write_index=False)
# Write partitioned dataset with pyarrow
pa_path = tmpdir.mkdir("foo-pyarrow")
table = pa.Table.from_pandas(df)
write_dataset(
data=table,
base_dir=pa_path,
basename_template="part.{i}.parquet",
format="parquet",
partitioning=HivePartitioning(pa.schema([("col1", pa.int32())])),
)
# Define simple function to ensure results should
# be comparable (same column and row order)
def _prep(x):
return x.sort_values("col2")[["col1", "col2"]]
# Check that reading dask-written data is the same for pyarrow and dask
df_read_dask = dd.read_parquet(dask_path, engine=engine)
df_read_pa = pq.read_table(dask_path).to_pandas()
assert_eq(_prep(df_read_dask), _prep(df_read_pa), check_index=False)
# Check that reading pyarrow-written data is the same for pyarrow and dask
df_read_dask = dd.read_parquet(pa_path, engine=engine)
df_read_pa = pq.read_table(pa_path).to_pandas()
> assert_eq(_prep(df_read_dask), _prep(df_read_pa), check_index=False)
dask/dataframe/io/tests/test_parquet.py:3734:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
dask/dataframe/utils.py:540: in assert_eq
assert_divisions(a, scheduler=scheduler)
dask/dataframe/utils.py:599: in assert_divisions
if not hasattr(ddf, "divisions"):
^^^^^^^^^^^^^^^^^^^^^^^^^
dask/dataframe/dask_expr/_collection.py:353: in divisions
return self.expr.divisions
^^^^^^^^^^^^^^^^^^^
../../../miniconda3/envs/test-environment/lib/python3.11/functools.py:1001: in __get__
val = self.func(instance)
^^^^^^^^^^^^^^^^^^^
dask/dataframe/dask_expr/_expr.py:432: in divisions
return tuple(self._divisions())
^^^^^^^^^^^^^^^^^
dask/dataframe/dask_expr/_expr.py:2207: in _divisions
return super()._divisions()
^^^^^^^^^^^^^^^^^^^^
dask/dataframe/dask_expr/_expr.py:595: in _divisions
if not self._broadcast_dep(arg):
^^^^^^^^^^^^^^^^^^^^^^^^
dask/dataframe/dask_expr/_expr.py:586: in _broadcast_dep
return dep.npartitions == 1 and dep.ndim < self.ndim
^^^^^^^^^^^^^^^
dask/dataframe/dask_expr/_shuffle.py:803: in npartitions
return self.operand("npartitions") or len(self._divisions()) - 1
^^^^^^^^^^^^^^^^^
dask/dataframe/dask_expr/_shuffle.py:1008: in _divisions
divisions, mins, maxes, presorted = _get_divisions(
dask/dataframe/dask_expr/_shuffle.py:1333: in _get_divisions
result = _calculate_divisions(
dask/dataframe/dask_expr/_shuffle.py:1357: in _calculate_divisions
divisions, mins, maxes = compute(
dask/base.py:685: in compute
results = schedule(expr, keys, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
dask/dataframe/io/parquet/core.py:82: in __call__
return read_parquet_part(
dask/dataframe/io/parquet/core.py:183: in read_parquet_part
dfs = [
dask/dataframe/io/parquet/core.py:184: in <listcomp>
func(
dask/dataframe/io/parquet/arrow.py:572: in read_partition
arrow_table = cls._read_table(
dask/dataframe/io/parquet/arrow.py:1723: in _read_table
len(arrow_table), partition.keys.get_loc(cat), dtype="i4"
^^^^^^^^^^^^^^^^^^^^^^^^^^^
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
self = Index([1, 2], dtype='int32'), key = 2
def get_loc(self, key):
"""
Get integer location, slice or boolean mask for requested label.
Parameters
----------
key : label
The key to check its location if it is present in the index.
Returns
-------
int if unique index, slice if monotonic index, else mask
Integer location, slice or boolean mask.
See Also
--------
Index.get_slice_bound : Calculate slice bound that corresponds to
given label.
Index.get_indexer : Computes indexer and mask for new index given
the current index.
Index.get_non_unique : Returns indexer and masks for new index given
the current index.
Index.get_indexer_for : Returns an indexer even when non-unique.
Examples
--------
>>> unique_index = pd.Index(list("abc"))
>>> unique_index.get_loc("b")
1
>>> monotonic_index = pd.Index(list("abbc"))
>>> monotonic_index.get_loc("b")
slice(1, 3, None)
>>> non_monotonic_index = pd.Index(list("abcb"))
>>> non_monotonic_index.get_loc("b")
array([False, True, False, True])
"""
casted_key = self._maybe_cast_indexer(key)
try:
return self._engine.get_loc(casted_key)
except KeyError as err:
if isinstance(casted_key, slice) or (
isinstance(casted_key, abc.Iterable)
and any(isinstance(x, slice) for x in casted_key)
):
raise InvalidIndexError(key) from err
> raise KeyError(key) from err
E KeyError: 2
../../../miniconda3/envs/test-environment/lib/python3.11/site-packages/pandas/core/indexes/base.py:3648: KeyError

If anyone else sees it then LMK.
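For context, the raising call is pandas' `Index.get_loc` on a two-element int32 index. Under normal conditions the same lookup succeeds, which is part of why the failure looks flaky rather than deterministic. A quick sketch of the ordinary behavior:

```python
import pandas as pd

# The traceback shows Index([1, 2], dtype='int32').get_loc(2) raising
# KeyError; in a healthy environment the lookup resolves to position 1.
idx = pd.Index([1, 2], dtype="int32")
loc = idx.get_loc(2)
print(loc)  # 1
```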
Closes #12245