Ignore DeprecationWarning on np.fix #12248

Merged
TomAugspurger merged 1 commit into dask:main from TomAugspurger:tom/numpy-fix-depr
Jan 23, 2026

Conversation

@TomAugspurger
Member

@TomAugspurger TomAugspurger commented Jan 22, 2026

`np.fix` is deprecated in favor of `truncate` in numpy dev, which caused this CI failure: https://github.com/dask/dask/actions/runs/21265711665/job/61204253570#step:6:37038

Closes #12245

This PR ignores that warning. We want to keep testing `fix` until it's removed, but we don't care that it's deprecated.
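For context, a minimal sketch of this approach (not the actual diff from this PR): suppress only the targeted `DeprecationWarning` while still exercising `np.fix`, so the test keeps running until the function is actually removed. The message pattern below is an assumption, not copied from the PR.

```python
import warnings

import numpy as np

# Sketch only (not this PR's diff): call np.fix while ignoring its
# DeprecationWarning. On numpy versions where fix is not yet deprecated,
# the filter is a harmless no-op.
with warnings.catch_warnings():
    warnings.filterwarnings(
        "ignore",
        message=".*fix.*",  # assumed pattern; match the real warning text
        category=DeprecationWarning,
    )
    result = np.fix([1.7, -1.7, 0.2])

print(result)  # fix rounds toward zero: [ 1. -1.  0.]
```

In a pytest suite, the equivalent per-test form is the built-in marker, e.g. `@pytest.mark.filterwarnings("ignore::DeprecationWarning")`, which scopes the filter to a single test instead of the whole process.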

@github-actions
Contributor

Unit Test Results

See test report for an extended history of previous test failures. This is useful for diagnosing flaky tests.

0 tests  ±0   0 ✅ ±0   0s ⏱️ ±0s
0 suites ±0   0 💤 ±0 
0 files   ±0   0 ❌ ±0 

Results for commit 5c91b97. ± Comparison against base commit 80a90b2.

@TomAugspurger
Member Author

The CI failure here is from an unrelated and seemingly flaky `read_parquet` test. I haven't been able to reproduce it yet:

_____________ test_roundtrip_partitioned_pyarrow_dataset[pyarrow] ______________
[gw0] linux -- Python 3.11.14 /home/runner/miniconda3/envs/test-environment/bin/python

self = Index([1, 2], dtype='int32'), key = 2

    def get_loc(self, key):
        """
        Get integer location, slice or boolean mask for requested label.
    
        Parameters
        ----------
        key : label
            The key to check its location if it is present in the index.
    
        Returns
        -------
        int if unique index, slice if monotonic index, else mask
            Integer location, slice or boolean mask.
    
        See Also
        --------
        Index.get_slice_bound : Calculate slice bound that corresponds to
            given label.
        Index.get_indexer : Computes indexer and mask for new index given
            the current index.
        Index.get_non_unique : Returns indexer and masks for new index given
            the current index.
        Index.get_indexer_for : Returns an indexer even when non-unique.
    
        Examples
        --------
        >>> unique_index = pd.Index(list("abc"))
        >>> unique_index.get_loc("b")
        1
    
        >>> monotonic_index = pd.Index(list("abbc"))
        >>> monotonic_index.get_loc("b")
        slice(1, 3, None)
    
        >>> non_monotonic_index = pd.Index(list("abcb"))
        >>> non_monotonic_index.get_loc("b")
        array([False,  True, False,  True])
        """
        casted_key = self._maybe_cast_indexer(key)
        try:
>           return self._engine.get_loc(casted_key)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

../../../miniconda3/envs/test-environment/lib/python3.11/site-packages/pandas/core/indexes/base.py:3641: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
pandas/_libs/index.pyx:168: in pandas._libs.index.IndexEngine.get_loc
    ???
pandas/_libs/index.pyx:197: in pandas._libs.index.IndexEngine.get_loc
    ???
pandas/_libs/hashtable_class_helper.pxi:4823: in pandas._libs.hashtable.Int32HashTable.get_item
    ???
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

>   ???
E   KeyError: 2

pandas/_libs/hashtable_class_helper.pxi:4847: KeyError

The above exception was the direct cause of the following exception:

tmpdir = local('/tmp/pytest-of-runner/pytest-0/popen-gw0/test_roundtrip_partitioned_pya0')
engine = 'pyarrow'

    @PYARROW_MARK
    def test_roundtrip_partitioned_pyarrow_dataset(tmpdir, engine):
        # See: https://github.com/dask/dask/issues/8650
    
        import pyarrow.parquet as pq
        from pyarrow.dataset import HivePartitioning, write_dataset
    
        # Sample data
        df = pd.DataFrame({"col1": [1, 2], "col2": ["a", "b"]})
    
        # Write partitioned dataset with dask
        dask_path = tmpdir.mkdir("foo-dask")
        ddf = dd.from_pandas(df, npartitions=2)
        ddf.to_parquet(dask_path, engine=engine, partition_on=["col1"], write_index=False)
    
        # Write partitioned dataset with pyarrow
        pa_path = tmpdir.mkdir("foo-pyarrow")
        table = pa.Table.from_pandas(df)
        write_dataset(
            data=table,
            base_dir=pa_path,
            basename_template="part.{i}.parquet",
            format="parquet",
            partitioning=HivePartitioning(pa.schema([("col1", pa.int32())])),
        )
    
        # Define simple function to ensure results should
        # be comparable (same column and row order)
        def _prep(x):
            return x.sort_values("col2")[["col1", "col2"]]
    
        # Check that reading dask-written data is the same for pyarrow and dask
        df_read_dask = dd.read_parquet(dask_path, engine=engine)
        df_read_pa = pq.read_table(dask_path).to_pandas()
        assert_eq(_prep(df_read_dask), _prep(df_read_pa), check_index=False)
    
        # Check that reading pyarrow-written data is the same for pyarrow and dask
        df_read_dask = dd.read_parquet(pa_path, engine=engine)
        df_read_pa = pq.read_table(pa_path).to_pandas()
>       assert_eq(_prep(df_read_dask), _prep(df_read_pa), check_index=False)

dask/dataframe/io/tests/test_parquet.py:3734: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
dask/dataframe/utils.py:540: in assert_eq
    assert_divisions(a, scheduler=scheduler)
dask/dataframe/utils.py:599: in assert_divisions
    if not hasattr(ddf, "divisions"):
           ^^^^^^^^^^^^^^^^^^^^^^^^^
dask/dataframe/dask_expr/_collection.py:353: in divisions
    return self.expr.divisions
           ^^^^^^^^^^^^^^^^^^^
../../../miniconda3/envs/test-environment/lib/python3.11/functools.py:1001: in __get__
    val = self.func(instance)
          ^^^^^^^^^^^^^^^^^^^
dask/dataframe/dask_expr/_expr.py:432: in divisions
    return tuple(self._divisions())
                 ^^^^^^^^^^^^^^^^^
dask/dataframe/dask_expr/_expr.py:2207: in _divisions
    return super()._divisions()
           ^^^^^^^^^^^^^^^^^^^^
dask/dataframe/dask_expr/_expr.py:595: in _divisions
    if not self._broadcast_dep(arg):
           ^^^^^^^^^^^^^^^^^^^^^^^^
dask/dataframe/dask_expr/_expr.py:586: in _broadcast_dep
    return dep.npartitions == 1 and dep.ndim < self.ndim
           ^^^^^^^^^^^^^^^
dask/dataframe/dask_expr/_shuffle.py:803: in npartitions
    return self.operand("npartitions") or len(self._divisions()) - 1
                                              ^^^^^^^^^^^^^^^^^
dask/dataframe/dask_expr/_shuffle.py:1008: in _divisions
    divisions, mins, maxes, presorted = _get_divisions(
dask/dataframe/dask_expr/_shuffle.py:1333: in _get_divisions
    result = _calculate_divisions(
dask/dataframe/dask_expr/_shuffle.py:1357: in _calculate_divisions
    divisions, mins, maxes = compute(
dask/base.py:685: in compute
    results = schedule(expr, keys, **kwargs)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
dask/dataframe/io/parquet/core.py:82: in __call__
    return read_parquet_part(
dask/dataframe/io/parquet/core.py:183: in read_parquet_part
    dfs = [
dask/dataframe/io/parquet/core.py:184: in <listcomp>
    func(
dask/dataframe/io/parquet/arrow.py:572: in read_partition
    arrow_table = cls._read_table(
dask/dataframe/io/parquet/arrow.py:1723: in _read_table
    len(arrow_table), partition.keys.get_loc(cat), dtype="i4"
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = Index([1, 2], dtype='int32'), key = 2

    def get_loc(self, key):
        """
        Get integer location, slice or boolean mask for requested label.
    
        Parameters
        ----------
        key : label
            The key to check its location if it is present in the index.
    
        Returns
        -------
        int if unique index, slice if monotonic index, else mask
            Integer location, slice or boolean mask.
    
        See Also
        --------
        Index.get_slice_bound : Calculate slice bound that corresponds to
            given label.
        Index.get_indexer : Computes indexer and mask for new index given
            the current index.
        Index.get_non_unique : Returns indexer and masks for new index given
            the current index.
        Index.get_indexer_for : Returns an indexer even when non-unique.
    
        Examples
        --------
        >>> unique_index = pd.Index(list("abc"))
        >>> unique_index.get_loc("b")
        1
    
        >>> monotonic_index = pd.Index(list("abbc"))
        >>> monotonic_index.get_loc("b")
        slice(1, 3, None)
    
        >>> non_monotonic_index = pd.Index(list("abcb"))
        >>> non_monotonic_index.get_loc("b")
        array([False,  True, False,  True])
        """
        casted_key = self._maybe_cast_indexer(key)
        try:
            return self._engine.get_loc(casted_key)
        except KeyError as err:
            if isinstance(casted_key, slice) or (
                isinstance(casted_key, abc.Iterable)
                and any(isinstance(x, slice) for x in casted_key)
            ):
                raise InvalidIndexError(key) from err
>           raise KeyError(key) from err
E           KeyError: 2

../../../miniconda3/envs/test-environment/lib/python3.11/site-packages/pandas/core/indexes/base.py:3648: KeyError

If anyone else sees it, LMK.

@TomAugspurger TomAugspurger merged commit 6dff73f into dask:main Jan 23, 2026
26 of 27 checks passed


Development

Successfully merging this pull request may close these issues.

⚠️ Upstream CI failed ⚠️