BUG: sparse.linalg: Cast index arrays to intc before calling SuperLU functions by perimosocordiae · Pull Request #18644 · scipy/scipy

perimosocordiae · 2023-06-07T13:11:17Z

Reference issue

What does this implement/fix?

The SuperLU code expects sparse matrix index arrays to have type intc, so this PR adds a cast prior to those calls. See the linked issue for context.

perimosocordiae · 2023-06-07T14:28:12Z

Test failure seems to be unrelated, only appearing on "Linux Meson tests / Python-debug & ATLAS":

FAILED ../linalg/tests/test_decomp_cossin.py::test_cossin_separate[float64] - AssertionError: 
Not equal to tolerance rtol=0, atol=2.22045e-15

Mismatched elements: 3897 / 6400 (60.9%)
Max absolute difference: 0.72137716
Max relative difference: 2.
 x: array([[ 0.107602,  0.008004,  0.163714, ..., -0.094327, -0.021723,
        -0.074046],
       [ 0.044608, -0.063854,  0.031058, ...,  0.085616,  0.000242,...
 y: array([[ 0.107602,  0.008004,  0.163714, ..., -0.094327,  0.021723,
        -0.074046],
       [ 0.044608, -0.063854,  0.031058, ...,  0.085616, -0.000242,...
= 1 failed, 42037 passed, 2802 skipped, 147 xfailed, 8 xpassed in 1030.29s (0:17:10) =

ilayn · 2023-06-07T15:50:51Z

For some reason that test is flaky on that run. I've restarted it. Typically it passes on the second run but I'll check if the lapack version is the issue

WarrenWeckesser · 2023-06-07T16:11:59Z

From the little that is visible in the error log, it looks like a column of one of the results might have a flipped sign. Is this another case of the solution not being unique?

ilayn · 2023-06-11T18:14:54Z

From the little that is visible in the error log, it looks like a column of one of the results might have a flipped sign. Is this another case of the solution not being unique?

I though the same but only this job is consistantly flaking. Hence I'm not sure what is happening there.

dschult · 2023-06-15T03:08:17Z

Looks like casting='safe' doesn't allow downcasting. I think we either want the default casting rule or to use casting=same_kind. (My reading of the astype docstring is that casting doesn't check values when casting. It only checks that the types are related according to the rule requested. And the casting='safe' rule doesn't allow int64 -> int32 because that might not preserve values. If we want to check the values to see whether preserving values is in danger, we'd have to do that ourselves.

I'd suggest we remove the casting='safe' argument in this PR (revert the latest commit?)

stefanv · 2023-06-15T03:29:36Z

Looks like casting='safe' doesn't allow downcasting. I think we either want the default casting rule or to use casting=same_kind. (My reading of the astype docstring is that casting doesn't check values when casting. It only checks that the types are related according to the rule requested. And the casting='safe' rule doesn't allow int64 -> int32 because that might not preserve values. If we want to check the values to see whether preserving values is in danger, we'd have to do that ourselves.

I'd suggest we remove the casting='safe' argument in this PR (revert the latest commit?)

The problem this is trying to address:

In [3]: np.int64(123123123123123123).astype(np.int32)
Out[3]: 23327667

Indeed, in my initial tests, I tested on scalars and not arrays, which behave differently. Scalar behavior, desired:

In [4]: np.int64(123123123123123123).astype(np.int32, casting='safe')
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[4], line 1
----> 1 np.int64(123123123123123123).astype(np.int32, casting='safe')

TypeError: Cannot cast scalar from dtype('int64') to dtype('int32') according to the rule 'safe'`

In [5]: np.int64(12312312).astype(np.int32, casting='safe')
Out[5]: 12312312

Array behavior, not desired:

In [9]: np.array([12312312,], dtype=np.int64).astype(np.int32, casting='safe')
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[9], line 1
----> 1 np.array([12312312,], dtype=np.int64).astype(np.int32, casting='safe')

TypeError: Cannot cast array data from dtype('int64') to dtype('int32') according to the rule 'safe'

I'm not sure what the idiomatic way is of spelling "cast when the values fit" as it does in the scalar case. Do we need an explicit value check? Something like:

ii = np.iinfo(np.int32)
np.all(x <= ii.max && x >= ii.min)

dschult · 2023-06-15T05:01:03Z

I think this PR isn't trying to fix the problem of downcasting. That problem already existed for this library. At least that is my understanding for this comment from #18603)
It "aggressively downcast" which I assume to mean downcast without checking.

I think this PR is trying to get back to the state before #18509 ... where the index is aggressively downcast so it can work with SPLU. That is, it is a quick fix. The slower fix would be to make SPLU work for int64. I guess the above suggestion about checking values before downcasting is a "middle-speed" fix.

What's the performance hit for

ii = np.iinfo(np.int32)
np.all(x <= ii.max && x >= ii.min)

Is it negligible compared to the SPLU computations?

Am I understanding the conversation correctly?

stefanv · 2023-06-15T06:08:11Z

My concern is that downcasting aggressively will return some result, but it won't be a correct result for the case where int64 indices were needed (i.e., large sparse matrix). SuperLU cannot handle those arrays, so we should warn the users. But it is probably true that this incorrect behavior was in SciPy for many versions.

This reverts commit 59e7578. Safe casting does not behave the way I thought, i.e. it does not take values into consideration. Reverting this patch will cast large indices to a wrapped around int32 version, which is incorrect, but which has also been the status quo for a long time. A future patch should put in place safeguards against incorrect index casting.

stefanv · 2023-06-15T06:13:15Z

I've reverted the patch anyway, since that was not the correct solution to the problem.

perimosocordiae · 2023-06-16T14:51:51Z

Using casting='same_kind' would probably be the right thing here, but we know that sparse containers will only have int32 or int64 index arrays so it's not really that useful in practice.

Re: the scalar vs array behavior difference, that looks related to numpy/numpy#8987

If we want to check for overflow explicitly, we could do some fast checks up front before the slow check:

indptr[-1] <= int32max because we know that indptr is always sorted
max(*shape) <= int32max as a cheap way to avoid checking the full indices array
indices.max() <= int32max only in cases where (1) was true and (2) was false

seberg · 2023-06-19T10:33:42Z

There is unfortunately no clear idiomatic way to spell "cast if fits". For advanced indexing (with arrays), NumPy even just uses "same kind" and gets away with it, although there we are talking about indices that are by definition impossibly large to be a valid index.

There was talk about adding an asindex() or so ufunc for the spefic purprose of doing this type of casts (I think also allowing floats if they are exact integers). I don't mind doing that, but I am not sure if that helps you.
In some cases np.int32(int(...)) might actually be a valid work-around (or operator.index() first, since int() is agressive).

Whether (or how) to apply cast safety especially to Python scalars is something I haven't quite figured out to be honest. (I am leaning to just considering them "safe", but raising anyway when it doesn't fit; of course the NumPy integers overflow on the same thing though :()

Sorry for probably too much details. Introducing a cast mode (or changing "same kind") that checks values may be a way forward. That does require explicit loops for that purprose, although unless we wish to support float -> int in this manner as well (which pandas might like I think), that shouldn't be super hard.

manually check for safe downcasting

tupui · 2023-06-26T17:56:15Z

AFAIU this would fix some new bugs in 1.11? I put the backport-candidate label so this can get in as soon as possible.

perimosocordiae · 2023-06-28T01:19:08Z

I just pushed a fix commit. Pending CI results, I think this should now be in good shape.

Note: The CircleCI build_scipy action failed with a suspicious message: no such option: --install-option (for the pip command).

tupui · 2023-06-28T05:03:20Z

For the issue in the CI, you need to merge main into your PR. CircleCI is not good with that 😅

dschult · 2023-06-28T13:09:31Z

Then it looks like the test failures are not indicating issues with this PR.
I agree with @perimosocordiae that this is in good shape.

tylerjereddy

It sounds like folks agree this should go in and get backported. The CI failures don't look related and when I rebase this branch locally on main it passes the full testsuite on x86_64 Linux.

Due to these bugs scipy/scipy#18644, scipy/scipy#18603

* Add current_through_paths * Require scipy<1.11 Due to these bugs scipy/scipy#18644, scipy/scipy#18603 * Update test_solution.py * Run notebooks

BUG: sparse.linalg: Cast to intc before SuperLU

df69cf4

github-actions Bot added scipy.sparse scipy.sparse.linalg labels Jun 7, 2023

perimosocordiae added defect A clear bug or issue that prevents SciPy from being installed or used as expected scipy.sparse and removed scipy.sparse labels Jun 7, 2023

stefanv reviewed Jun 7, 2023

View reviewed changes

Comment thread scipy/sparse/linalg/_dsolve/linsolve.py

Use casting='safe' for index array casts

59e7578

manually check for safe downcasting

d77c03c

perimosocordiae mentioned this pull request Jun 16, 2023

manually check for safe downcasting perimosocordiae/scipy#8

Merged

help func to check safe downcast using quick-checks when possible

a71bbc3

Merge pull request #8 from dschult/linsolve-intc

6916b63

manually check for safe downcasting

dschult mentioned this pull request Jun 26, 2023

BUG: test failures in NetworkX related to sparse.linalg.splu #18752

Closed

rossbar reviewed Jun 26, 2023

View reviewed changes

Comment thread scipy/sparse/linalg/_dsolve/linsolve.py Outdated

tupui added the backport-candidate This fix should be ported by a maintainer to previous SciPy versions. label Jun 26, 2023

tupui added this to the 1.11.1 milestone Jun 26, 2023

tupui linked an issue Jun 26, 2023 that may be closed by this pull request

BUG: test failures in NetworkX related to sparse.linalg.splu #18752

Closed

Fix _safe_downcast_indices

829a498

rossbar mentioned this pull request Jun 28, 2023

MAINT: SciPy 1.11.1 backports #18779

Merged

tylerjereddy modified the milestones: 1.11.1, 1.11.2 Jun 28, 2023

tylerjereddy approved these changes Jul 2, 2023

View reviewed changes

tylerjereddy merged commit 8501b7c into scipy:main Jul 2, 2023

perimosocordiae mentioned this pull request Jul 7, 2023

Buffer dtype mismatch, expected 'ITYPE_t' but got 'long' #18716

Closed

loganbvh added a commit to loganbvh/py-tdgl that referenced this pull request Jul 19, 2023

Require scipy<1.11

6cec6e1

Due to these bugs scipy/scipy#18644, scipy/scipy#18603

loganbvh added a commit to loganbvh/py-tdgl that referenced this pull request Jul 20, 2023

Add current_through_paths (#24)

bcea2ff

* Add current_through_paths * Require scipy<1.11 Due to these bugs scipy/scipy#18644, scipy/scipy#18603 * Update test_solution.py * Run notebooks

tylerjereddy mentioned this pull request Aug 10, 2023

MAINT: prepare for SciPy 1.11.2 #19040

Merged

5 tasks

tylerjereddy removed the backport-candidate This fix should be ported by a maintainer to previous SciPy versions. label Aug 17, 2023

dschmitz89 mentioned this pull request Aug 19, 2023

ENH: update vendored SuperLU version #19092

Closed

lesteve mentioned this pull request Sep 15, 2023

TST Extend tests for scipy.sparse/*array in sklearn/manifold/tests/test_spectral_embedding scikit-learn/scikit-learn#27240

Merged

Uh oh!

Conversation

perimosocordiae commented Jun 7, 2023

Reference issue

What does this implement/fix?

Uh oh!

perimosocordiae commented Jun 7, 2023

Uh oh!

ilayn commented Jun 7, 2023

Uh oh!

WarrenWeckesser commented Jun 7, 2023

Uh oh!

Uh oh!

ilayn commented Jun 11, 2023

Uh oh!

dschult commented Jun 15, 2023

Uh oh!

stefanv commented Jun 15, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

dschult commented Jun 15, 2023

Uh oh!

stefanv commented Jun 15, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

stefanv commented Jun 15, 2023

Uh oh!

perimosocordiae commented Jun 16, 2023

Uh oh!

seberg commented Jun 19, 2023

Uh oh!

Uh oh!

tupui commented Jun 26, 2023

Uh oh!

perimosocordiae commented Jun 28, 2023

Uh oh!

tupui commented Jun 28, 2023

Uh oh!

dschult commented Jun 28, 2023

Uh oh!

tylerjereddy left a comment

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

9 participants

stefanv commented Jun 15, 2023 •

edited

Loading

stefanv commented Jun 15, 2023 •

edited

Loading