Skip to content

BUG: Fix np.einsum errors on Power9 Linux and z/Linux#14693

Merged
mattip merged 7 commits intonumpy:masterfrom
jwoehr:einsum_c_buglet
Oct 20, 2019
Merged

BUG: Fix np.einsum errors on Power9 Linux and z/Linux#14693
mattip merged 7 commits intonumpy:masterfrom
jwoehr:einsum_c_buglet

Conversation

@jwoehr
Copy link
Copy Markdown
Contributor

@jwoehr jwoehr commented Oct 14, 2019

BUG: fixes #14692 on Power 9 and z/Linux

@seberg
Copy link
Copy Markdown
Member

seberg commented Oct 14, 2019

So clang-tidy warns about: out_labels[ndim++] = label; (line 1946) and i = out_label - output_labels; (line 2055) and axes[i] = match - labels; (line 2232) (narrowing casts). I have no idea if those are actually particularly problematic, but thought I would mention.

@jwoehr
Copy link
Copy Markdown
Contributor Author

jwoehr commented Oct 14, 2019

So clang-tidy warns about: out_labels[ndim++] = label; (line 1946) and i = out_label - output_labels; (line 2055) and axes[i] = match - labels; (line 2232) (narrowing casts). I have no idea if those are actually particularly problematic, but thought I would mention.

I will explore this tomorrow. Thank you for all your help!

@seberg seberg changed the title fixes #14692 np.einsum errors on Power9 Linux and z/Linux BUG: Fix np.einsum errors on Power9 Linux and z/Linux Oct 15, 2019
@mattip
Copy link
Copy Markdown
Member

mattip commented Oct 19, 2019

Every fix should have a test case. I am not sure we need 4d4cc4d, can you add a test that fails before and passes after to prove it is required? The fix in 68861de is definitely needed for np.einsum('abab', x).

* need it to be signed here.
*/
label = (signed char)labels[idim];
label = labels[idim];
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why this change?

* need it to be signed here.
*/
int label = (signed char)labels[idim];
int label = labels[idim];
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why this change?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mattip we did that because it made it work on s390x and power9.
Of course it broke x86_64!
I thought I had reverted to a clean copy of master and made your change.
Now I think I made a MisTeAk :)
I will look at what I did and push again.

@mattip
Copy link
Copy Markdown
Member

mattip commented Oct 19, 2019

The test should be added to numpy/core/tests/test_einsum.py, maybe next to test_einsum_fixed_collapsingbug. Note that random.rand(10, 10, 10, 10) should be spelled random.random_sample((10, 10, 10, 10)). The test should mention the issues gh-14692 and gh-12689.

@jwoehr
Copy link
Copy Markdown
Contributor Author

jwoehr commented Oct 19, 2019

The test should be added to numpy/core/tests/test_einsum.py, maybe next to test_einsum_fixed_collapsingbug. Note that random.rand(10, 10, 10, 10) should be spelled random.random_sample((10, 10, 10, 10)). The test should mention the issues gh-14692 and gh-12689.

If it passes testing, I'll add that, thanks, @mattip

@jwoehr
Copy link
Copy Markdown
Contributor Author

jwoehr commented Oct 19, 2019

Note that random.rand(10, 10, 10, 10) should be spelled random.random_sample((10, 10, 10, 10)).

Hmm, this does not work:

tensor = np.random.random_sample(10, 10, 10, 10)
  File "mtrand.pyx", line 370, in numpy.random.mtrand.RandomState.random_sample
TypeError: random_sample() takes at most 1 positional argument (4 given)

@mattip
Copy link
Copy Markdown
Member

mattip commented Oct 19, 2019

Are you sure you used my code? It looks different to me.

@mattip
Copy link
Copy Markdown
Member

mattip commented Oct 19, 2019

The documented developer workflow is to add tests as part of the development process. I find it helpful to:

  • reproduce a failure locally from the CLI with python runtests.py --python which will compile and use a HEAD version of numpy
  • add a test that fails when run locally with python runtests.py -t path/to/changed-file
  • add a fix and verify that the test suite runs locally with python runtests.py
  • push the test and the fix together.

This makes sure your fix is correct and prevents unneeded CI runs. In this case, I needed to dive in with gdb to find out exactly what was going wrong so a test was critical to find the missing cast.

@jwoehr
Copy link
Copy Markdown
Contributor Author

jwoehr commented Oct 19, 2019

Are you sure you used my code? It looks different to me.

aha :) Lots of parentheses

@jwoehr
Copy link
Copy Markdown
Contributor Author

jwoehr commented Oct 19, 2019

w/r/t the test case, does it have to test some kind of result, or is it enough that the failure is that an exception is thrown?

@charris charris added 06 - Regression 09 - Backport-Candidate PRs tagged should be backported labels Oct 19, 2019
@charris charris added this to the 1.17.4 release. milestone Oct 19, 2019
# Bug with signed vs unsigned char errored on power9 and s390x Linux
tensor = np.random.random_sample((10, 10, 10, 10))
print(np.einsum('ijij->', tensor))

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe (please check locally before pushing)

x = np.einsum('ijij->', tensor)
y = tensor.trace(axis1=0, axis2=2).trace()
assert_equal(x, y)

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With floating point, you can't be sure that x and y will be exactly equal. Use assert_allclose with an appropriately small tolerance, or use an integer array for tensor (assuming the test doesn't actually depend on the data type of the array).

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done; tested on x86_64, power9, and s390x; and pushed.

@mattip mattip merged commit 9c8e904 into numpy:master Oct 20, 2019
@mattip
Copy link
Copy Markdown
Member

mattip commented Oct 20, 2019

Thanks @jwoehr

@jwoehr jwoehr deleted the einsum_c_buglet branch October 20, 2019 04:08
charris added a commit to charris/numpy that referenced this pull request Nov 8, 2019
charris added a commit to charris/numpy that referenced this pull request Nov 8, 2019
@charris charris removed the 09 - Backport-Candidate PRs tagged should be backported label Nov 8, 2019
@charris charris removed this from the 1.17.4 release. milestone Nov 8, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

BUG: Fix np.einsum errors on Power9 Linux and z/Linux.

5 participants