BUG: Correct loop order in MT19937 jump by bashtage · Pull Request #16153 · numpy/numpy

bashtage · 2020-05-04T17:00:09Z

Use the original loop order instead of an inverted order

seberg · 2020-05-04T17:18:18Z

Thanks Kevin! @fzeiser and @dubzzz could you have a quick look whether this makes sense to you, and gives the expected jump values? Not that I don't trust that, but always good to have a second pair of eyes, and you are more knowledgeable.

@bashtage I guess this should have a release note due to the changed stream if jump is used?

bashtage · 2020-05-04T17:25:26Z

If anyone is interested, the code to produce the test values is in this commit.

dubzzz · 2020-05-04T17:52:25Z

numpy/random/tests/test_generator_mt19937.py

I'd probably add another test covering the issue I originally opened:

{ # You already have this one "seed": 0, "steps": 10, "initial": {"key_md5": "64eaf265d2203179fb5ffb73380cd589", "pos": 9}, "jumped": {"key_md5": "8cb7b061136efceef5217a9ce2cc9a5a", "pos": 598}, }, { # But don't have this one "seed": 0, "steps": 11, "initial": {"key_md5": "64eaf265d2203179fb5ffb73380cd589", "pos": 10}, "jumped": {"key_md5": "8cb7b061136efceef5217a9ce2cc9a5a", "pos": 599}, },

The added test ensures that we have the following behaviour:
jump(next(mt19937)) == next(jump(mt19937)) - or something close

The only guarantee that I think we should try and make is that the code is faithful to the original author's code. Whether this is true or not I think is a different issue.

If I understand it correctly, the test you propose will not work (in the current implementation / original authors source code) / tried it out and it didn't work:

jump(next(mt19937)) == next(jump(mt19937)) - or something close

However, if this is the case, then I don't think that this pull request closes #15394, does it?
Or put it the other way around: With the current state of the PR, shouldn't the documentation be adopted too?

Current documentation:

The state of the returned big generator is jumped as-if 2**(128 * jumps) random numbers have been generated.

On this note:

The only guarantee that I think we should try and make is that the code is faithful to the original author's code. Whether this is true or not I think is a different issue.

To what extend is this the policy? I totally understand that you may not have the time to check that all implementation are "true" - but still I think it's worrisome if the random generator behaves in a way we don't understand. So what should the documentation of jump say? If I exaggerate a little: the original author claims that it is as if 2*(128 * jumps) random numbers have been generated; but when we test it, that it not true`

Another note: Shall we implement the same type of test

jump(next(mt19937)) == next(jump(mt19937)) - or something close

in the tests for PCG64? There it should pass; but nice to have this ensured for the future, too.

Depends on what the point of #15394 is. If it is to check the implementation, then I think this PR closes that. If it is to make jumped work in a way different from the original code, then it does not. I don't think the latter is realistic.

My original point when opening the issue was the following:

As a user of the library I saw that the jump was supposed to offset by a fixed number of iterations. So my expectation and the reason why I opened the issue was jump(next(mt19937)) should always be the same as next(jump(mt19937)) - or at least we might somehow be able to find the same sequence close enough as we should jump very close in both cases. If it is not the case it seems problematic as it looks like the 2**128 skipped iterations are not true so I cannot safely rely on a jump if we don't know where it jumps.

Actually I found this issue while implementing a jump for mersenne twister on a JavaScript library I am maintaining. I already had a working implementation of the mersenne generator but needed a reference one for the jump to compare my results to it. So I compared it to the jump of numpy. In the set of tests, I use on my generators one of them - based on property based testing (see hypothesis for Python) - checks that: jump(next(generator)) == next(jump(generator)). And here is how I found the problem.

dubzzz · 2020-05-04T18:09:23Z

numpy/random/src/mt19937/mt19937-jump.c

🤔

In this case I'd maybe call mt19937_gen before jumping. As in https://github.com/numpy/numpy/blob/28f9cbc5e5510cd780b9d6a18a8e8bc6b45a13d7/numpy/random/src/mt19937/mt19937.h#L27-L43

Reseting pos to 0 when pos is N means that the final generator after jump will be: 0 + 2**128
While if pos was N-2 it would be: N - 2 + 2**128
While if pos was N-1 it would be: N - 1 + 2**128

I think the best we can do here is to have fidelity with the original jump implementation. Going off in other directions seems unnecessarily risky.

I think I agree, except some ease of mind, I honestly do not know why there would be a use-case to generate the same identical stream after different order of operation? The only thing is could be to document it, but since this is the reference implementation...

I totally agree that if the problem comes from the reference implementation then it is another issue.

What I was saying is more that given the documentation of this method, as a user I expect that calling jump will offset the generator by 2**128 iterations which should be the same as calling next 2**128 times in a row. Saying that next(jump) !== jump(next) somehow points that this is not the case 🤔

@bashtage maybe we can just tweak the docs a little to be clear that it only jumps about 2**128 states, 2**128 states from the 0 offset postion (whatever it is)?

People still need a mental model of what's going on. From what you've added, I'd still expect next(jumped()) == jumped(next()) to hold true. If that's not true, then what are the guarantees that we can rely on to reason about our programs?

My guess is that the jump is for the whole state and ignores the offset for the next item. @bashtage Is that the case? If so, it should be possible to work around that.

I'm going to try and contact the author to see if he has any notes to share

@bashtage Did you succeed into getting in touch with the author of the paper? I'd really be interested to know the reason why jump seems to jump at very different locations based on how many times we offset the rng before jumping 🤔

As a reminder my original issue was:

from numpy.random import MT19937 rng = MT19937(42) rng.random_raw() # value given state offset of 0 rng.random_raw() # value given state offset of 1 rng = rng.jumped() rng.random_raw() # value given state offset of 2**128 +2 # Got: 2382575482 rng = MT19937(42) rng.random_raw() # value given state offset of 0 rng = rng.jumped() rng.random_raw() # value given state offset of 2**128 +1 rng.random_raw() # value given state offset of 2**128 +2 # Got: 2001582899

It seems that the current version of numpy (1.19.4) still has a similiar behaviour. Which seems unexpected given what jump should do 🤔

@bashtage Just bringing up the point of dubzzz again: Did you hear back anything of the original author of the jump code paper?

I recently revisited this issue on pure-rand side with the help of Claude. It is still a work-in-progress but it somehow inspired itself from the numpy implementation but made it (in JavaScript) with the jump>next is equivalent to next>jump. I'll will have to re-read it carefully to get why it suddenly worked but given the tests being executed it seems that it made it through.

PR here.

I'm gonna keep you posted when I understand the diff with its implementation and mine (which was translated from numpy)

fzeiser · 2020-05-04T18:33:41Z

numpy/random/src/mt19937/mt19937-poly.h

Makes sense and is easily readable! Do you think it makes sense to refer to the commit (5055c43) or a maybe rather version number (1.18.1 ?) of randomgen, too?

bashtage · 2020-05-04T21:35:22Z

Looks like the jumped code is LE only. BE will need (painful) investigation.

charris · 2020-05-04T22:43:01Z

Should we issue a warning/error on big endian machines until it is fixed?

bashtage · 2020-05-05T09:26:31Z

@charris The bug was in my use of hashlib, not in the C code. CI is passing now.

charris · 2020-05-12T00:54:45Z

CircleCI is complaining about a missing reference.

Use the original loop order instead of an inverted order closes numpy#15394

Refactor polynomial to be unsigned long array Remove unused code Fix md5 calculation on BE

Add note detailing the changes to MT19937 jumped

Remove unused file containing the old polynomial representation.

Clarify the method used and the source of the code

numpy/random/_mt19937.pyx

Fix indent on reference and remove text that may be incorrect.

charris · 2020-05-12T23:12:18Z

Passing now. I'll put this in tomorrow to allow others to comment if they wish.

charris · 2020-05-12T23:16:12Z

numpy/random/_mt19937.pyx

+        References
+        ----------
+        .. [1] Matsumoto, M, Generating multiple disjoint streams of
+           pseudorandom number sequences.  Accessed on: May 6, 2020.


I suspect that the the title should be in quotes, but I don't know if that makes a difference. I'm not familiar with the citation format we use. @mattip @rkern thoughts?

This may be related to regexp for renaming references in numpydoc... looking into it now

hmm doesn't seem like it, at least not a problem with the section I looked at:

https://github.com/numpy/numpydoc/blob/b4c5fd17e2b85c2416a5e586933eee353b58bf7c/numpydoc/numpydoc.py#L43-L67

charris · 2020-05-13T13:38:49Z

Thanks Kevin.

Implement MT19937 jump-ahead by 2^128 steps using Horner's method on the characteristic polynomial in GF(2). Key changes: - Switch next() from batch-twist to lazy gen_next model so that jump and next use the same recurrence, guaranteeing commutativity - Add 624-element jump polynomial (2^128 step) from Hiroshima University's jump-ahead algorithm - Implement jump() via Horner's method with genNext/addState helpers - Add reference value tests and property-based tests (noOrderNextJump, changeSelfWithJump, noChangeOnClonedWithJump) References: - Hiroshima University MT jump: http://www.math.sci.hiroshima-u.ac.jp/m-mat/MT/JUMP/ - numpy MT19937 jump (corrected): numpy/numpy#16153 - Previous attempt: #12 https://claude.ai/code/session_01Jaz8u7FqjJQUkipr8LPbfr

seberg added 00 - Bug 56 - Needs Release Note. Needs an entry in doc/release/upcoming_changes component: numpy.random labels May 4, 2020

dubzzz reviewed May 4, 2020

View reviewed changes

fzeiser reviewed May 4, 2020

View reviewed changes

seberg added this to the 1.19.0 release milestone May 4, 2020

seberg removed the 56 - Needs Release Note. Needs an entry in doc/release/upcoming_changes label May 5, 2020

charris closed this May 12, 2020

charris reopened this May 12, 2020

bashtage added 5 commits May 12, 2020 22:39

BUG: Correct loop order in MT19937 jump

3bc2c0d

Use the original loop order instead of an inverted order closes numpy#15394

REF: Refactor jump code

b79be48

Refactor polynomial to be unsigned long array Remove unused code Fix md5 calculation on BE

DOC: Add release note

344f0f2

Add note detailing the changes to MT19937 jumped

MAINT: Remove unused file

f7d740c

Remove unused file containing the old polynomial representation.

DOC: Improve the docstring for MT19937.jumped

95309c7

Clarify the method used and the source of the code

bashtage force-pushed the fix-mt19937-jump branch from 8f46630 to 95309c7 Compare May 12, 2020 21:39

charris reviewed May 12, 2020

View reviewed changes

numpy/random/_mt19937.pyx Show resolved Hide resolved

DOC: Update docstring

ac72f8d

Fix indent on reference and remove text that may be incorrect.

charris reviewed May 12, 2020

View reviewed changes

charris merged commit 77410e2 into numpy:master May 13, 2020

bashtage deleted the fix-mt19937-jump branch May 28, 2020 21:44

Uh oh!

Conversation

bashtage commented May 4, 2020

Uh oh!

seberg commented May 4, 2020

Uh oh!

bashtage commented May 4, 2020

Uh oh!

dubzzz May 4, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

fzeiser May 5, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

dubzzz May 6, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

bashtage commented May 4, 2020

Uh oh!

charris commented May 4, 2020

Uh oh!

bashtage commented May 5, 2020

Uh oh!

charris commented May 12, 2020

Uh oh!

Uh oh!

charris commented May 12, 2020

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

charris commented May 13, 2020

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

7 participants

dubzzz May 4, 2020 •

edited

Loading

fzeiser May 5, 2020 •

edited

Loading

dubzzz May 6, 2020 •

edited

Loading