API: Add rng.spawn(), bit_gen.spawn(), and bit_gen.seed_seq#23195
API: Add rng.spawn(), bit_gen.spawn(), and bit_gen.seed_seq#23195charris merged 5 commits intonumpy:mainfrom
rng.spawn(), bit_gen.spawn(), and bit_gen.seed_seq#23195Conversation
d38cf2c to
9c25f48
Compare
76e5a9f to
6288b80
Compare
rkern
left a comment
There was a problem hiding this comment.
Generally, 👍. I think it's a good time.
The main narrative documentation about spawning should be reworked to use Generator.spawn() primarily. That can contain code examples of how it works and relates to the underlying SeedSequence so that information can be removed from the docstrings here, where concision constrains us from really explaining it clearly.
numpy/random/bit_generator.pyx
Outdated
| raise TypeError( | ||
| "The underlying SeedSequence does not implement spawning. " | ||
| "You must ensure a custom SeedSequence used for initializing " | ||
| "the random state implements spawning (and registers it).") |
There was a problem hiding this comment.
Custom SeedSequences don't have to implement spawning. Most reasons for implementing a custom SeedSequence won't support spawning; the most common reason will be to match a seeding algorithm from some other system, like the legacy MT19937 seeding. They just won't support spawning. I'd cut this message off at the first sentence.
numpy/random/_generator.pyx
Outdated
| This is a convenience method to safely spawn new random number | ||
| generators via the `numpy.random.SeedSequence.spawn` mechanism. | ||
| The original seed sequence is used by the bit generator and accessible | ||
| as ``Generator.bit_generator.seed_seq``. |
There was a problem hiding this comment.
I'm not sure what this sentence is trying to convey in this context. I'm not sure why the address of the SeedSequence is important here. On first reading, it read like it was claiming that the original SeedSequence was available on the spawned generators. I think this might work better as an Example rather than prose, if it needs to be said in this docstring.
There was a problem hiding this comment.
Felt like pointing where things come from. But, I agree that it is probably easy enough to find when a user actually wants to (and the vast majority shouldn't).
Instead added a See Also section, that seems more useful.
I gave a shot expanding the docs a bit. The main info about spawning is currently in the bit generator, right now only inserted the new approach there.
(Noted jumped in BitGenerator.spawn(), not sure it is helpful, and maybe See Also of jumped should point back to spawn?)
There was a problem hiding this comment.
The main part of the narrative docs I was thinking of was the Parallel RNG page. It could wait for another doc-focused PR, though.
There was a problem hiding this comment.
🤦 sorry, gave it a shot to expand those a bit minimally, since that is really the right place to link to. Not sure that is the best approach so happy to try and redo more.
1919b90 to
939693b
Compare
| from numpy.random import PCG64, SeedSequence | ||
| # High quality initial entropy | ||
| entropy = 0x87351080e25cb0fad77a44a3be03b491 | ||
| base_rng = PCG(entropy) |
There was a problem hiding this comment.
| base_rng = PCG(entropy) | |
| base_rng = PCG64(entropy) |
There was a problem hiding this comment.
But this is a BitGenerator, not a Generator. Use default_rng()
This makes the seed sequence interface more public facing by: 1. Adding `BitGenerator.seed_seq` to give clear access to `_seed_seq` 2. Add `spawn()` to both the generator and the bit generator as convenience methods for spawning new instances. I somewhat remember that we always meant to consider making this more public and adding such convenient methods, but did not do so originally. So, now, I do wonder whether it is time to make this fully public? It would be nice to follow up at some point with a bit of best practices. This also doesn't add it to the `RandomState`, although doing it via `RandomState._bit_generator` is of course valid. Can we define as this kind of access as stable enough that downstream libraries could use it? I fear that backcompat with `RandomState` might make adopting newer things like spawning hard for libraries?
Trying to address Robert Kerns review comments.
049bd25 to
4e6f77d
Compare
|
Thanks Sebastian. |
|
Please note that the documentation for |
This makes the seed sequence interface more public facing by:
BitGenerator.seed_seqto give clear access to_seed_seqspawn()to both the generator and the bit generator as convenience methods for spawning new instances.I somewhat remember that we always meant to consider making this more public and adding such convenient methods, but did not do so originally.
So, now, I do wonder whether it is time to make this fully public?
It would be nice to follow up at some point with a bit of best practices. This also doesn't add it to the
RandomState, although doing it viaRandomState._bit_generatoris of course valid.Can we define as this kind of access as stable enough that downstream libraries could use it? I fear that backcompat with
RandomStatemight make adopting newer things like spawning hard for libraries?This PR came out of a discussion I had today with @betatim from sklearn (not 100% related). I am not quite sure where we are on doing this (or something similar), but thought it may be time.