BUG: Get full precision for 32 bit floating point random values.#20314
BUG: Get full precision for 32 bit floating point random values.#20314seberg merged 2 commits intonumpy:mainfrom
Conversation
The formula to convert a 32 bit random integer to a random float32,
(next_uint32(bitgen_state) >> 9) * (1.0f / 8388608.0f)
shifts by one bit too many, resulting in uniform float32 samples always
having a 0 in the least significant bit. The formula is corrected to
(next_uint32(bitgen_state) >> 8) * (1.0f / 16777216.0f)
Occurrences of the incorrect formula in numpy/random/tests/test_direct.py
were also corrected.
Closes numpygh-17478.
d754442 to
4b9e569
Compare
|
Change is safe. Whether it requires a release not only matters if you think the ls bit in a 32 bit float is worth one. Probably best to be safe. |
|
I added a release note about the change. |
|
Here's a script that demonstrates the changes in the variates that can occur, and shows why a release note is warranted. Script to print random samplesimport numpy as np
print(f"numpy version {np.__version__}")
print()
seed = 98765432109
print(f"seed: {seed}")
print()
print("rng.random")
rng = np.random.default_rng(seed)
x = rng.random(12, dtype=np.float32)
print("first 12 samples:")
print(x)
print()
n = 500_000_000
print("rng.standard_exponential")
rng = np.random.default_rng(seed)
x = rng.standard_exponential(size=n, dtype=np.float32)
x1 = x[-1]
x = rng.standard_exponential(size=n, dtype=np.float32)
x2 = x[-1]
print(f"last sample of {n:10d}:", x1)
print(f"last sample of {2*n:10d}:", x2)
print()
k = 2.5
print(f"rng.standard_gamma, k={k}")
rng = np.random.default_rng(seed)
x = rng.standard_gamma(2.5, size=n, dtype=np.float32)
print("first 14 samples:")
print(x[:14])
print(f"last sample of {n}:", x[-1])The output for the current main development branch: The output for this pull request: You can see the small variation in the ULP of the output of The outputs for |
|
Thanks @WarrenWeckesser! As far as I see, this only affects the new API, since this is However, since the streams do change I am removing the backport candidate label. Please just re-add if you disagree! |
|
This seems to be nearly an enhancement, even though is also a bug. While the intent was clearly to provide the maximum number of random bits, the nature of the bug only resulted in slightly less random values in most plausible scenarios. |
The formula to convert a 32 bit random integer to a random float32,
(((rng)->next_uint32((rng)->state) >> 9) * (1.0f / 8388608.0f))
shifts by one bit too many, resulting in uniform float32 samples always
having a 0 in the least significant bit. The formula is corrected to
(((rng)->next_uint32((rng)->state) >> 8) * (1.0f / 16777216.0f))
See numpy/numpy#20314 for more details.
The formula to convert a 32 bit random integer to a random float32,
(((rng)->next_uint32((rng)->state) >> 9) * (1.0f / 8388608.0f))
shifts by one bit too many, resulting in uniform float32 samples always
having a 0 in the least significant bit. The formula is corrected to
(((rng)->next_uint32((rng)->state) >> 8) * (1.0f / 16777216.0f))
See numpy/numpy#20314 for more details.
The formula to convert a 32 bit random integer to a random float32,
shifts by one bit too many, resulting in uniform float32 samples always
having a 0 in the least significant bit. The formula is corrected to
Occurrences of the incorrect formula in numpy/random/tests/test_direct.py
were also corrected.
Closes gh-17478.