Instances and warnings by lehins · Pull Request #58 · idontgetoutmuch/random

lehins · 2020-03-30T00:58:51Z

I finally got some time to add more improvements to our initiative. There are quite a few changes in this PR. I'll add some comments directly into the code for some of the decisions I made so you guys can respond back if you don't like any of those decisions.

First and foremost this PR includes implementation for all instance thus removing all warnings.
It simplifies the default implementation of RandomGen and MonadRandom, while keeping it backwards compatible (I was able to compile and run the test suite for pcgen with no changes to the code)
Improves haddock, tests and coverage

System/Random.hs

lehins · 2020-03-30T01:05:15Z

System/Random.hs

+-- let rolls :: Int -> [Word]
+--     rolls n = runGenState_
 --               (PCGen 17 29)
--               (\g -> randomListM g 10 >>= \xs -> return $ map ((+1) . (`mod` 6)) xs)


I added uniformListM as an exported function, it seems like it could be useful

I'm a fan of keeping the API surface small, particularly given that this is a core package (for some definition of "core"). "seems like it could be useful" is not strong enough IMO. I would prefer it if we didn't export uniformListM.

There is randoms, so I though it only made sense to add a monadic version of it. Also I think generating a list of values is gonna be an extremely common operation. Hell it even creeped up in documentation ;)

It seems to be have been removed from the example - fine with me

I removed it because using uniformR is a correct way of generating random die throws, rather than mod 6, cause it is biased that way.

System/Random.hs

lehins · 2020-03-30T01:11:49Z

System/Random.hs

+      (l32, g') ->
+        case genWord32 g' of
+          (h32, g'') ->
+            ((fromIntegral h32 `unsafeShiftL` 32) .|. fromIntegral l32, g'')


This approach assumes that older instances of RandomGen class generate 32bits or less of randomness at one time. It will still work for 64bits too, but it will reduce the quality of RNG to 32bits.

Another approach would be to switch to this:

genWord64 = randomIvalIntegral (minBound, maxBound)

But has terrible performance and is not good default implementation for RNG that define genWord32.

But this is the default so users can always replace it with something better if it is available?

@idontgetoutmuch That was my thinking as well.

lehins · 2020-03-30T01:13:47Z

System/Random.hs

-bitmaskWithRejection ::
-     (RandomGen g, FiniteBits a, Num a, Ord a, Random a)
+uniformIntegerM :: (MonadRandom g m) => (Integer, Integer) -> g -> m Integer
+uniformIntegerM (l, h) gen


I just converted randomIvalIntegral to the MonadRandom approach without trying to dive deeper of how the actual generation works, so if someone feels like analysing or improving the algorithm of Integer generation that would be great

I (think) I understand the algorithm: it is a variant of Classic Modulo which can be badly biased. By multiplying by the arbitrary 1000 this biasing becomes a 1000 times less.

Example of biasing: in this example with 8 bits of entropy, some numbers get picked 4/3 as often, multiplying by 1000 \approxeq 10 bits, some numbers get picked 4473/4472 as often. Of course the underlying RNG then has to do more work as well as still being (a bit) biased.

Data.Map.fromListWith (const (+1)) $ Prelude.map ((\x -> (x, 0)) . (`mod` 57)) [0..255] (`mod` 57)) [0..255] fromList [(0,4),(1,4),(2,4),(3,4),(4,4),(5,4),(6,4),(7,4),(8,4),(9,4),(10,4),(11,4),(12,4),(13,4),(14,4),(15,4),(16,4),(17,4),(18,4),(19,4),(20,4),(21,4),(22,4),(23,4),(24,4),(25,4),(26,4),(27,4),(28,3),(29,3),(30,3),(31,3),(32,3),(33,3),(34,3),(35,3),(36,3),(37,3),(38,3),(39,3),(40,3),(41,3),(42,3),(43,3),(44,3),(45,3),(46,3),(47,3),(48,3),(49,3),(50,3),(51,3),(52,3),(53,3),(54,3),(55,3),(56,3)] fromList [(0,4473),(1,4473),(2,4473),(3,4473),(4,4473),(5,4473),(6,4473),(7,4473),(8,4473),(9,4473),(10,4473),(11,4473),(12,4473),(13,4473),(14,4473),(15,4473),(16,4473),(17,4473),(18,4473),(19,4473),(20,4473),(21,4473),(22,4473),(23,4473),(24,4473),(25,4473),(26,4473),(27,4473),(28,4473),(29,4473),(30,4473),(31,4473),(32,4473),(33,4473),(34,4473),(35,4473),(36,4473),(37,4473),(38,4473),(39,4473),(40,4472),(41,4472),(42,4472),(43,4472),(44,4472),(45,4472),(46,4472),(47,4472),(48,4472),(49,4472),(50,4472),(51,4472),(52,4472),(53,4472),(54,4472),(55,4472),(56,4472)]

I should also add this function takes a generator with entropy less than 42(!) bits and makes it produce 42 bits of entropy. So if e.g. you have a generator with 1 bit of entropy then it will be invoked 42 times by this function.

A 1 bit of entropy generator if you ever need it

data BinRNG = BinRNG StdGen instance RandomGen BinRNG where next (BinRNG g) = (x `mod` 2, BinRNG g') where (x,g') = next g split (BinRNG g) = (BinRNG g1, BinRNG g2) where (g1,g2) = split g genRange _ = (0,1)

System/Random.hs

curiousleo · 2020-03-30T07:35:11Z

System/Random.hs

 Minimal complete definition: 'randomR' and 'random'.

 -}
-{-# DEPRECATED randomR "In favor of `uniformR`" #-}


Why do you think we should un-deprecate randomR?

I removed it for now, because it results in too many warnings everywhere. It looks like it is a very popular function. I think we should wait until next iteration on deprecating it.

curiousleo · 2020-03-30T07:42:38Z

System/Random.hs

+-- let rolls :: Int -> [Word]
+--     rolls n = runGenState_
 --               (PCGen 17 29)
--               (\g -> randomListM g 10 >>= \xs -> return $ map ((+1) . (`mod` 6)) xs)


I'm a fan of keeping the API surface small, particularly given that this is a core package (for some definition of "core"). "seems like it could be useful" is not strong enough IMO. I would prefer it if we didn't export uniformListM.

System/Random.hs

curiousleo · 2020-03-30T07:53:21Z

System/Random.hs

+  | otherwise = (bottom +) . fromUnsigned <$>
+    bitmaskWithRejectionM uniform range gen
+    where
+      range = toUnsigned top - toUnsigned bottom


I think it's worth explaining in a comment how this is exploiting overflow to get the correct behaviour.

I'd love to explain it but I am not sure I can explain it with words. It's just the math used works out with modulo arithmetic, that's all.

Okay, I'll give it a try:

-- The following works out thanks to modular arithmetic. Let @n@ be the -- width of the signed and unsigned integer type in question. -- -- Conceptually, -- -- toUnsigned x | x >= 0 = x -- toUnsigned x | x < 0 = 2^n + x -- -- The following combinations are possible: -- -- 1. @bottom >= 0@ and @top >= 0@ -- 2. @bottom < 0@ and @top >= 0@ -- 3. @bottom < 0@ and @top < 0@ -- -- Note that @bottom >= 0@ and @top < 0@ is impossible because of the -- invariant @bottom < top@ enforced in this equation by the guard in the -- function definition. -- -- For any signed integer @i@ of width @n@, we have: -- -- -2^(n-1) <= i <= 2^(n-1) - 1 -- -- Considering each combination in turn, we have -- -- 1. @bottom >= 0@ and @top >= 0@ -- -- range = (toUnsigned top - toUnsigned bottom) `mod` 2^n -- --^ top >= 0, so toUnsigned top == top -- --^ bottom >= 0, so toUnsigned bottom == bottom -- = (top - bottom) `mod` 2^n -- --^ top <= 2^(n-1) - 1 and bottom >= 0 -- --^ top - bottom <= 2^(n-1) - 1 -- --^ 0 < top - bottom <= 2^(n-1) - 1 -- = top - bottom -- -- 2. @bottom < 0@ and @top >= 0@ -- -- range = (toUnsigned top - toUnsigned bottom) `mod` 2^n -- --^ top >= 0, so toUnsigned top == top -- --^ bottom < 0, so toUnsigned bottom == 2^n + bottom -- = (top - (2^n + bottom)) `mod` 2^n -- --^ summand -2^n cancels out in calculation modulo 2^n -- = (top - bottom) `mod` 2^n -- --^ top <= 2^(n-1) - 1 and bottom >= -2^(n-1) -- --^ top - bottom <= (2^(n-1) - 1) - (-2^(n-1)) = 2^n - 1 -- --^ 0 < top - bottom <= 2^n - 1 -- = top - bottom -- -- 3. @bottom < 0@ and @top < 0@ -- -- range = (toUnsigned top - toUnsigned bottom) `mod` 2^n -- --^ top < 0, so toUnsigned top == 2^n + top -- --^ bottom < 0, so toUnsigned bottom == 2^n + bottom -- = ((2^n + top) - (2^n + bottom)) `mod` 2^n -- --^ summand 2^n cancels out in calculation modulo 2^n -- = (top - bottom) `mod` 2^n -- --^ top <= -1 -- --^ bottom >= -2^(n-1) -- --^ top - bottom <= -1 - (-2^(n-1)) = 2^(n-1) - 1 -- --^ 0 < top - bottom <= 2^(n-1) - 1 -- = top - bottom range = toUnsigned top - toUnsigned bottom

That is the kind of comments that should never appear in functions, IMHO, since they hide the simplicity of the actual code in the complexity of the comment. What we can do though is add this as an appendix, possibly at the end of the module or something, for those that actually do care reading long comments like that. I have a friend Niklas that loves long commentaries like that, he is also German, is that a cultural thing? :)

curiousleo · 2020-03-30T07:54:11Z

tests/Spec.hs

-main = defaultMain $ testGroup "Spec"
-    [ bitmaskSpecWord32, bitmaskSpecWord64
-    , rangeSpecWord32, rangeSpecDouble, rangeSpecFloat, rangeSpecInt
+main =


Nice! Thanks for improving the tests here.

curiousleo · 2020-03-30T13:32:53Z

I pushed a tiny change to run spec tests for Bool and Char as well: feac91f - hope you don't mind.

lehins · 2020-03-30T13:45:51Z

@curiousleo Not at all. Good catch.

…prove documentation

…rove tests and coverage

curiousleo

LGTM, happy to see this merged.

I have two suggestions which I've put into separate PRs: #63 and #64. We can merge them into instances-and-warnings before merging instances-and-warnings into interface-to-performance, or we can merge this PR now and I'll re-target #63 and #64 at interface-to-performance. I don't mind.

Mark all bitmask functions with signed/unsigned

Explain why range is computed correctly

idontgetoutmuch · 2020-04-01T10:11:33Z

@curiousleo and I have looked at various algorithms for ranges and the bitmask approach is good but not the best - the unbiased integer multiplication approach seems to be about x2 better depending on the benchmark. These benchmarks are in C(++) so not directly comparable. For ranges for 32 bit integers you need 64 bit words and for ranges for 64 bit integers you need 128 bit words which C supports but Haskell doesn't (at least I couldn't find out how to do it efficiently - there are various libraries around e.g. https://hackage.haskell.org/package/largeword-1.2.5/docs/Data-LargeWord.html and maybe with enought INLINE and SPECIALIZE pragmas these can be made efficient but my brain is not big enough).

My suggestion is we use the faster (unbiased integer multiplication) for 32 bit integers and use the bitmask approach for 64 bit integers.

idontgetoutmuch · 2020-04-01T15:21:23Z

System/Random.hs


  genWord32R :: Word32 -> g -> (Word32, g)
-  genWord32R m = randomIvalIntegral (minBound, m)
+  genWord32R m g = runGenState g (unsignedBitmaskWithRejectionM uniformWord32 m)


I propose

genWord32R m g = runGenState g (unbiasedIntMult32 m) unbiasedIntMult32 :: MonadRandom g m => Word32 -> g -> m Word32 unbiasedIntMult32 r g = go where t :: Word32 t = (-r) `mod` r -- Calculates 2^32 `mod` r!!! go = do x <- uniformWord32 g let m :: Word64 m = (fromIntegral x) * (fromIntegral r) l :: Word32 l = fromIntegral m if (l >= t) then return (fromIntegral $ m `shiftR` 32) else go

System/Random.hs

… in tests

Uncomment legacy tests

Resurrect legacy benchmarks

lehins · 2020-04-04T20:10:14Z

Ok, I think we all agree on changes in this PR, I am merging this one and retargeting all branches that depend on it

Instances and warnings

lehins force-pushed the instances-and-warnings branch from 583b375 to 3b86554 Compare March 30, 2020 01:03

lehins commented Mar 30, 2020

View reviewed changes

System/Random.hs Show resolved Hide resolved

lehins commented Mar 30, 2020

View reviewed changes

System/Random.hs Show resolved Hide resolved

lehins commented Mar 30, 2020

View reviewed changes

System/Random.hs Show resolved Hide resolved

curiousleo reviewed Mar 30, 2020

View reviewed changes

curiousleo mentioned this pull request Mar 30, 2020

Implement StdGen.genWord{8,16} #55

Closed

curiousleo mentioned this pull request Mar 30, 2020

Make bitmaskWithRejectionM inclusive in the upper bound #62

Merged

lehins and others added 6 commits March 30, 2020 18:54

Simplify default implementation of both RangomGen and MonadRandom. Im…

3676a17

…prove documentation

Adopt previous Integer range generator for UniformRange instance. Imp…

dc6257b

…rove tests and coverage

Instances and tests for CFloat and CDouble

c5c96c6

Fix doctests and stop an example writing a file in current dir

30ede52

Also test Char and Bool

b342e25

Make bitmaskWithRejectionM inclusive

64c0667

lehins force-pushed the instances-and-warnings branch from ff9ba76 to 64c0667 Compare March 30, 2020 15:55

curiousleo mentioned this pull request Mar 31, 2020

RandomGen: generate bitstring #43

Closed

1 task

Mark all bitmask functions with signed/unsigned

621c93e

curiousleo mentioned this pull request Mar 31, 2020

Mark all bitmask functions with signed/unsigned #63

Merged

Explain why range is computed correctly

c24c443

curiousleo mentioned this pull request Mar 31, 2020

Explain why range is computed correctly #64

Merged

curiousleo approved these changes Mar 31, 2020

View reviewed changes

lehins added 2 commits March 31, 2020 17:26

Merge pull request #63 from idontgetoutmuch/signed-unsigned-bitmask

e1af331

Mark all bitmask functions with signed/unsigned

Merge pull request #64 from idontgetoutmuch/explain-modulo

4bb71f0

Explain why range is computed correctly

idontgetoutmuch self-requested a review April 1, 2020 10:11

idontgetoutmuch reviewed Apr 1, 2020

View reviewed changes

System/Random.hs Show resolved Hide resolved

lehins and others added 4 commits April 3, 2020 05:39

Add instance for CBool and ensure correct instances for foreign types…

bbcee73

… in tests

Remove dead code

52784a5

Uncomment now-working tests

8669a69

Test more interesting Integer range

a81d241

idontgetoutmuch mentioned this pull request Apr 3, 2020

Coordination: changes to PRNG libraries #23

Open

8 tasks

curiousleo and others added 4 commits April 3, 2020 13:43

Resurrect legacy benchmarks

ce73692

Fix Coveralls CI build

d34570b

Merge pull request #68 from idontgetoutmuch/uncomment-tests

b7be4da

Uncomment legacy tests

Merge pull request #69 from idontgetoutmuch/legacy-benchmark

9574756

Resurrect legacy benchmarks

lehins merged commit 7173a80 into interface-to-performance Apr 4, 2020

curiousleo deleted the instances-and-warnings branch April 16, 2020 08:47

curiousleo pushed a commit that referenced this pull request May 19, 2020

Merge pull request #58 from idontgetoutmuch/instances-and-warnings

1611123

Instances and warnings

Conversation

lehins commented Mar 30, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

idontgetoutmuch Apr 2, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

idontgetoutmuch Apr 2, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

curiousleo commented Mar 30, 2020

Uh oh!

lehins commented Mar 30, 2020

Uh oh!

curiousleo left a comment

Choose a reason for hiding this comment

Uh oh!

idontgetoutmuch commented Apr 1, 2020

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

lehins commented Apr 4, 2020

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

lehins commented Mar 30, 2020 •

edited

Loading

idontgetoutmuch Apr 2, 2020 •

edited

Loading

idontgetoutmuch Apr 2, 2020 •

edited

Loading