Design for app that heavily uses pseudorandom numbers

I’m considering Haskell for a data-intensive application.

The app needs pseudo-random numbers throughout the logic stack. Many of the low-level mathematical functions absolutely need random numbers, so it seems like the design choices are:

  • Live in a stateful monad that owns the PRNG state machine, so most of the app is impure

  • Explicitly pass random numbers into all the low-level mathematical code

  • Explicitly pass the generator into all the low-level mathematical code, and explicitly return the new PRNG state to the caller.

These all have have code smell before I’ve written the code. Is there a better way? Thanks everyone.

3 Likes

You could make a RandomM monad that is basically a Reader/State but specific to generating randomness. With a few helper functions, it could be pretty ergonomic, I think. e.g.:

newtype RandomM a = { runRandomM :: RandomGen -> (a, RandomGen) }

instance Monad RandomM where
  m >>= f = RandomM $ \gen ->
    let (a, newGen) = m `runRandomM` gen
     in f a `runRandomM` newGen

instance Applicative RandomM where
  pure = RandomM . flip (,)
  fab <*> fa = RandomM $ \gen ->
    let (f, gen2) = fab `runRandomM` gen
        (a, newGen) = fa `runRandom` gen2
     in (f a, newGen)

instance Functor RandomM where
  fmap f (RandomM a) = RandomM $ \gen -> (f a, gen)

withRandom :: Random r => (r -> a) -> RandomM a
withRandom f = RandomM $ \gen ->
    let (n, newGen) = random gen
     in (f n, newGen)

-- Example function that needs something random internally
calculation :: Arg1 -> Arg2 -> RandomM Arg3
calculation a b =
  withRandom $ \r ->
    -- if you make sure the random arg is last,
    -- this could just be @withRandom $ someFunc a b@
    someFunction r a b

-- Example usage without side-effects
app :: RandomM Int
app = do
  -- calculation using randomness
  a <- calculation 10 5
  -- impromptu extra value
  b <- withRandom $ \r -> r * a
  -- any normal function can just be a let
  let c = pureFunc a b
  withRandom $ \r -> c `div` (r `mod` 3)

N.B. I did not try to compile this, this is just straight from keyboard into text.


This would not be impure AFAICT; given the same RandomGen, this would always produce the same outcome.

You can use split to propagate unique generators in a less nasty way that State-monad-esque passing back and forth. That’s another option.

1 Like

Thank you for the thoughtful answer. The challenge here is that randomness is pervasive through the math, so the typical f(g(h(…(x)))) has to deal with the monad heavily.

I think I’ve convinced myself that there’s no good answer. Any code that implicitly uses randomness doesn’t have referential transparency and therefore isn’t a pure function, so it has to live in a monad, or I need to eliminate its implicit state by passing around the PRNG stream or generator state.

Random algorithms and PRNGs always seemed to be a wart on pure FP, hadn’t really thought it through until now.

2 Likes

…or a library:

It’s one of several implementations using the technique described in the functional pearl On generating unique names (1994):

The technique reduces the need for sequencing by only requiring supplies to be arguments - there’s no need to explicitly return a supply as a component of a larger result (this being dealt with by the library definitions).

1 Like

Isn’t this a code smell on its own? I mean, it doesn’t matter the language. If you want ref. transparency you have to explicitly set the seed value.

Is this a problem? If your PRNG is lifted into your application in a monadic fashion, you can compose with Kleisli composition so instead of f . g . h $ x you do f <=< g <=< h $ x (given than f, g, and h use and modify the PRNG properly)

1 Like

Is this a problem?

For some of us, yes:

1 Like

This is the style I recommend too

Depends on your definition of randomness :wink:

data G a = G a a a

randG :: forall a. Hashable a => Uniform a => a -> a -> G a
randG x y =
  let gen = mkStdGen (hash x `xor` hash y)
      (z, _) = uniform gen
   in G x y z

How exactly would using e.g. G Double values everywhere be an improvement over using value-supply?

There are several approaches, neither of them devoid of code smell.
First, notice that <=< is the composition in Kleisli categories,
so while you can not write f(g(h(x))) you can write (f.g.h) x provided the three functions live in a Kleisli category.

Second, others have resorted to domain-specific languages, where you can write ordinary lambda functions but have a probabilistic choice operator at your disposal, too. The drawback is that these DSLs must be interpreted, and your PRNG approach is effectively one way to embed such a probabilistic lambda calculus into ordinary Haskell. Unfortunately I know no quasi-quoters that do this translation for you.

You might want to experiment with LazyPPL which internally uses a technique similar to the previously suggested value-supply package: At the core of one of the monads is an infinitely wide and deep tree of (possibly PRNG-generated) random numbers.

data Tree = Tree Double [Tree]
newtype Prob a = Prob (Tree -> a)

Regarding referential transparency, as @Ambrose mentioned it depends on the point of view: If the semantics of a variable x is a probability distribution, then we might very well call a probabilistic program referentially transparent, since one can talk about equality of distributions. If the semantics is a stream of PRNG samples, however, then one must ask what a good equality relation for these streams is.

Thank you for the idea. ValueSupply and its functional pearl are clever but the app code would still be referentially transparent, so repeated calls to the app function can’t get new random numbers.

At this point the monadic solutions seem to best satisfy the app needs, but then all almost the low-level math code must be yanked into the monad, which isn’t going to win any clean code awards.

If the generator is defined entirely in regular Haskell, yes. But since newSupply & co. require an IO context then you can also use IO to generate the initial (seed) value from an external source e.g. the system clock:

do ...
     n0 <- initialValue a b c ...
     sn <- newSupply n0 next
     ...

…which then enables the generated values to vary for each call to the program.

For the sake of example. Here there is a non-monadic composable implementation. I wonder if this scales well as the software becomes more complex.

import System.Random

-- Two calls tow the program, generate different outputs
main :: IO ()
main = do
  -- make an initial random number using getStdGen (system entropy, hence IO monad)
  initial_gen <- getStdGen
  let initial_rand_value = mkRandomValue (0 , 100) initial_gen :: RandomFloat
  
  putStrLn "Initial random value"
  print initial_rand_value -- This is your initial pseudo-random number
  
  -- Two calls to the same function generate the same result
  putStrLn "These two lines should be same, but still random-ish"
  print $ addRandom initial_rand_value 
  print $ addRandom initial_rand_value
  
  putStrLn "This line should output something random"
  print $ subRandom initial_rand_value
  
  -- The advantage, is that functions compose well.
  putStrLn "We can compose!!"
  print $ subRandom . addRandom $ initial_rand_value


-- This type allow you to get a (random) value
-- And to generate a new one from it. I think the constructor
-- should be private in order to guarantee randomness
data RandomValue a 
  = RandomValue { value :: a            -- The value
                , new :: RandomValue a  -- get a new random value
                }

-- example
type RandomFloat = RandomValue Float

-- Smart constructor. Uses laziness to store the "next random value" ad infinitum.
mkRandomValue :: UniformRange a => (a, a) -> StdGen -> RandomValue a
mkRandomValue (low, high) initial_gen = RandomValue v (mkRandomValue (low, high) new_gen)
  where (v, new_gen) = uniformR (low, high) initial_gen

-- Given a random number, add a random amount to it
addRandom :: Num a => RandomValue a -> RandomValue a
addRandom rand_float = 
  let other = new rand_float 
   in rand_float + other

-- Given a random number, subtract a random amount to it   
subRandom :: Num a => RandomValue a -> RandomValue a
subRandom rand_float = 
  let other = new rand_float
   in rand_float - other


-- Useful instances. Notice, you can't make an Applicative/Monad instance
-- because is impossible (is it, right?) to define pure/return for RandomValue. 
-- Also notice the Num instance is incomplete... precissely, because you can't 
-- define fromInteger :: Int -> RandomValue a
instance Show a => Show (RandomValue a) where
  show (RandomValue a _) = show a -- Be sure, you don't evaluate the inifite series of rand numbers!!

instance Functor RandomValue where
  fmap f (RandomValue v n) = RandomValue (f v) (fmap f n)

instance Num a => Num (RandomValue a) where
    (RandomValue v n) + (RandomValue v' _) = RandomValue (v + v') (new n)
    (RandomValue v n) - (RandomValue v' _) = RandomValue (v - v') (new n)
    (RandomValue v n) * (RandomValue v' _) = RandomValue (v * v') (new n)
    abs (RandomValue v n) = RandomValue (abs v) (new n)
    signum (RandomValue v n) = RandomValue (signum v) (new n)

Solution inspired by this episode of The Haskell Unfolder

2 Likes

True but that doesn’t go quite far enough, because we need new randomness in each function eval, during the same program invocation.

That’s why the various split... definitions exist - they produce new supplies from their respective arguments. You just need to ensure that if a definition needs a supply, it can always obtain a new one.

yourFn :: Double -> Supply Double -> Integer
yourFn d s = ... (case supplyValue s of !r -> ...) ...

yourDataList :: [Double]
yourDataPair :: (Double, Double)

resultsOf :: Supply Double -> [Integer]
resultsOf = zipWith yourFn yourData . split

twoResultsOf :: (Double, Double) -> Supply Double -> (Integer, Integer)
twoResultsOf (dat1, dat2) s = case split2 s of
                                (s1, s2) -> (yourFn dat1 s1, yourFn dat2 s2)