Dealing with repeated transformers in a transformer stack using 'dependent types'

Void · September 28, 2024, 6:57am

It is common to encounter a stack of monad transformers, in which there are multiple of the same transformer. An example is StateT s1 (StateT s2 m), where you stack StateT twice on a monad. In that case, the typeclass functions:

get :: (MonadState s m) => m s
put :: (MonadState s m) => s -> m ()

don’t really work on the inner state s2. StateT s does not propagate an existing MonadState instance, because it introduces a MonadState instance itself, so if it did propagate an existing MonadState instance, you get conflicting instances.

This problem is only partially solved by effect systems. Using the package effectful, I think it is fine to write code such as:

foo :: (State Int :> es, State String :> es, IOE :> es) => Eff es ()
foo = do
  n :: Int <- get
  s :: String <- get
  liftIO $ replicateM_ n (putStrLn s)

but it is not fine to have two repeated effects:

bar :: (State Int :> es, State Int :> es, IOE :> es) => Eff es ()
bar = do
  n :: Int <- get
  m :: Int <- get
  liftIO $ print (n * m)

bar will not work as intended to multiply two distinct integer states.

Although a short stack StateT s1 (StateT s2 m) can be reduced to StateT (s1, s2) m, it is not really clean to write fst <$> get or snd <$> get when you want to access a part of the state, and to use put, you have to introduce lenses to get any remotely clean code. In more complicated transformer stacks like StateT s1 (ExceptT e (ReaderT r (StateT s2 m))), it may be not viable to perform such merging at all, and to access the inner state, you have to write lift $ lift $ lift $ get, a rather unenjoyable task that also reduces the readability of the code.

I propose a method to deal with this issue. Here, we turn on the language extensions:

{-# LANGUAGE DataKinds #-}
{-# LANGUAGE AllowAmbiguousTypes #-}
{-# LANGUAGE ScopedTypeVariables #-}
{-# LANGUAGE FunctionalDependencies #-}
{-# LANGUAGE UndecidableInstances #-}

Now we define the usual natural number kind using DataKinds, ready to exploit those limited dependent type features that Haskell has:

data Nat = Z | S Nat

and parametrize the typeclass MonadState over Nat:

class (Monad m) => MonadState' (n :: Nat) s m | m n -> s where
  get' :: m s
  put' :: s -> m ()

To access get' and put' for a specific n, use TypeApplications. For monads that introduce a new state, we correspond the new state with MonadState' 'Z:

instance (Monad m) => MonadState' 'Z s (StateT s m) where
  get' = get
  put' = put

Next, for monads that does not introduce a new state, we simply propagate MonadState' through it:

instance (MonadState' n s m) => MonadState' n s (ReaderT s1 m) where
  get' = lift (get' @n)
  put' x = lift (put' @n x)

Finally, to propagate MonadState' through a monad that introduces a state, we replace the Nat parameter with its successor while propagating:

instance (MonadState' n s m) => MonadState' (S n) s (StateT s1 m) where
  get' = lift (get' @n)
  put' x = lift (put' @n x)

With these definitions, it is possible to write code that uses multiple states that is polymorphic over typeclasses, such as:

swapState :: forall n p s m. (MonadState' n s m, MonadState' p s m) => m ()
swapState = do
  sn <- get' @n
  sp <- get' @p
  put' @p sn
  put' @n sp

addNums :: forall n1 n2 n3 s m. (Num s, MonadState' n1 s m, MonadState' n2 s m, MonadState' n3 s m) => m ()
addNums = liftA2 (+) (get' @n1) (get' @n2) >>= (put' @n3)

(The explicit foralls are for ScopedTypeVariables to work.) I don’t think such code is possible before, because you can only have a single MonadState s instance before, and to use two in typeclass constraints, you don’t know how many times to lift the inner one. Indeed, this code worked as expected:

main :: IO ()
main = do
  let test1 = do {
    swapState @Z @(S Z) ;
    addNums @Z @(S Z) @(S Z) ;
    ask >>= \r -> liftIO $ putStrLn $ "Reader: " ++ r ;
    get' @Z >>= \s -> liftIO $ putStrLn $ "State 0: " ++ show s ;
    get' @(S Z) >>= \s -> liftIO $ putStrLn $ "State 1: " ++ show s
  }
  let reduce = void . (`runStateT` 114) . (`runReaderT` "1919810") . (`runStateT` 400)
    in reduce test1

This program started with state (400, 114), swapped them to (114, 400) and added the first state to the second state to get (114, 514), which exactly matches the output. Hence, the correct states are lifted properly through the transformer stack.

Furthermore, this upgrade is also very backwards-compatible. It is possible to use the type synonyms (under ConstraintKind):

type MonadState s m = MonadState' 'Z s m
get :: MonadState s m => m s
get = get' @Z

put :: MonadState s m => s -> m ()
put x = put' @Z x

Then the constraint MonadState defined this way propagates exactly like the old MonadState, and all old code would work without modifications.

lortabac · September 28, 2024, 7:55am

Smart idea, but I think there are simpler solutions to this problem.

The low-tech approach would be to use newtypes to avoid ambiguities. That’s what I’d recommend.

Alternatively you can tag types with a phantom parameter. There was a library that proposed tagged transformers, I can’t find it on Hackage anymore.

Both solutions have the advantage of naming things explicitly instead of relying on error-prone indexes.

tomjaguarpaw · September 28, 2024, 8:16am

This problem is completely solved by Bluefin.

Void:

[in effectful] it is not fine to have two repeated effects:

bar :: (State Int :> es, State Int :> es, IOE :> es) => Eff es ()
bar = do
  n :: Int <- get
  m :: Int <- get
  liftIO $ print (n * m)

In Bluefin it is:

bar ::
  (e1 :> es, e2 :> es, e3 :> es) =>
  State Int e1 -> State Int e2 -> IOE e3 -> Eff es ()
bar s1 s2 io = do
  n <- get s1
  m <- get s2
  effIO io (print (n * m))

In fact, this is one of the reasons that Bluefin uses value level handles at all.

eddiemundo · September 28, 2024, 6:14pm

You can also used Effectful.Labeled in effectful.

rampion · October 3, 2024, 12:31pm

What prevents the compiler from assuming e1 ~ e2?

edit: ah, I see. handles.

Swordlash · October 5, 2024, 1:09pm

Is it, really? I usually treat StateT as a tool for introducing convenient state handling in a localised portion of code. Otherwise I usually work in abstract setting of some MonadCapabilityXXX and CapabilityXXXT transformer which is usually a newtype over ReaderT, as @lortabac mentioned. So even if I have a stack of readers, each is actually newtyped with named functionality. I’ve honestly never seen a transformer repeated in production code monadic stack.

Some languages like Python introduce named arguments to avoid ambiguity between arguments of the same type. Haskell fixes such issues with lightweight newtypes, so why not use them?

tomjaguarpaw · October 5, 2024, 2:44pm

Perhaps because they’re too cumbersome to use?

The answer from a Bluefin point of view is “lightweight newtypes are still not as lightweight as function arguments”.