Regarding dealing with space leaks, my advice is the following:
1. Make invalid laziness unrepresentable. That is, design your types to be free of space leaks in the first place. In the same way that you simply wouldn’t use the strings `"TRUE"` and `"FALSE"` to represent booleans, don’t use `data MyPair = Pair Int Int` to represent a pair of fixed-precision integers. When evaluated, it’s not a pair of evaluated fixed-precision integers! It’s a pair of (either a fixed-precision integer or a thunk (potential space leak)). Instead, use `data MyPair = Pair !Int !Int`.

   Similarly, don’t use `data MyPair2 = Pair !Int !(Maybe Int)`. There’s a thunk (potential space leak) hiding in that `Maybe`. Instead, use `data MyPair2 = Pair !Int !(Strict (Maybe Int))`. (See the `strict-wrapper` library.)

2. Use `th-deepstrict` to confirm that the data types that you are defining don’t hide space leaks.

3. Only use the space-leak-free versions of various library functions. This is a bit more awkward, because you have to know which to avoid. For example, you should only ever use `foldl'` not `foldl`, `Data.IORef.modifyIORef'` not `Data.IORef.modifyIORef`, and `Control.Monad.Trans.State.modify'` not `Control.Monad.Trans.State.modify`.

   (Maybe one day this knowledge will be encoded into `stan` or some other static analyser, so everyone doesn’t have to just remember it.)

4. If you come across a space leak nonetheless, use GHC’s heap profiler with retainer profiling. That should give you a good idea of which data type the space leak occurs in. Then, if it’s your data type, you can go back to 1 to fix it, perhaps using `nothunks` to help diagnose. Once fixed, use `th-deepstrict` to ensure that the data type doesn’t regress. On the other hand, if the space leak is in a library you’re using then it’s more tricky. I guess file a bug report upstream, for example my patch to `megaparsec`.

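
To make the first and third points above concrete, here is a minimal sketch that uses only `base`. The names `StrictMaybe`, `Pair2`, `step`, `countAndSum` and `countWithRef` are made up for illustration. `StrictMaybe` is a hand-rolled stand-in for `Strict (Maybe Int)`, not the `strict-wrapper` API, and the second constructor is called `Pair2` only so that both types fit in one module.

```haskell
module Main where

import Data.IORef (modifyIORef', newIORef, readIORef)
import Data.List (foldl')

-- Strict fields: evaluating a MyPair to WHNF also evaluates both Ints,
-- so no thunk can hide inside the pair.
data MyPair = Pair !Int !Int
  deriving (Eq, Show)

-- Hypothetical stand-in for `Strict (Maybe Int)`: a Maybe whose payload
-- is a strict field.
data StrictMaybe a = SNothing | SJust !a
  deriving (Eq, Show)

data MyPair2 = Pair2 !Int !(StrictMaybe Int)
  deriving (Eq, Show)

-- One step of a running (count, total) accumulator.
step :: MyPair -> Int -> MyPair
step (Pair n total) x = Pair (n + 1) (total + x)

-- foldl' forces the accumulator to WHNF at every step. Because the
-- fields of MyPair are strict, WHNF here means fully evaluated.
countAndSum :: [Int] -> MyPair
countAndSum = foldl' step (Pair 0 0)

-- The same accumulation through an IORef. modifyIORef' forces the new
-- value before storing it, so the reference never holds a chain of thunks.
countWithRef :: [Int] -> IO MyPair
countWithRef xs = do
  ref <- newIORef (Pair 0 0)
  mapM_ (\x -> modifyIORef' ref (`step` x)) xs
  readIORef ref

main :: IO ()
main = do
  print (countAndSum [1 .. 1000000])
  countWithRef [1 .. 1000000] >>= print
```

With the lazy declaration `data MyPair = Pair Int Int`, the same `foldl'` would only force the outer constructor at each step, and the two counters could still build up as unevaluated chains of `+`.
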
I don’t think I really follow this. It’s not a question of “functions benefitting from correct types”. It’s a question of enforcing invariants on your data types (as @kosmikus explains in the linked video). If there’s no need for laziness in your data type then enforce its absence by making invalid laziness unrepresentable and it will be space leak free!

The point of making invalid laziness unrepresentable is that `deepseq` becomes simply the same as `seq`. There is no longer any deep laziness to `seq`! `deepseq` is a massive anti-pattern. If you find yourself using it then something has likely gone terribly wrong. (For a discussion around the boundary between legitimate `deepseq` use and anti-pattern use, see Deepseq versus "make invalid laziness unrepresentable").
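
A small experiment, assuming nothing beyond `base`, illustrates why `seq` is enough once the fields are strict. The types `LazyPair` and `StrictPair` and the helper `whnfThrows` are hypothetical names for this sketch. It evaluates a value only to weak head normal form, which is what `seq` does, and reports whether that reached a deliberately broken field.

```haskell
module Main where

import Control.Exception (SomeException, evaluate, try)

-- Lazy fields: WHNF stops at the constructor, the fields stay as thunks.
data LazyPair = LazyPair Int Int

-- Strict fields: the constructor cannot exist without evaluated fields.
data StrictPair = StrictPair !Int !Int

-- Evaluate the argument to WHNF only (what `seq` does) and report
-- whether doing so raised an exception.
whnfThrows :: a -> IO Bool
whnfThrows x = do
  r <- try (evaluate (x `seq` ())) :: IO (Either SomeException ())
  pure (either (const True) (const False) r)

main :: IO ()
main = do
  -- The bad field is never touched, so the thunk survives `seq`.
  whnfThrows (LazyPair 1 (error "boom")) >>= print   -- False
  -- `seq` reaches the field because the field is strict.
  whnfThrows (StrictPair 1 (error "boom")) >>= print -- True
```

For `StrictPair`, forcing the outer constructor already forces everything inside it, so there is nothing left for a deep traversal to find.
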