Regarding dealing with space leaks, my advice is the following:
1. Make invalid laziness unrepresentable. That is, design your types to be free of space leaks in the first place. In the same way you simply wouldn’t use the strings `"TRUE"` and `"FALSE"` to represent booleans, don’t use `data MyPair = Pair Int Int` to represent a pair of fixed-precision integers. When evaluated, it’s not a pair of evaluated fixed-precision integers! It’s a pair in which each component is either a fixed-precision integer or a thunk (a potential space leak). Instead, use `data MyPair = Pair !Int !Int`.

   Similarly, don’t use `data MyPair2 = Pair !Int !(Maybe Int)`. There’s a thunk (potential space leak) hiding in that `Maybe`. Instead use `data MyPair2 = Pair !Int !(Strict (Maybe Int))`. (See the `strict-wrapper` library.)

2. Use `th-deepstrict` to confirm that the data types that you are defining don’t hide space leaks.

3. Only use the space-leak-free versions of various library functions. This is a bit more awkward, because you have to know which to avoid. For example, you should only ever use `foldl'` not `foldl`, `Data.IORef.modifyIORef'` not `Data.IORef.modifyIORef`, and `Control.Monad.Trans.State.modify'` not `Control.Monad.Trans.State.modify`. (Maybe one day this knowledge will be encoded into `stan` or some other static analyser, so that not everyone has to just remember it.)

4. If you come across a space leak nonetheless, use GHC’s heap profiler with retainer profiling. That should give you a good idea of which data type the space leak occurs in. Then, if it’s your data type, you can go back to 1 to fix it, perhaps using `nothunks` to help diagnose. Once fixed, use `th-deepstrict` to ensure that the data type doesn’t regress. On the other hand, if the space leak is in a library you’re using then it’s more tricky. I guess file a bug report upstream, for example my patch to `megaparsec`.
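To make points 1 and 3 a bit more concrete, here is a minimal sketch using only `base`. The names `step`, `summarise` and `countEvens` are made up for illustration; the point is just that a strict-field accumulator combined with `foldl'` or `modifyIORef'` leaves nowhere for a chain of thunks to build up.

```haskell
module Main where

import Data.IORef (modifyIORef', newIORef, readIORef)
import Data.List (foldl')

-- Strict fields: forcing the constructor also forces both Ints,
-- so a value of this type can never hide a thunk.
data MyPair = Pair !Int !Int

-- Hypothetical helper: track an element count and a running total.
step :: MyPair -> Int -> MyPair
step (Pair n total) x = Pair (n + 1) (total + x)

-- foldl' forces the accumulator to WHNF at every step. Because the
-- fields are strict, WHNF here means both Ints are fully evaluated.
summarise :: [Int] -> MyPair
summarise = foldl' step (Pair 0 0)

-- Hypothetical helper: modifyIORef' forces the new value before
-- storing it, so the IORef never holds a chain of (+ 1) thunks.
countEvens :: [Int] -> IO Int
countEvens xs = do
  ref <- newIORef 0
  mapM_ (\x -> if even x then modifyIORef' ref (+ 1) else pure ()) xs
  readIORef ref

main :: IO ()
main = do
  let Pair n total = summarise [1 .. 1000000]
  print (n, total)
  evens <- countEvens [1 .. 1000000]
  print evens
```

If `MyPair` had lazy fields instead, `foldl'` would only force the outer constructor at each step, and the two `Int` fields could each still accumulate a million-deep thunk.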
I don’t think I really follow this. It’s not a question of “functions benefitting from correct types”. It’s a question of enforcing invariants on your data types (as @kosmikus explains in the linked video). If there’s no need for laziness in your data type then enforce its absence by making invalid laziness unrepresentable and it will be space leak free! The point of making invalid laziness unrepresentable is that `deepseq` becomes simply the same as `seq`. There is no longer any deep laziness to `seq`! `deepseq` is a massive anti-pattern. If you find yourself using it then something has likely gone terribly wrong. (For a discussion around the boundary between legitimate `deepseq` use and anti-pattern use, see Deepseq versus "make invalid laziness unrepresentable").
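A rough way to see the "`deepseq` becomes the same as `seq`" claim, again with only `base`: plant an `undefined` in a field and evaluate the value to WHNF only. The types `LazyPair` and `StrictPair` and the helper `whnfSucceeds` are hypothetical names for this sketch.

```haskell
module Main where

import Control.Exception (SomeException, evaluate, try)

-- Lazy fields: a field may hold an unevaluated thunk.
data LazyPair = LazyPair Int Int

-- Strict fields: building the constructor forces both fields.
data StrictPair = StrictPair !Int !Int

-- Hypothetical helper: evaluate a value to WHNF only (what seq does)
-- and report whether that finished without throwing.
whnfSucceeds :: a -> IO Bool
whnfSucceeds x = fmap ok (try (evaluate x))
  where
    ok :: Either SomeException b -> Bool
    ok = either (const False) (const True)

main :: IO ()
main = do
  -- WHNF stops at the constructor, so the bad field goes unnoticed.
  lazyOk <- whnfSucceeds (LazyPair undefined 2)
  -- WHNF already reaches the fields, so the bad field is hit at once.
  strictOk <- whnfSucceeds (StrictPair undefined 2)
  print lazyOk   -- True
  print strictOk -- False
```

For `StrictPair`, evaluating to WHNF already evaluates everything there is to evaluate, so a deep traversal has nothing left to do. For `LazyPair`, the unevaluated field survives, which is exactly the hidden thunk that `deepseq` would otherwise be used to chase down.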