Why imperative code is worse than functional code

[…] perhaps we’re misunderstanding each other.

…yes, I was. I was thinking too abstractly :-D

For real houses, only the second example is valid Haskell:

(removeTrash'' <=< cleanSink'' <=< washDishes'') :: House'' -> IO House''

…with the correct type/s, of course. And those would be actions outputting to a global/shared state, namely the set of I/O registers of the computer running the “cleaning program”.

Because that mutable resource (those I/O registers) is potentially available everywhere in the program, there’s no (legitimate!) way to encapsulate its use. So that choice to use sequencing now spreads out to all definitions which rely on I/O, all the way up to Main.main.

I think trying to avoid state is a mistake. You have a dependency on previous results. Any solution that “avoids state” is going to be state in disguise, probably just using the previous result and an extraction function to hide it (e.g. instead of (Int, Int) -> (Section, (Int, Int)) or State (Int, Int) Section, it’ll be Section -> Section and a function Section -> (Int, Int)).
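A minimal sketch of that equivalence (assuming mtl; Section, numberS, numberD and cursor here are stand-ins of my own, not the types from the original problem):

import Control.Monad.State (State, state, evalState)

-- A minimal stand-in for the Section type discussed later in the thread.
data Section = Section { name :: String, num :: Int } deriving Show

-- Explicit state: the counter is visible in the type.
numberS :: String -> State Int Section
numberS n = state (\i -> (Section n i, i + 1))

-- “State in disguise”: the counter is recovered from the previous result.
cursor :: Section -> Int              -- the extraction function
cursor = (+ 1) . num

numberD :: Int -> String -> Section   -- the same threading, done by hand
numberD i n = Section n i

-- Both number ["a", "b", "c"] as 1, 2, 3:
--   evalState (traverse numberS ["a", "b", "c"]) 1
--   let go _ []       = []
--       go i (n : ns) = let s = numberD i n in s : go (cursor s) ns
--   in  go 1 ["a", "b", "c"]

Either way the counter is threaded through; only the type that carries it differs.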


Never mind snippets, there was a whole methodology: Jackson Structured Programming. And a technique ‘Common Action Tail’. (You can even apply JSP to C++ – section 1.1.2. Take a stiff drink before looking there.)

The idea was that after a many-branching case analysis to deal with your business logic, there’s often a need for some sort of ‘tidy up’ to return the answer plus clear out any temp file access or records/variables. Rather than calling some common routine (with all that unnecessary stack-handling), just jump/goto the tail of the module. That gives one place to guarantee tidiness.

That is the whole point of Haskell and similar languages: make those implicit assumptions explicit in signatures.

foo :: StdGen -> (a, a) -> (StdGen, a)

Now you can — among other things — use a simple substitution model to reason about your function, test it thoroughly in isolation, etc. That would not be so easy with

foo :: (a, a) -> a
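To make that concrete, here is a minimal sketch assuming the random package (pick, the seed, and the coin-flip logic are all made up for illustration):

import System.Random (StdGen, mkStdGen, random)

-- Choose one element of the pair, threading the generator explicitly.
pick :: StdGen -> (a, a) -> (StdGen, a)
pick gen (x, y) =
  let (b, gen') = random gen :: (Bool, StdGen)
  in (gen', if b then x else y)

-- Deterministic and repeatable: a fixed seed always yields the same pick,
-- so the function can be tested in isolation without any IO.
test :: String
test = snd (pick (mkStdGen 42) ("heads", "tails"))

With foo :: (a, a) -> a there would be nothing to substitute or fix in a test; the randomness would have to hide somewhere outside the signature.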

Do we though? The main seemingly sequential part of the original problem was the generation of the sequence of natural numbers [1..]. But you can do that in parallel chunks:

concatMap (\i -> [i, i + 1 .. i + k - 1]) [1, 1 + k ..]

Using this approach we could number an array of lessons in parallel.

But it gets more complicated for the case of nested lists like this simplified version of the problem I suggested in the other thread:

Define f :: [[a]] -> [[(Int, a)]] such that for all a :: Type and x :: [[a]] we have map fst (concat (f x)) == [1 .. length (concat x)].

Then we would like to split the generated list of natural numbers into lists that are the same length as the sublists of the input:

f xs = zipWith zip (_split_ nats) xs

Again, we can do that relatively efficiently if the lessons are stored in arrays which can report their length in constant time.
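Sticking with plain lists for a moment, here is one way to fill in the _split_ placeholder (splitLike is a name I made up):

f :: [[a]] -> [[(Int, a)]]
f xs = zipWith zip (splitLike xs [1 ..]) xs
  where
    -- Re-chunk an (infinite) list so its pieces match the sublist lengths.
    splitLike []       _  = []
    splitLike (y : ys) ns =
      let (here, rest) = splitAt (length y) ns
      in here : splitLike ys rest

Note that splitLike threads the remainder through splitAt from chunk to chunk, which is exactly the kind of “state in disguise” being debated here.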

You’ve defined breakOn in terms of groupBy. groupBy is defined in terms of span, which has the following type:

span :: (a -> Bool) -> [a] -> ([a], [a])

or, equivalently

span :: (a -> Bool) -> State [a] [a]

In fact, we can use this to give an alternative formulation of groupBy which at first glance looks very different, but is actually pretty much the same as the definition in base.

{-# LANGUAGE LambdaCase, ScopedTypeVariables, TypeApplications #-}
import Control.Monad.State
import Data.Coerce
import Data.Functor.Identity -- to coerce through `Identity`

-- A little helper
next :: forall a . State [a] (Maybe a)
next = coerce @([a] -> (Maybe a, [a])) $ \case
  [] -> (Nothing, [])
  (x:xs) -> (Just x, xs)

groupBy' :: (a -> a -> Bool) -> State [a] [[a]]
groupBy' eq = next >>= \case
  Nothing -> pure []
  Just x -> do
    ys <- coerce $ span (eq x) -- it really is State [a] [a]
    ((x:ys):) <$> groupBy' eq

groupBy'' :: forall a . (a -> a -> Bool) -> [a] -> [[a]]
groupBy'' eq = fst @_ @[a] . coerce (groupBy' eq)

At the end of the day, you’re still using the “state monad” to solve this problem.

edit: added groupBy'' to demonstrate that groupBy' is actually close in type to groupBy

So is zip (or spiritual equivalents) hiding state?

Instead of an accumulator holding a temporary state, we’re holding a list whose starting point is clipped by zip.

Is (enumFromTo 'a' 'z' `zip` enumFromTo 1 26) stateful?


In my mind, the stateless solution to this is based on zips. If we consider zip stateful, then perhaps it’s impossible to have a stateless solution to this problem.


Got rid of the spans here.

{-# LANGUAGE ScopedTypeVariables #-}

chunkUntil :: forall a. (a -> Bool) -> [a] -> [[a]]
chunkUntil cond [] = []
chunkUntil cond list = case chunk of
    [] -> remainder
    a -> a : remainder
  where
    chunk :: [a]
    chunk = takeWhile (not . cond) list
    
    remainder :: [[a]]
    remainder = case rawRemainder of
        (a:as) -> case chunkUntil cond as of
            [[]] -> [[a]]
            [] -> [[a]]
            ((b:bs):cs) -> (a:b:bs):cs
            (([]):cs) -> [a]:cs
        [] -> []
        
    rawRemainder :: [a]
    rawRemainder = dropWhile (not . cond) list

Still got hidden state?

I’ll also remark that the takeWhile / dropWhile pair is hideously unperformant (I can’t expect GHC to optimize this into a single loop), so the span-based implementation of groupBy is fine on the “performance” argument for the use of imperative / stateful code in Haskell.
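For comparison, here is a single-pass sketch of the same function using span; as far as I can tell it keeps the behaviour above (delimiters start the next chunk, and an empty leading chunk is dropped):

chunkUntil' :: (a -> Bool) -> [a] -> [[a]]
chunkUntil' _    []   = []
chunkUntil' cond list = prepend chunk (splitRest rest)
  where
    (chunk, rest) = span (not . cond) list

    -- The delimiter (if any) starts the next chunk.
    splitRest []       = []
    splitRest (d : ds) = case chunkUntil' cond ds of
      []       -> [[d]]
      (g : gs) -> (d : g) : gs

    -- Drop an empty leading chunk, matching the original.
    prepend [] gs = gs
    prepend c  gs = c : gs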


Now that you’ve got me thinking, how do you like this zipWith-based solution?

data Section n = Section
  { sname :: String
  , lessons :: [Lesson n]
  , restart :: Bool
  , sectionNum :: n
  } deriving Show

data Lesson n = Lesson
  { lname :: String
  , lessonNum :: n
  } deriving Show

numberSections :: [Section ()] -> [Section Int]
numberSections secs = result
  where
    awaitNumber :: Section () -> (Int, Int) -> Section Int
    awaitNumber sec (sn, ln) = sec
      { sectionNum = sn
      , lessons = zipWith (\l n -> l { lessonNum = n }) (lessons sec) [if restart sec then 1 else ln ..]
      }
    cursor :: Section Int -> (Int, Int)
    cursor sec = (sectionNum sec + 1, last (1:map lessonNum (lessons sec)) + 1)
    result :: [Section Int]
    result = zipWith ($) (map awaitNumber secs) ((1,1): map cursor result)
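
For concreteness, a small usage example (the sample data is mine):

-- Two sections; the second restarts its lesson numbering.
example :: [Section ()]
example =
  [ Section "Intro"  [Lesson "A" (), Lesson "B" ()] False ()
  , Section "Basics" [Lesson "C" ()]                True  ()
  ]

-- map sectionNum (numberSections example)                 == [1, 2]
-- map (map lessonNum . lessons) (numberSections example)  == [[1, 2], [1]]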

My thing about state was mainly spawned by you calling the “state monad” imperative. If you’re happy with it, I guess it doesn’t matter. I wasn’t suggesting that you avoid state, I was trying to see if I could convince you that it’s not inherently “imperative”.

Your zipWith solution sort of includes state by iterating map cursor result and creating a quasi-state that gets updated (cursor).

Unless, perhaps, what we are really looking to do is:

  • Establish a stateless version of the algorithm that is not necessarily performant.
  • Establish a stateful version of the algorithm that may be cleaner, and may be more performant.
  • Use equational reasoning to prove that they have the same semantics (if not the same implementation), then go with the stateful version for pragmatic reasons?


Perhaps that’s the dog bone we need to throw to purity?


TBH, this comes down to “are anamorphisms not functional because they generate a seed state which they effectively mutate”? And I guess that’s going too far. Point made.
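For reference, Data.List.unfoldr is the textbook anamorphism: the seed is threaded from step to step like a state, but every step is a pure function of it and nothing is mutated (toy example of mine):

import Data.List (unfoldr)

countdown :: Int -> [Int]
countdown = unfoldr (\n -> if n <= 0 then Nothing else Just (n, n - 1))

-- countdown 3 == [3, 2, 1]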

If our definition of ‘stateful’ encompasses every computation that can be massaged into a computation in State with only some isomorphisms and projections, that covers an awful lot of computations.

Consider that the Kleisli arrow in State is a -> s -> (s, b). Flipped and uncurried, that’s (s, a) -> (s, b). Up to isomorphism, State composition is merely function composition. Basically, any function whose domain and codomain are both isomorphic to product types with a common, non-trivial factor is ‘smuggling state’.

I wonder if a more useful definition of ‘statefulness’ is a measure of how much of your (s, a) -> (s, b) computation is made out of second (f :: a -> b) and how much of it actually uses the common s. The essence of a ‘stateful’ computation is that most of it doesn’t explicitly handle s; the state lurks in the background, unnoticed, causing problems for people who try to do equational reasoning on the pure component without taking the state component into account. The more that s is explicitly handled, the less ‘stateful’ I consider the computation to be. (Somewhat paradoxically, this defines the extreme where s is ignored entirely as ‘the most stateful’, which is maybe not great.)
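As a toy illustration of that spectrum (shout and tick are names of mine):

import Data.Bifunctor (second)

-- “Most stateful” by the paradoxical reading: s is threaded but never touched.
shout :: (s, String) -> (s, String)
shout = second (++ "!")

-- The other extreme: the whole computation is about s.
tick :: (Int, a) -> (Int, a)
tick (n, x) = (n + 1, x)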


Correct - since most code in Haskell does directly interact with I/O-centric state, by that reasoning Haskell is an extremely (I/O) “stateful” language! But paradoxes can be useful, sometimes even amusing, e.g. Conal’s proclamation of C being purely functional :-D

[…] the state lurks in the background, unnoticed, causing problems for people who try to do equational reasoning on the pure component without taking the state component into account.

It depends on the definition:

  • if the definition mostly lifts regular Haskell functions into the (stateful) context, then equational reasoning in those lifted functions is preserved.

  • if not - the definition relies heavily on that context - then caution should be exercised in the use of equational reasoning, as noted in Out of the Tar Pit.


It certainly would - by that definition, splitAt :: Int -> [a] -> ([a], [a]) would be considered stateful: the second list of the resulting pair being the (remaining) state. Could I therefore suggest that it isn’t the use of state alone that’s the problem, but its potential to taint other definitions?
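For what it’s worth, splitAt really does have the shape of a State action over the remaining list, so that reading is only a wrapper away (using state from Control.Monad.State):

import Control.Monad.State (State, state)

splitAtS :: Int -> State [a] [a]
splitAtS n = state (splitAt n)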

Consider (again :-) the monadic ST type - it could just be left to propagate practically everywhere in a program, all the way up to Main.main (with a call to stToIO :: ST RealWorld a -> IO a). Fortunately, there’s also runST: a definition can have a denotative/declarative/functional (etc.) interface while having a stateful implementation. So a few “small pockets” of privately-stateful code scattered far and wide aren’t really a problem.
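The classic small example of that encapsulation (sumST is my own toy function):

import Control.Monad.ST (runST)
import Data.STRef (newSTRef, modifySTRef', readSTRef)

-- Mutable on the inside, a plain pure function on the outside:
-- runST guarantees the STRef cannot escape.
sumST :: [Int] -> Int
sumST xs = runST $ do
  ref <- newSTRef 0
  mapM_ (\x -> modifySTRef' ref (+ x)) xs
  readSTRef ref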

I think this might be a good moment to reflect a bit on what “imperative” even means. The way I see it, “imperative” can refer to a number of related but distinct concepts, most importantly:

  • Reasoning about code in terms of commands (“statements”) and execution control flow, a.k.a. “reasoning operationally”. That is, we think about our programs in terms of what happens when, and what the consequences of that are. “Imperative”, here, is a mental model of code execution, or a mindset we can use while reading, writing, designing, analyzing, etc., code, and we can use this mindset regardless of the language we’re using, or the mindset that the code was written in. I can take a perfectly declarative Haskell program and analyze it imperatively, e.g. because I want to figure out why it performs badly, so I go and reason about what steps the compiler and runtime take to evaluate those expressions. The opposite is also possible, though the equational structure of programs written with an imperative mindset is often too complex to make it feasible. (This is the essence of the “downfall” argument voiced here earlier).
  • A coding style that highlights the imperative viewpoint. This usually follows naturally from coding with an imperative mindset, simply because we like to structure our code to match our mental model of it as closely as possible, so someone who thinks operationally will structure their code to emphasize execution order, like a step-by-step procedure or recipe, while someone who thinks equationally will try to make the code look like a set of equations. This, too, is to some extent independent of the programming language used: we can write in a step-by-step-procedure style in Haskell (e.g. using state monads, or simply by ordering let bindings or where clauses in the order we conceptualize them to be evaluated - even if that’s not the actual order in which things end up being evaluated, it does reflect how the programmer thinks about the code), or we can write assignments and procedure definitions in, say, C or Python, in such a way that they read more or less like equations. Mind you, this is not about what actually happens when we run the code, it’s about mirroring our mental model of the program in the syntactic structure.
  • Using (shared) mutable state and other side effects. The idea here is that any side effect essentially ruins equational reasoning, because what used to be expressions that behave pretty much like Mathematical terms (modulo bottoms, but hey) are now effectful “statements” (or “commands”), and what used to be functions (which, again, behaved a lot like their Mathematical namesakes) are now “procedures”. In this sense, “imperative code” means “code that uses mutable state and/or other side effects”, and “functional code” means “code that doesn’t use any side effects”. (Side note here: I’m saying side effects, because you can have effects in functional code, but you cannot trigger them via evaluation - the only thing you can do with those effects is compose them, and then eventually you end up with an effect that describes your entire program, and that effect can then be triggered externally. In other words, this is Haskell’s execution model: pointing the effectful RTS at a pure expression of type IO ().)
  • Using an “imperative programming language” - which is a programming language designed for programming with mutable state and side effects, for coding in a style that reflects an imperative viewpoint on the code, and for facilitating operational reasoning. E.g., C is an imperative language, because its built-in functionality and standard library rely heavily on side effects, because idiomatic code is usually structured to highlight execution order and operational semantics, and because it was designed to closely follow the execution model of a 1960s/70s era computer like the PDP-11, which, like most viable digital computer architectures since the 1950s, is by and large based on the von Neumann model, and thus deeply committed to “sequence of commands and destructive updates” as the primary framework for defining programs. By contrast, Haskell is not; while it makes imperative thinking possible, and provides plenty of primitives for programming with mutable state and side effects, its design is clearly geared towards equational reasoning and a functional model, and the compiler needs to do some impressively complicated things to convert our functional/declarative representations of programs into something a von Neumann machine can execute efficiently.

But depending on which of these meanings we presume, we can end up with different conclusions as to whether something “is imperative” or not.

Are State Monads “imperative”? In the “reasoning about code” sense, I would argue that yes, they very much are, because the main reason we use them is so that we can reason about state in terms of “what happens when”, even though the “when” refers to a conceptual/logical axis that reflects data dependencies more than actual time or execution order (i.e., when we say “B happens after A” in a State Monad context, what we really mean is “B has a data dependency on A, so in order to evaluate B, A must be evaluated”). And at least when we use do notation with our State Monads, I would argue that such code is also “imperative” in the sense of “highlighting an imperative mental model” - because let’s be honest here, making functional code look quasi-imperative is the main reason why do notation exists, and one of the main purposes of State Monads is so that we can use do notation to express data dependencies in a quasi-imperative style.
However, in the “mutable state and other side effects” sense, State Monads are not imperative - there are no destructive updates, no side effects, nada - we’re still just passing arguments through an expression graph, nothing is updated in-place, there are no side effects, and all our State s a expressions are themselves still referentially transparent.
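A small example of both faces at once (label is a toy of my own, assuming mtl):

import Control.Monad.State (State, get, put, runState)

-- Reads like an imperative recipe...
label :: a -> State Int (Int, a)
label x = do
  n <- get
  put (n + 1)
  pure (n, x)

-- ...but it is just a pure function of the initial counter:
-- runState (label 'x') 0 == ((0, 'x'), 1)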


Hey don’t go knocking the PDP-11. And if you can find a machine to compile/execute Haskell that isn’t “by and large based on the von Neumann model”, I’m sure we’d all like to know.

You’re getting too far down into the weeds. We need only look at the high-level programming language itself/its semantics:

  • If shuffling the sequence of statements can make another valid program that produces a different result, that’s imperative.
  • If shuffling makes no difference; or merely gives a source that doesn’t compile, that’s declarative.
  • (Yeah I didn’t tell how to recognise a “statement”: you know what I mean.)
  • You can still reason about imperative code, but your reasoning has to consider order of execution, and substituting the appearance of a variable with its RHS assignment is hazardous. [**]
  • In declarative code it should always work to substitute in the RHS.

[**] Like Hoare logic, which is notoriously hard to work with, and so inspired Robin Milner to create LCF and then ML.
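
A tiny illustration of that shuffling test (my own example):

-- Declarative: reordering the two bindings does not change the result.
total :: Int
total = x + y
  where
    y = 2
    x = 1

-- Imperative (here, IO): swapping the two statements changes the behaviour.
greet :: IO ()
greet = do
  putStrLn "hello"
  putStrLn "world"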

Are you sure? Doesn’t laziness make it really hard to analyze imperatively – that is, unless you put strictness pragmas all over the place? (And then you risk forcing bottoms for results you didn’t need.)

Of course it’s hard, but nonetheless, it’s possible, and sometimes (unfortunately) necessary, because GHC is not a Sufficiently Smart Compiler, and the imperative von Neumann stuff bleeds through the functional abstraction sometimes. However, IME it is (usually) still easier than the other way around.

A point worth mentioning in this context however is that when you do that, you are probably no longer analyzing the program as an implementation-agnostic Haskell program, you are looking at it as a Haskell program as GHC would compile it and the GHC RTS would execute it - given a different (but compliant) Haskell compiler, the imperative interpretation of the declarative code might look different, due to the compiler taking a different execution strategy. But in practice, that’s OK, because the main reason for putting on the imperative goggles is usually because you want to learn about the program’s performance (memory usage, execution time, …), and that is going to be highly dependent on the choices a specific compiler makes.

Oh, I have nothing but respect for those machines of yore and the brave souls who made them do incredible things.

Well, but that depends on the definition of “sequence” and “statement”, doesn’t it.
Pretty much every programming language has some way of expressing actions to be executed sequentially - in a typical imperative language, that way is ubiquitous and used for almost everything, while in a functional language like Haskell, it is presented as an emergent property of certain constructs like monadic binds, function composition, lists, etc., and of course changing the order of those things does change the program’s behavior. [1,2,3] and [2,3,1] are clearly not the same program.

This is why I think it’s important to differentiate the notion of imperative code from that of an imperative language. I can write imperative code in Haskell, I can write (somewhat) functional code in JavaScript, but I would not usually call Haskell an “imperative programming language”, or JavaScript a “functional programming language”, because while it is possible to write in the respective idiom in either language, it isn’t obvious, idiomatic, or particularly easy.

This is a v good point : ) haha