Supercede's House Style for Haskell

Here’s a style guide I wrote recently which is partly my opinion, and partly observations on what has organically emerged as the way we typically write Haskell code on our team. It describes syntax, but goes beyond that because style encompasses all parts of our work.

What do you think?

I like to put my “block opening” keywords* at the end of a line. In particular, I write where like this:

hangingIndent :: MonadIO m => a -> m b
hangingIndent a = fromAToB a where
  fromAToB = do
    someEffect
    someOtherEffectForNoReason
    thisCodeLooksABitWeird

* let, of, where, do

I both wholeheartedly agree, and swear to never do this at work.

On the whole, I agree that manually controlling the layout of code can help with readability. It’s not necessarily about aesthetics, but rather about organizing information. Just as a good module hierarchy is important for code navigation (among other things), code layout helps visual navigation. I personally like the style presented in the blog post, and would enjoy reading source code in this layout.

But unfortunately, people’s view of a good layout is quite personal. In a work setting, I always highly recommend a non-configurable code formatter (ormolu for Haskell and black for Python, for example). The goal is to maximize global readability by homogenizing code written by different people, rather than to tune layout on a per-person basis for local readability.

In a workplace with a mandatory style, isn’t it basically like having a code formatter, but with the extra steps of having to enforce some rules manually?

The first half, prescribing how to lay out code, is something I’ve always considered a problem to automate away.[1]

I do think some of the other sections are valuable and can be justified with practical reasoning:

  • Avoiding wildcards, because it ensures the compiler prompts you to handle new cases.
  • Tests should be self-contained, and meta code vs object code should be clearly distinguished.

They have a flavour of more systems thinking.
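To make the wildcard point concrete, here is a minimal sketch. The `PaymentStatus` type and both functions are hypothetical, invented only for illustration. Suppose `Refunded` was added after the functions were first written: the wildcard version keeps compiling and silently gives the old answer, while the exhaustive version would have triggered an incomplete-patterns warning until the new case was handled.

```haskell
{-# OPTIONS_GHC -Wincomplete-patterns #-}

-- Hypothetical type, used only for illustration.
-- Imagine Refunded was added after the functions below were written.
data PaymentStatus
  = Pending
  | Settled
  | Refunded
  deriving (Eq, Show)

-- Wildcard version: the new constructor falls into the catch-all,
-- so the compiler has nothing to warn about.
isFinalWildcard :: PaymentStatus -> Bool
isFinalWildcard status = case status of
  Settled -> True
  _ -> False

-- Exhaustive version: leaving out Refunded would produce an
-- incomplete-patterns warning, forcing a deliberate decision.
isFinal :: PaymentStatus -> Bool
isFinal status = case status of
  Pending -> False
  Settled -> True
  Refunded -> True
```

The two functions disagree on `Refunded`, which is exactly the kind of quiet divergence the wildcard hides.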

Another pattern I’ve observed:

  • Keep functions seated in monads on the lowest practical layer of the stack, for better composition and performance, e.g.
    • The Query monad (e.g. Rel8) can be composed into one big query which requires only one round trip to non-local DBs (PG, etc.).
    • A Transaction monad (e.g. Hasql) obviously composes into a transaction which has atomicity guarantees, but clearly doesn’t compose in the same way as Query.
    • And then a possible Model monad, which accesses DBs, files, HTTP APIs, etc., can be called from many contexts (a web handler, a task scheduler, command line, test suite, etc.).
    • Implement Auth on whatever layer is needed.
    • Instead, you often see all code in one big App type, with N+1 query problems everywhere, a web redirect in the middle of a data access pattern, dirty DB reads, etc., and it’s very difficult to undo later on.

Haskell makes it easy to cheaply model this in the types, and I think it’s worth doing. It can make code more atomic, perform better, and be more testable. But it’s harder to automate: you have to remember to do it, as a team. (Or perhaps LLMs can help…)
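A minimal sketch of that layering, using only base. None of this is the Rel8 or Hasql API: `Query`, `Transaction`, `Model`, `statement`, `transaction`, and the round-trip counter are all hypothetical stand-ins. The idea shown is that queries combine as plain data, so composing two of them still costs one simulated round trip, and each layer can only be entered through one function.

```haskell
{-# LANGUAGE GeneralizedNewtypeDeriving #-}

import Data.IORef (IORef, modifyIORef', newIORef, readIORef)

-- Lowest layer: a query is only a description, so queries
-- compose with (<>) without performing any effects.
newtype Query = Query [String]
  deriving (Semigroup, Monoid)

-- Middle layer: in a real module the constructor would not be
-- exported, so `statement` is the only way in.
newtype Transaction a = Transaction (IO a)
  deriving (Functor, Applicative, Monad)

-- Top layer: may also touch files, HTTP APIs, and so on.
newtype Model a = Model (IO a)
  deriving (Functor, Applicative, Monad)

-- Stand-in for sending one composed query: a single round trip,
-- however many fragments it contains.
statement :: IORef Int -> Query -> Transaction [String]
statement roundTrips (Query fragments) = Transaction $ do
  modifyIORef' roundTrips (+ 1)
  pure fragments

transaction :: Transaction a -> Model a
transaction (Transaction io) = Model io

runModel :: Model a -> IO a
runModel (Model io) = io

activeUsers, recentOrders :: Query
activeUsers = Query ["users WHERE active"]
recentOrders = Query ["orders WHERE recent"]

-- Composed at the Query layer, so only one statement is issued.
dashboard :: IORef Int -> Model [String]
dashboard roundTrips =
  transaction (statement roundTrips (activeUsers <> recentOrders))

-- Run a Model action and report how many round trips it made.
withRoundTripCount :: (IORef Int -> Model a) -> IO (a, Int)
withRoundTripCount action = do
  roundTrips <- newIORef 0
  result <- runModel (action roundTrips)
  count <- readIORef roundTrips
  pure (result, count)
```

Calling `withRoundTripCount dashboard` runs both query fragments with a count of one, whereas issuing a separate `statement` per fragment would count two.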


  1. See this paragraph about my work on autoformatters.

This style choice is fairly pervasive and I don’t understand why:

doTheThing ::
     MonadIO m
  => MonadLogger m
  => UserId -- ^ The currently logged in user
  -> CompanyId -- ^ The company the user wishes to foo bar baz
  -> SqlPersistT m ()

Some of those types are arguments, some are constraints, and one is the return type. At a glance, which is which? Which one looks special? (MonadIO m, because it has no arrow ahead of it.) Where does the kind of the types look like it changes? (CompanyId is the first type to have a -> arrow on its line.) But the special type here is SqlPersistT m (), because it’s the only thing in this signature that is an output. And the first argument type is not CompanyId but UserId.

I like to format multiline signatures like this, therefore:

doTheThing ::
  MonadIO m =>
  MonadLogger m =>
  UserId -> -- The currently logged in user
  CompanyId -> -- The company the user wishes to foo bar baz
    SqlPersistT m ()

Am I a monster? Why is this not the convention? (Aside from circular ‘because the tooling doesn’t support it’ reasons, I mean.)

Isn’t it what ormolu does?

Not being an ormolu user, I didn’t know that!

From the code I’ve seen on Hackage, it still seems like a minority preference—though I’m happy to see that it’s not quite as small a minority as I feared!

And I still prefer marking the return type with a double-indent over what ormolu did at the time of that commit (maybe they’ve adopted that convention also since then).

I agree. It was a revelation to me when ormolu chose to put the arrows after arguments. I found it weird at first but now it makes a lot of sense to me.

As @bodigrim says, that’s the ormolu style, apart from the final indent. I guess I could get used to your way, but ormolu suits me fine for now.

What are “meta code” and “object code” here?

I was using that as a short-hand; meta as “analysis of something at a higher level” (scaffold, setup/teardown, assertions, fake data, config, …) and object as “a concrete thing within a system” (functions, modules, services, …). AKA the test harness and the system under test.

It is gaining ground. I used to like the old way, but I like being able to grep for definitions with a regex like ^name ::. Also, now that linear types have landed, arrows etc. are no longer uniformly two characters wide, which means the neat vertical alignment is not guaranteed.

I would be interested in views on how to style functions with large numbers of arguments. For example, the Stack project, which respects 80-character lines for the most part, has code formatting like the following for functions that yield an action:

-- | Perform the actual plan
executePlan ::
     HasEnvConfig env
  => BuildOptsCLI
  -> BaseConfigOpts
  -> [LocalPackage]
  -> [DumpPackage] 
     -- ^ global packages
  -> [DumpPackage] 
     -- ^ snapshot packages
  -> [DumpPackage] 
     -- ^ project packages and local extra-deps
  -> InstalledMap
  -> Map PackageName Target
  -> Plan
  -> RIO env ()
executePlan
    boptsCli
    baseConfigOpts
    locals
    globalPackages
    snapshotPackages
    localPackages
    installedMap
    targets
    plan
  = do
    logDebug "Executing the build plan"
    bopts <- view buildOptsL
    withExecuteEnv
      bopts
      boptsCli
      baseConfigOpts
      locals
      globalPackages
      snapshotPackages
      localPackages
      mlargestPackageName
      (executePlan' installedMap targets plan)

    ...

Are there ‘better’ ways?
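One option that sometimes comes up, sketched here as a possibility rather than a recommendation for Stack itself, is to bundle arguments that share a type into a record. The types and names below (`DumpPackage` as a newtype over `String`, `PackageDumps`, `summarise`) are hypothetical stand-ins, not Stack's real definitions. The point is that three positional `[DumpPackage]` arguments can be swapped without a type error, while named fields cannot be confused silently.

```haskell
-- Hypothetical stand-in for the real DumpPackage type.
newtype DumpPackage = DumpPackage String
  deriving (Eq, Show)

-- The three same-typed positional arguments become named fields,
-- and the field names replace the explanatory comments.
data PackageDumps = PackageDumps
  { globalDumps :: [DumpPackage]
  , snapshotDumps :: [DumpPackage]
  , localDumps :: [DumpPackage] -- project packages and local extra-deps
  }

-- A function that previously took three lists now takes one record.
summarise :: PackageDumps -> String
summarise dumps =
  "global=" <> show (length (globalDumps dumps))
    <> " snapshot=" <> show (length (snapshotDumps dumps))
    <> " local=" <> show (length (localDumps dumps))
```

Call sites then name each list explicitly, for example `summarise PackageDumps { globalDumps = gs, snapshotDumps = ss, localDumps = ls }`, at the cost of one more type to maintain.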

One argument in favour of short lines in code (e.g. 80 characters or less) is that it makes the code much easier to read on an iPhone with the GitHub app.

Another argument in favour of keeping the :: on the same line as the function name is that it is syntax highlighter-friendly (when it comes to colouring the function name).

The original post is great, thanks.

I don’t use a code formatter, but I’m on board with it being a good idea sooner or later.

But I always strongly favour breaking the 80 char limit and using longer lines when needed (within reason), with line wrapping usually turned off (i.e., with too-long lines truncated), because seeing clear code structure, and more of it, is much more valuable than fitting in narrow horizontal space. I usually don’t need to see the end of every line; instead I want to see more of the program. When I do want to see line ends, it’s easy to temporarily maximize a window, toggle line wrap, or scroll.

Haddock indeed didn’t understand this in the past (up until GHC 8.10, it would even fail to parse it), but since GHC 9.0, Haddock can parse it just fine and does consider the comment here to apply to CompanyId.

Ormolu (which I use but sometimes find rather frustrating) deals with this by using -- | comments:

foo ::
  -- | The company the user wishes to foo bar baz
  CompanyId ->
  Whatever

It’s pretty good.

The space before MonadIO and several => look unusual in

doTheThing ::
     MonadIO m
  => MonadLogger m
  => UserId -- ^ The currently logged in user
  -> CompanyId -- ^ The company the user wishes to foo bar baz
  -> SqlPersistT m ()
doTheThing userId companyId = _

I would use a more uniform

doTheThing 
  :: (MonadIO m, MonadLogger m)
  => UserId -- ^ The currently logged in user
  -> CompanyId -- ^ The company the user wishes to foo bar baz
  -> SqlPersistT m ()
doTheThing userId companyId = _

For records I prefer to put the constructor on the new line:

data User
  = User
    { userName :: UserName
    , userEmail :: Email
    , userDateOfBirth :: Day
    }
{- so it will look uniform if we add
  | AnotherConstructor
    { foo :: Foo
    , ...
    }
-}

I’m not a fan of automatic formatting. I think people can express their intent better than any formatter. Unless there is something very irritating (like random spaces and indentation), I can tolerate almost any Haskell formatting.

The most important thing is how clean the ideas behind the code are. I’ve seen a lot of nicely formatted spaghetti, and worked with a guy who uses tabs like this. I’d prefer tabs.

Surprisingly, books that have nothing to do with FP are still helpful in developing a good style of thinking. I would recommend “A Philosophy of Software Design” by Ousterhout, “Pragmatic Programmer”, “The Art of Doing Science and Engineering”, and “The Art of Unix Programming”.

Closer to Haskell, I would recommend using as much pure code as possible and spending time thinking about the task at hand instead of creating overengineered “frameworks”. Yes, it’s easier and more fun to create the 1001st effects system (and you can always do it at home), but it’s no fun to maintain it.

Wouldn’t

Design for qualified imports. If you have a module called Email,
it should probably expose a function called parse so that it
can be imported qualified and called with Email.parse,
rather than the clumsy Email.parseEmail.

imply

data User = User
  { name :: UserName
  , email :: Email
  , dateOfBirth :: Day
  }

rather than

data User = User
  { userName :: UserName
  , userEmail :: Email
  , userDateOfBirth :: Day
  }

?

I suppose it’s not uncommon to want multiple fields of the same name within a module, especially with something short like name. In general, working with unprefixed field names is quite annoying without NoFieldSelectors, which has been around for a few years now, but I get the impression* it hasn’t been that widely used. And to be fair, converting a large pre-existing codebase to that style could be a lot of work for a pretty small payoff.

* I’d be interested in firm numbers, but alas.
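For reference, a minimal sketch of what the unprefixed style can look like with NoFieldSelectors, combined here with DuplicateRecordFields and OverloadedRecordDot (which needs GHC 9.2 or later). The `Company` type and `greeting` function are hypothetical, and the fields use `String` rather than the `UserName` and `Email` types from the post, purely to keep the example self-contained.

```haskell
{-# LANGUAGE DuplicateRecordFields #-}
{-# LANGUAGE NoFieldSelectors #-}
{-# LANGUAGE OverloadedRecordDot #-}

-- Two records in one module sharing the field name `name`.
data User = User
  { name :: String
  , email :: String
  }

-- Hypothetical second record, added only to show the name clash.
data Company = Company
  { name :: String
  }

-- No top-level `name` selector function is generated, so the
-- duplicate field is not ambiguous; access goes through dot syntax.
greeting :: User -> Company -> String
greeting user company =
  user.name <> " (" <> user.email <> ") at " <> company.name
```

Because no selector functions exist, `name` also stays free for use as an ordinary local variable.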

All compiler warnings should be upgraded to errors.

Do you mean to say that you enable -Werror everywhere, and not just on CI? I’m aware that some people do but I’ve never seen the appeal. I find that running code which temporarily contains some harmless warnings like unused variables or imports is something that I do constantly.