[Dream] Towards standard source code formatting

This thread is inspired by the discussion in https://discourse.haskell.org/t/convenience-in-the-haskell-ecosystem . But that thread is already sprawling, so I thought it would be better to open a second thread which could be more focused on one specific issue: code formatting.

The question that I want to ask is what separates the current state from a possible future where we can say:

cabal fmt

and

stack fmt

and have our code formatted. This is of course inspired by other languages that have this feature, which is often cited as one of their big benefits, e.g. go fmt or cargo fmt.
The benefit that is cited is precisely the convenience of code formatting, not its availability. The big technical challenge has already been solved: there are code formatters for Haskell. So the remaining challenge of making it convenient is not so much technical as social. We have to find a consensus that this is desirable and that we can make it happen in a way that satisfies everyone.

What makes code formatting currently inconvenient

For a seasoned Haskell developer, code formatting is not terribly inconvenient. The majority of the users on this board probably know the available formatters, how to install them, and also how to invoke them. But for someone new to Haskell there are currently three things which together make it so inconvenient that most of them probably won't use it on their first projects.

  1. You have to know which code formatters are available and well-maintained. Also, which of them are popular and which are obscure. Those which are based on haskell-src-exts should be avoided, etc.
  2. You have to know how to install Haskell binaries and make them available on the path. Again, this is trivial for experienced Haskell developers, but if you are just starting out with Haskell you have to look this stuff up.
  3. You have to know how to invoke them. If we take ormolu and fourmolu as an example, they have decided on purpose that they won't try to understand your build system. As a consequence, you have to explicitly invoke them with the Haskell files you want to format. The READMEs suggest using xmolu --mode inplace $(git ls-files '*.hs') or xmolu --mode inplace $(find . -name '*.hs'). I am working with a lot of beginners who are just starting out with git and the console. Of course I can tell them to type in those commands, but for a lot of them it will just be copy-pasted code. Just try to think back to the time when you were learning about globbing, shell expansion, command-line arguments etc.
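To make the manual step in point 3 concrete, here is a minimal Haskell sketch of what a beginner currently has to do by hand. It is not part of any existing tool: `isHaskellSource` and `inplaceArgs` are hypothetical helper names, and the only detail taken from the formatter READMEs is the `--mode inplace` flag quoted above.

```haskell
import Data.List (isSuffixOf)

-- | Hypothetical helper: keep only paths that look like Haskell source files.
isHaskellSource :: FilePath -> Bool
isHaskellSource path = ".hs" `isSuffixOf` path

-- | Build the argument list for an ormolu/fourmolu in-place run,
-- mirroring `xmolu --mode inplace $(git ls-files '*.hs')`.
inplaceArgs :: [FilePath] -> [String]
inplaceArgs paths = ["--mode", "inplace"] ++ filter isHaskellSource paths
```

Given a file listing such as the output of git ls-files, `inplaceArgs ["src/Main.hs", "README.md"]` evaluates to `["--mode", "inplace", "src/Main.hs"]`. A build tool that already knows the source files could produce this list without the user touching the shell.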

What stands in the way

The technical challenges are probably pretty minor, at least compared to the work that went into implementing the code formatters themselves. The key challenges are probably social and cultural.
The obvious first one is which code formatter to pick. Brittany is officially unmaintained, and stylish-haskell is much more restricted in scope. (The README says: "The goal is not to format all of the code in a file[…]".) So the only two really viable alternatives currently are ormolu and fourmolu. (EDIT: I forgot about hindent, which is also still maintained.) Fourmolu is a fork of ormolu; they share the vast majority of the code and only fundamentally disagree about whether code formatting should be configurable or not. But, very naively, shouldn't we be able to find a compromise where we say that without a configuration file cabal/stack fmt behaves like ormolu, and with a configuration file it behaves like fourmolu? I don't have any insight into their respective development, but their relationship doesn't seem to be an acrimonious one.

The other question would be how to make cabal fmt and stack fmt happen. Do the maintainers of those tools agree that they should provide such a command? Should it be via a plugin system like the one recently proposed for cabal, should it directly depend on a library, or should it invoke an external executable? If it is an external executable, how should it be distributed such that beginners don't have to know about installation? Via the GHC distribution, ghcup, or cabal install/stack install invoked in the background?

Closing

As the title indicates, this is currently only a dream inspired by the discussion in the other thread. I don't think it poses any deep technical challenges. Instead it would require someone with the standing in the community and the moderating capabilities to make it happen: someone who would be able to bring the involved actors around a table and create a consensus.

12 Likes

I feel the major problem with a standard Haskell formatter is social, not technical. I feel the reason go fmt and cargo fmt are so popular is that they were present very early on: right from the beginning of the language, they determined a single 'standard' format, and everyone since then has gotten used to that.

Haskellers, by contrast, have for 25 years been diverging on matters formatting-related. Even in base, one can find completely different code styles, even within a single file, e.g. System.IO:

readIO          :: Read a => String -> IO a
readIO s        =  case (do { (x,t) <- reads s ;
                              ("","") <- lex t ;
                              return x }) of
                        [x]    -> return x
                        []     -> ioError (userError "Prelude.readIO: no parse")
                        _      -> ioError (userError "Prelude.readIO: ambiguous parse")

-- vs:

fixIO :: (a -> IO a) -> IO a
fixIO k = do
    m <- newEmptyMVar
    ans <- unsafeDupableInterleaveIO
             (readMVar m `catch` \BlockedIndefinitelyOnMVar ->
                                    throwIO FixIOException)
    result <- k ans
    putMVar m result
    return result

So, even if we somehow manage to get a cabal fmt command, I strongly suspect it will never be able to become 'standard' in the same way as formatters for Go or Rust have.

2 Likes

I fully agree that code formatting will probably not become as ubiquitous in Haskell codebases as in Rust or Go codebases. There will probably be users who don't want to use code formatters, and that is totally fine. But I think we can improve the situation for those users who do want to use code formatters. Most of them have already converged on just two. What I really want is that we would be able to say: "You can use code formatting or not, but if you want to use code formatting, the standard one is super convenient to use".

4 Likes

Agreed, that would be really great!


To write down some more thoughts of mine on Haskell code formatters:

As you say, Brittany is unmaintained, and stylish-haskell incomplete, leaving ormolu and fourmolu. But personally, I find both their formatting styles quite obnoxious (for lack of a better word), even after twiddling fourmolu's config: I always feel it adds too many newlines and indents everywhere. I've yet to see a Haskell formatter which approaches what I'd consider natural Haskell code, though of course other people may differ on this point.

2 Likes

Fourmolu maintainer here! There's also hindent, which should still be maintained.

General thoughts:

7 Likes

Oh, I missed hindent. That was not an intended omission, sorry :confused: Yes, what you are describing are exactly the problems I had in mind. What I am wondering is whether it would be possible to forge an agreement which takes those disagreements about the correct formatting style into account, but where everyone agrees that these disagreements are less important than finding a solution which makes it possible to have cabal fmt and stack fmt on by default. For example a common yaml configuration format which corresponds to:

data FormatOptions = Ormolu | Fourmolu FourmoluOpts | Hindent HindentOpts

If it is impossible to agree on a default, maybe people would be ok with the following workflow:

$ cabal/stack fmt
No formatting configuration found. Please choose a style to use by adding a format.yaml which contains at least
formatter: fourmolu
formatter: ormolu
formatter: hindent

If you choose formatter: ormolu then you cannot add any other configuration options.
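The workflow above could be sketched roughly as follows. This is only an illustration of the idea: `FourmoluOpts` and `HindentOpts` are hypothetical placeholders that keep the remaining lines as raw strings, `parseFormatConfig` and `noConfigMessage` are invented names, and no real YAML parsing is attempted.

```haskell
import Data.Char (isSpace)
import Data.List (dropWhileEnd, stripPrefix)

-- Hypothetical placeholders; the real tools define their own option types.
newtype FourmoluOpts = FourmoluOpts [String] deriving (Eq, Show)
newtype HindentOpts = HindentOpts [String] deriving (Eq, Show)

data FormatOptions = Ormolu | Fourmolu FourmoluOpts | Hindent HindentOpts
  deriving (Eq, Show)

noConfigMessage :: String
noConfigMessage =
  "No formatting configuration found. Please add a format.yaml starting with `formatter: <name>`."

-- | Strip leading and trailing whitespace.
trim :: String -> String
trim = dropWhileEnd isSpace . dropWhile isSpace

-- | Interpret the contents of the proposed format.yaml.
-- The first line must read `formatter: <name>`; the remaining
-- non-blank lines are kept verbatim as that formatter's options.
parseFormatConfig :: String -> Either String FormatOptions
parseFormatConfig contents =
  case lines contents of
    [] -> Left noConfigMessage
    (firstLine : rest) ->
      let opts = filter (not . all isSpace) rest
       in case trim <$> stripPrefix "formatter:" firstLine of
            Just "ormolu"
              | null opts -> Right Ormolu
              | otherwise -> Left "ormolu does not accept configuration options"
            Just "fourmolu" -> Right (Fourmolu (FourmoluOpts opts))
            Just "hindent" -> Right (Hindent (HindentOpts opts))
            Just other -> Left ("unknown formatter: " ++ other)
            Nothing -> Left noConfigMessage
```

With this sketch, `parseFormatConfig "formatter: ormolu\n"` gives `Right Ormolu`, while adding any option line after `formatter: ormolu` is rejected, which encodes the rule that choosing ormolu means choosing no configuration.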

  • I've also seen Ormolu users morally opposed to even the option of a config file, as it becomes a potential point for bikeshedding

These users would have to choose to configure that they chose ormolu, but they are already making that specific choice by installing ormolu instead of fourmolu or hindent on their system, and configuring that in CI. So this would just be the canonical point to document that decision.

4 Likes

I think trying to pick a blessed formatter is going to be quite difficult. Apart from anything else, it seems to me that the evolution of the tool ecosystem is largely driven by maintainership. Tools die when they lose maintainers, and new ones appear when a keen person is willing to devote a lot of time to making a tool. If you bless a standard tool, you had better have a plan for keeping it maintained…

(Some googling around rustfmt turned up this, which is quite on-topic! Reddit - Dive into anything)

I think the alternative is to make it easy to get cabal fmt to do what you want. That would mean something like:

  1. A way to tell cabal what formatter you want
  2. A standard interface so cabal can invoke said formatter.

In that world you can say in your cabal.project

formatter: fourmolu

and have cabal fmt work, without us having to bless a formatter, just a formatter-interface.
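A formatter interface of this kind might look something like the sketch below. Everything here is hypothetical: `FormatterSpec`, `lookupFormatter` and `formatCommand` are invented names, not cabal APIs, and only ormolu and fourmolu are listed because `--mode inplace` is the one invocation quoted earlier in this thread. Other formatters would need their own entries.

```haskell
-- | Hypothetical description of how a build tool could call a formatter.
data FormatterSpec = FormatterSpec
  { executable :: String                  -- ^ program to run
  , formatArgs :: [FilePath] -> [String]  -- ^ arguments for an in-place run
  }

-- | Look up the formatter named in a `formatter:` setting.
lookupFormatter :: String -> Maybe FormatterSpec
lookupFormatter name
  | name `elem` ["ormolu", "fourmolu"] =
      Just (FormatterSpec name (\files -> ["--mode", "inplace"] ++ files))
  | otherwise = Nothing

-- | The full command line a `cabal fmt` could run, if the formatter is known.
formatCommand :: String -> [FilePath] -> Maybe [String]
formatCommand name files = do
  spec <- lookupFormatter name
  pure (executable spec : formatArgs spec files)
```

For example, `formatCommand "fourmolu" ["src/Lib.hs"]` yields `Just ["fourmolu", "--mode", "inplace", "src/Lib.hs"]`, and an unknown name yields `Nothing`. The point of the design is that the build tool only needs the table, not an opinion about which formatter is best.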

For bonus points, make cabal better at exposing its settings so HLS can get this information, and maybe we can make HLS automatically use the right formatter…

20 Likes

I've never used a Haskell formatter that didn't affect how I wrote Haskell. When you're considering how the formatter will mangle your code and changing how you solve the problem to avoid it (e.g. making it less point-free, not using a certain operator) then that formatter is being too big for its britches. I shouldn't have that extra cognitive load - a formatter is supposed to reduce cognitive load.

I suspect if you want a standard formatter, it will have to be more "partial" than gofmt. Pick a few common things to standardize on (imports, type signatures, control syntax) and still allow the programmer to meaningfully use whitespace to manually format programs.

You'll never capture and respect all the ways people use whitespace productively, so you'll inevitably muck up someone's code. This in general is why I've never bothered with a Haskell formatter ever - even though I use formatters in other langs.

That's the big advantage Go and Rust have for automated formatters - those syntaxes are ugly as sin. Elm doesn't have ugly syntax (it's close to Haskell), but its formatter makes all Elm code sooo ugly.

I guess when people choose formatters, it's in a setting where you are in the realm of brutal efficiency trade-offs like that (big corporate/open-source project + don't trust all those programmers to have good taste) so it's worth it to also make Haskell a little uglier :stuck_out_tongue:

4 Likes

This is essentially the point in the design space that stylish-haskell takes. WRT whitespace, Ormolu does let the programmer drive the formatter by the use of newlines, to some degree.

4 Likes

To be a little contrary: I don't really understand the appeal of integrating every dev tool as a subcommand of one master tool.

If cabal fmt, with a formatter: <implementation> field in the .cabal file, is a desirable thing, are the following also desirable? They would also benefit from the build tool's knowledge of what files are source files, and enable total newbies not to have to figure out what is the best external tool from the full universe of possibilities.

  • cabal commit, with a version-control: git|hg field in the .cabal file
  • cabal text-search, with a text-search: grep|rg field in the .cabal file

If these are bad ideas, why isn't cabal fmt also? Is it just because Rust and Go have set the example and now it's an expectation for people coming from those ecosystems?

3 Likes

I understand your point, and the proposal to add all these dev tools is somewhat contrary to the unix spirit of "one tool - one task". But to go to the other extreme, we could also avoid using programming-language-specific build tools like cabal and stack and only use general-purpose build tools like make, bazel, nix etc. I think there are two reasons why we should have a fmt command but not a commit or text-search command. First, an experienced programmer from another language will know how to use git and grep, but won't know what the Haskell ecosystem looks like, so including formatters in the build tool increases discoverability. Second, formatters need information from the build system (which files to format and which to ignore), and since they are not integrated they have to use the workaround with git ls-files or find, so integration improves usability.

3 Likes

I agree, with the caveat that that declaration should not be specific to cabal, so that also stack can use this information. Maybe it would be possible to agree on a standard file name format.yaml which is accepted by all source formatters. The only specification for that file would be that it has to start with the line formatter: xxx, where xxx can be any of the current formatters. After that first line come the configuration options of the specific tool. Fourmolu, stylish-haskell and hindent use the yaml file format, and ormolu doesn't have a configuration file.
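As a rough sketch of how little a build tool would need to understand about such a file: it only has to read the first line and can hand the rest to the chosen formatter untouched. `splitFormatYaml` is a hypothetical name, and the `indentation: 2` line in the usage note is just an illustrative option.

```haskell
import Data.Char (isSpace)
import Data.List (dropWhileEnd, stripPrefix)

-- | Split the proposed format.yaml into the formatter name from its
-- first line and the untouched remainder, which would be passed on to
-- the chosen tool as its own configuration.
splitFormatYaml :: String -> Maybe (String, String)
splitFormatYaml contents =
  case lines contents of
    (firstLine : rest) -> do
      value <- stripPrefix "formatter:" firstLine
      let name = dropWhileEnd isSpace (dropWhile isSpace value)
      if null name then Nothing else Just (name, unlines rest)
    [] -> Nothing
```

For instance, `splitFormatYaml "formatter: fourmolu\nindentation: 2\n"` returns `Just ("fourmolu", "indentation: 2\n")`, so neither cabal nor stack would have to know anything about the options of any particular formatter.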

cabal fmt seems like a perfect candidate for the external command system for cabal we've got in progress, as discussed at An external command system for cabal: what would you do with it?

(and so does cabal commit for that matter).

11 Likes

@alanz was discussing what it would take for GHC to dogfood one of the formatting tools, and I would frankly be much more interested in seeing that happen than this.

Asking the community to just agree on one such tool doesn't really work.

Asking GHC to dogfood a formatting tool is a great way to make that tool better (since the tools themselves depend on GHC), and I think can set the stage for there later being community consolidation.

6 Likes

I agree, with the caveat that that declaration should not be specific to cabal, so that also stack can use this information.

Most formatters have their own config files. Perhaps you don't like that, but it doesn't seem so bad to me. In that situation the build-tool-specific configuration is one line, which also seems okay to me.

cabal fmt seems like a perfect candidate for the external command system for cabal we've got in progress

If we wanted to have a formatter: stanza in cabal.project then I think that would need to be really wired into cabal, though?

On a relatively smaller scale, cabal adopted automatic formatting with fourmolu.

The main pain point I witnessed was pull requests that were outstanding at the time of reformatting: depending on how stale they were, rebasing them could be time-consuming.

5 Likes

Excellent. Yes such dogfooding is how we get to standardization.

Several people have already said that it will be impossible to get everyone to sign up to one blessed code formatting standard. That was not really my central point, and I think I haven't articulated it clearly enough in my first post. I do not have strong opinions on code formatting styles, only that I like to work with a code formatter, since for me personally this removes the time sink of trying to hand-optimize layout. But I do have the strong opinion that since we have no agreed-upon standard way of saying whether a codebase adheres to a specific style, we currently have no way to use the same command in all codebases that we work on. That is, in some codebases we have to invoke fourmolu, in others ormolu or hindent or stylish-haskell. If we just had some agreed-upon way to declare our preferences as a first step, then it would be possible to add subcommands to cabal and stack (either directly or via plugins) to automatically install and invoke the right formatter with a command invocation that is the same for all formatters.

3 Likes

to automatically install and invoke the right formatter with a command invocation that is the same for all formatters.

If you're going into "automatic install" territory, and you like/are interested in nix, you might enjoy the idiom that I've seen some people use, which is to make a nix flake "app" that runs the formatter with the appropriate args. It's a bit overly complicated because you can't set args in the app directly (I don't know why) but there are workarounds. If I'm honest I don't love doing it (I tend to just use a Makefile), but it at least gets you a standard interface for running/installing it.

eh we should maybe extend cabal.project with x- fields that these external tools can use.

3 Likes