[Dream] Towards standard source code formatting

I’ve never used a Haskell formatter that didn’t affect how I wrote Haskell. When you’re considering how the formatter will mangle your code and changing how you solve the problem to avoid it (e.g. making it less point-free, not using a certain operator) then that formatter is being too big for its britches. I shouldn’t have that extra cognitive load - a formatter is supposed to reduce cognitive load.

I suspect if you want a standard formatter, it will have to be more “partial” than gofmt. Pick a few common things to standardize on (imports, type signatures, control syntax) and still allow the programmer to meaningfully use whitespace to manually format programs.

You’ll never capture and respect all the ways people use whitespace productively, so you’ll inevitably muck up someone’s code. This in general is why I’ve never bothered with a Haskell formatter ever - even though I use formatters in other langs.

That’s the big advantage Go and Rust have for automated formatters - those syntaxes are ugly as sin. Elm doesn’t have ugly syntax (it’s close to Haskell), but its formatter makes all Elm code sooo ugly.

I guess when people choose formatters, it’s in a setting where you are in the realm of brutal efficiency trade-offs like that (big corporate/open-source project + don’t trust all those programmers to have good taste) so it’s worth it to also make Haskell a little uglier :stuck_out_tongue:

4 Likes

This is essentially the point in the design space that stylish-haskell takes. WRT whitespace, Ormolu does let the programmer drive the formatter by the use of newlines, to some degree.

4 Likes

To be a little contrary: I don’t really understand the appeal of integrating every dev tool as a subcommand of one master tool.

If cabal fmt, with a formatter: <implementation> field in the .cabal file, is a desirable thing, are the following also desirable? They would also benefit from the build tool’s knowledge of what files are source files, and enable total newbies not to have to figure out what is the best external tool from the full universe of possibilities.

  • cabal commit, with a version-control: git|hg field in the .cabal file
  • cabal text-search, with a text-search: grep|rg field in the .cabal file

If these are bad ideas, why isn’t cabal fmt also? Is it just because Rust and Go have set the example and now it’s an expectation for people coming from those ecosystems?

3 Likes

I understand your point, and the proposal to add all these dev tools is somewhat contrary to the unix spirit of “one tool - one task”. But to go to the other extreme, we could also avoid using programming-language specific build tools like cabal and stack and only use general purpose build tools like make, bazel, nix etc. I think there are two reasons why we should have a fmt command but not a commit or text-search command. First, an experienced programmer from another language will know how to use git and grep, but won’t know what the Haskell ecosystem looks like, so integrating formatters into the build tool increases discoverability. Second, formatters need information from the build system (which files to format and which to ignore), and since they are not integrated they have to fall back on workarounds such as git ls-files or find, so integration improves usability.
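To make the git ls-files workaround concrete, here is a minimal sketch of the filtering a formatter wrapper has to do today. The function name formatTargets and the idea of passing ignored directory prefixes are my own assumptions for illustration, not something any of the existing tools expose.

```haskell
import Data.List (isPrefixOf, isSuffixOf)

-- | Keep only Haskell source files and skip anything under an ignored
-- directory prefix. The second argument is expected to be the lines
-- printed by `git ls-files` (paths relative to the repository root).
-- Hypothetical helper, not part of cabal, stack or any formatter.
formatTargets :: [FilePath] -> [FilePath] -> [FilePath]
formatTargets ignoredPrefixes = filter wanted
  where
    wanted path =
      ".hs" `isSuffixOf` path
        && not (any (`isPrefixOf` path) ignoredPrefixes)
```

A build tool that already knows its source directories would not need this guesswork, which is the usability argument above.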

3 Likes

I agree, with the caveat that that declaration should not be specific to cabal, so that also stack can use this information. Maybe it would be possible to agree on a standard file name format.yaml which is accepted by all source formatters. The only specification for that file would be that it has to start with the line formatter: xxx, where xxx can be any of the current formatters. After that first line come the configuration options of the specific tool. Fourmolu, stylish-haskell and hindent use the yaml file format, and ormolu doesn’t have a configuration file.
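As a rough sketch of how little a build tool would need to understand about such a file: the code below only reads the first line and hands the rest back untouched. The file name format.yaml and the formatter: key are the proposal from this post, not an existing convention, and parseFormatter is a name I made up.

```haskell
import Data.Char (isSpace)
import Data.List (stripPrefix)

-- | The formatters mentioned in this thread.
data Formatter = Ormolu | Fourmolu | StylishHaskell | Hindent
  deriving (Eq, Show)

-- | Read the formatter choice from the first line of a hypothetical
-- format.yaml. The remaining lines are returned unparsed, since they
-- belong to the chosen tool's own configuration.
parseFormatter :: String -> Either String (Formatter, [String])
parseFormatter contents =
  case lines contents of
    [] -> Left "empty file"
    (firstLine : rest) ->
      case stripPrefix "formatter:" firstLine of
        Nothing -> Left "first line must start with \"formatter:\""
        Just name ->
          case trim name of
            "ormolu" -> Right (Ormolu, rest)
            "fourmolu" -> Right (Fourmolu, rest)
            "stylish-haskell" -> Right (StylishHaskell, rest)
            "hindent" -> Right (Hindent, rest)
            other -> Left ("unknown formatter: " ++ other)
  where
    -- Strip leading and trailing whitespace from the value.
    trim = dropWhile isSpace . reverse . dropWhile isSpace . reverse
```

The point of the design is that cabal or stack never has to interpret tool-specific options, only the one shared line.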

cabal fmt seems like a perfect candidate for the external command system for cabal we’ve got in progress, as discussed at An external command system for cabal: what would you do with it?

(and so does cabal commit for that matter).

11 Likes

@alanz was discussing what it would take for GHC to dogfood one of the formatting tools, and I would frankly be much more interested in seeing that happen than this.

Asking the community to just agree on one such tool doesn’t really work.

Asking GHC to dogfood a formatting tool is a great way to make that tool better (since the tools themselves depend on GHC), and I think can set the stage for there later being community consolidation.

6 Likes

I agree, with the caveat that that declaration should not be specific to cabal, so that also stack can use this information.

Most formatters have their own config files. Perhaps you don’t like that, but it doesn’t seem so bad to me. In that situation the build-tool-specific configuration is one line, which also seems okay to me.

cabal fmt seems like a perfect candidate for the external command system for cabal we’ve got in progress

If we wanted to have a formatter: stanza in cabal.project then I think that would need to be really wired into cabal, though?

On a relatively smaller scale, cabal adopted automatic formatting with fourmolu.

Main pain point I witnessed is outstanding (at the time of reformatting) pull requests: depending on how stale they are, it could be time-consuming to rebase.

5 Likes

Excellent. Yes such dogfooding is how we get to standardization.

Several people have already said that it will be impossible to get everyone to sign up to one blessed code formatting standard. That was not really my central point, and I think I didn’t articulate it clearly enough in my first post. I do not have strong opinions on code formatting styles, only that I personally like to work with a code formatter, since for me this removes the time sink of trying to hand-optimize layout. But I do have the strong opinion that since we have no agreed-upon standard way of saying whether a codebase adheres to a specific style, we currently have no way to use the same command in all codebases that we work on. That is, in some codebases we have to invoke fourmolu, in others ormolu or hindent or stylish-haskell. If we just had some agreed-upon way to declare our preferences as a first step, then it would be possible to add subcommands to cabal and stack (either directly or via plugins) to automatically install and invoke the right formatter with a command invocation that is the same for all formatters.
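Roughly what I have in mind for the "same command everywhere" part, as a sketch only. formatCommand is a hypothetical name, and the in-place flags are my understanding of each tool's current command line, so please check them against each tool's --help before relying on them.

```haskell
-- | Map a declared formatter name to the executable and the arguments
-- that format the given files in place. Hypothetical dispatcher; the
-- flags below are assumptions about each tool's CLI, not a specification.
formatCommand :: String -> [FilePath] -> Maybe (String, [String])
formatCommand name files =
  case name of
    "ormolu" -> Just ("ormolu", ["--mode", "inplace"] ++ files)
    "fourmolu" -> Just ("fourmolu", ["--mode", "inplace"] ++ files)
    "stylish-haskell" -> Just ("stylish-haskell", ["--inplace"] ++ files)
    -- Assumed: hindent rewrites the files it is given without extra flags.
    "hindent" -> Just ("hindent", files)
    _ -> Nothing
```

A cabal or stack subcommand (or plugin) could look up the project's declared formatter, call something like this, and run the result, for example with callProcess from the process package.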

3 Likes

to automatically install and invoke the right formatter with a command invocation that is the same for all formatters.

If you’re going into “automatic install” territory, and you like/are interested in nix, you might enjoy the idiom that I’ve seen some people do which is to make a nix flake “app” that runs the formatter with the appropriate args. It’s a bit overly complicated because you can’t set args in the app directly (I don’t know why) but there are workarounds. If I’m honest I don’t love doing it (I tend to just use a Makefile), but it at least gets you a standard interface for running/installing it.

eh we should maybe extend cabal.project with x- fields that these external tools can use.

3 Likes

I have switched from being anti to being pro-formatting, after starting on a codebase that had machine formatting by default, and realising how liberating it is. Bang in some code anyhow, hit format and move on. No hassles (it works even better if the language uses braces and the like, rather than layout).

And I think the specific style of formatting being used is a red herring. The advantage comes from having a standard style, which can be quickly and easily applied. Whatever the style is does not really matter.

I believe when Elixir introduced a formatter as standard, they asked people to just use it for a year, then they would address feedback. By then everyone was used to it, and saw the benefits, and there wasn’t further conflict.

And if it is built into the tooling, different projects can choose different formatters. And I suspect over time a small set would become standard, accepted by people as “the way code looks”, and well maintained.

13 Likes

Preface: My experience with auto-formatting is generally horrible (granted in a different language, namely Scala), because I’m the type of developer who tends to refactor a lot of shared code, because I tend to work on shared infrastructure code.

It’s horrible because it leads to absurdly large diffs (and hence git conflicts) because it will tend to re-flow code for no good reason. Now, Haskell is more resilient to this effect when using the whitespace sensitive syntax, but it’s not immune due to e.g. aligning ‘<-’ preferences (and similar wrt. [ and , placement for lists.)
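A small sketch of the alignment effect, in case it isn’t obvious why one extra binding can touch a whole block. The three function names are made up for this illustration and don’t model any particular formatter.

```haskell
-- | Render do-block bindings with the arrows aligned in one column.
renderAligned :: [(String, String)] -> [String]
renderAligned bindings =
  [ name ++ replicate (width - length name) ' ' ++ " <- " ++ rhs
  | (name, rhs) <- bindings
  ]
  where
    -- Width of the longest bound name (0 for an empty block).
    width = maximum (0 : map (length . fst) bindings)

-- | Render the same bindings with a single space before each arrow.
renderPlain :: [(String, String)] -> [String]
renderPlain bindings = [name ++ " <- " ++ rhs | (name, rhs) <- bindings]

-- | Count lines of the old rendering that no longer appear verbatim in
-- the new one, a crude stand-in for what a line-based diff reports.
changedLines :: [String] -> [String] -> Int
changedLines old new = length (filter (`notElem` new) old)
```

With bindings x and yy, adding a third binding called config makes both existing lines count as changed under renderAligned, and none under renderPlain. That is the kind of noise that ends up in merge conflicts.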

The thing is: I wouldn’t have a problem accepting ‘arbitrary’ choices as long as it wouldn’t lead to absurd headaches figuring out what changed when I have a merge conflict.

(This may ultimately be a problem with Git and similar line-based-conflict resolution, but git is the world we live in… and I happen to value git extremely highly when you want to maintain a high standard of source provenance, etc.)

1 Like

As I think you’re alluding to here… I don’t think there’d be any problem with a “format:” field in a cabal file specifying which (cabal-compiled) formatter to use. There might be something to figure out about which version (and its dependencies and such), but that should be figure-outable.

I think the point of the Ormolu formatting choices is to make them diff-friendly. Hence the extensive use of newlines.

As a maintainer of Fourmolu, I have to admit that we (and Ormolu) sometimes split things across lines in ways which can be rather ugly and even make diffs worse by changing the indentation of whole blocks.

5 Likes

Interesting, I also refactor a lot of shared code that is formatted with Ormolu and I don’t really experience this as a problem. I’m not sure why that might be. Perhaps it’s because Ormolu is better for diffs than the Scala formatter you use. I never get “absurdly large” diffs with Ormolu (at least not large when viewing with git diff -w). Alternatively, it might be a difference in workflow. I tend to make “absurdly small” commits which tend to be easier to resolve when conflicts arise. Additionally I have a well-specified approach to resolving rebase conflicts.

FYI it seems that brittany may actually be maintained: tomejaguar comments on Good Haskell code formatters?