> Hopefully he’ll be along to share.
Of course! Sorry, this has been kicking around in my head for a while, so it’ll be a long post.
First off, I don’t think that “single compiler” or “multiple compiler” is quite the most useful way to think about the issue. Rather, I’d want to think in terms of “implementation-defined languages” vs “standardized languages”. In other words, if there’s a piece of paper describing the language and it disagrees with the compiler, which one is buggy?
Haskell started off as clearly being a standardized language. It arose from a milieu in which there were a variety of lazy functional languages, all of which had quite similar capabilities, but none of which were compatible. The lack of compatibility led to all kinds of annoying work porting libraries from one to the next. The initial push was to find a kind of common core language that all of these projects could implement, but as committees full of researchers are wont to do, they ended up innovating nonetheless, leading to things like monadic IO and type classes.
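For readers who weren’t around for that era, here’s a minimal sketch of what those two innovations look like together: a type class selecting behavior by type, and IO sequenced monadically. (The `Describe` class is a made-up example for illustration, not anything from the Report.)

```haskell
-- A type class: one interface, instances chosen by type.
class Describe a where
  describe :: a -> String

instance Describe Bool where
  describe b = if b then "yes" else "no"

instance Describe Int where
  describe n = "the number " ++ show n

-- Monadic IO: effects are ordinary values, sequenced by do-notation.
main :: IO ()
main = do
  putStrLn (describe True)
  putStrLn (describe (42 :: Int))
```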
The Haskell language report was never fully formal the way that something like The Definition of Standard ML is. Nonetheless, even if it wasn’t specific enough to mathematically prove that a compiler had a bug, reasonable people could read it and use it.
A related discussion is the tension between prescriptive and descriptive approaches to human language. While there is no official standard for English, there are a number of respected bodies that have standardized particular versions of the language. Other languages indeed have defining bodies, like Dansk Sprognævn for Danish. These standardized languages exist in tension with the community of speakers - the standard mostly codifies linguistic changes that have already become popular in the community, but the standard also feeds back into usage through things like red underlines from spellcheckers and friendly notes about grammar from co-authors. The feedback mechanisms are complex and tied up with the distribution of power in the community of speakers. In reality, this is the relationship between most standard descriptions of programming languages and their implementations as well - innovations begin in implementations and then flow back to the standards, and powerful implementations have an easier time getting things into the standard (see, for instance, EME in HTML5).
This doesn’t happen in Haskell anymore. I’m not taking a position here either way on whether it should, just pointing out that it doesn’t. Today, Haskell 2010 does not describe any usable implementation, and divergence from Haskell 2010 is not considered to be a bug. For instance, Haskell 2010 indicates that `fail` is a part of `Monad`, and does not require a `Functor` or `Applicative` superclass.
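For concreteness, here’s that class as the Report actually specifies it (the declaration below is transcribed from the Haskell 2010 Report; the trailing comment describes GHC’s current `base`):

```haskell
{-# LANGUAGE NoImplicitPrelude #-}
-- Monad exactly as the Haskell 2010 Report defines it: fail is a
-- class method, and there is no Functor or Applicative superclass.
module Report2010 where

import Prelude (String)

class Monad m where
  (>>=)  :: m a -> (a -> m b) -> m b
  (>>)   :: m a -> m b -> m b
  return :: a -> m a
  fail   :: String -> m a

-- In GHC's base today, by contrast, Monad has an Applicative
-- superclass (class Applicative m => Monad m), and fail lives in
-- a separate MonadFail class. Instances written against the Report
-- alone no longer typecheck unchanged.
```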
One important difference between implementation-defined languages, in which a single canonical implementation serves as the spec for any alternatives (e.g. Racket or Rust), and standardized languages (e.g. Scheme or C++) is that the relationship between the compiler and the tooling is different. For Racket or Rust, tooling should generally support racket or rustc, and other compilers should additionally present similar pragmatics if they’d like to integrate into the tool ecosystem. For Scheme or C++, tools like geiser or cmake treat the compiler as a pluggable component that may require some portability shims but ideally won’t. These days, implementation-defined languages frequently treat the language, the build system, the documentation tools, and other important parts of the system as an integrated whole, developed in concert with a view towards giving the user an easy time of things.
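To make “portability shims” concrete, here’s a sketch of the compiler-conditional code that multi-implementation Haskell sources used to carry. I believe `__GLASGOW_HASKELL__` and `__HUGS__` are the macros GHC and Hugs define, respectively:

```haskell
{-# LANGUAGE CPP #-}
-- One source file that compiles under more than one implementation,
-- with compiler-specific parts selected by the preprocessor.
module Main where

compilerName :: String
#if defined(__GLASGOW_HASKELL__)
compilerName = "GHC"
#elif defined(__HUGS__)
compilerName = "Hugs"
#else
compilerName = "some other Haskell implementation"
#endif

main :: IO ()
main = putStrLn ("Compiled with: " ++ compilerName)
```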
Standardized languages, by paying the cost of worse integration, gain many benefits. Having multiple implementations means that each can specialize (e.g. interactive environment vs compiler speed vs generated code speed vs platform support vs static analysis). It also means that no single implementor controls the language, which helps maintain an alignment of interests between users and implementors, because the implementors can’t “go rogue” as easily. And it allows a more deliberative approach to language evolution, because competing interests have a way of sharpening arguments and ideas.
I think that, in Haskell, we have the social dynamics of a standardized language. Because other implementations are possible (at least in our minds), we maintain a notional separation between GHC, GHCup, Cabal, Stack, Haddock, HLS, etc. There is no “Haskell project” the way there is a Rust project or a Racket project, but rather there’s a variety of projects making useful Haskell things. The upside of this is that we can integrate new compilers, which can be a wonderful thing (e.g. the SML and Common Lisp worlds get lots of value from this, and it used to be common in Haskell to use Hugs for interactive development and GHC for building fast binaries). The downside of this is that it becomes harder to address cross-cutting concerns, we end up with more brittle integrations, and we risk more community splits. And social dynamics tend to be reflected in software architecture as well.
On the other hand, I also think it’s highly unlikely that another useful Haskell compiler will come into existence today (barring a couple of specialized use cases like Helium). A big part of the value of a programming language comes from network effects, and being compatible with all the great things on Hackage will require GHC-compatible implementations of things like Template Haskell, generics, GADTs, higher-rank polymorphism, etc. I think it’s much more likely that the future of Haskell compiler development lies in improvements to GHC. The benefits of a standardized language are a bit moot if only one implementation seems realistic.
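To give a sense of that surface area, here’s a small, self-contained sample of two of those features - GADTs and higher-rank polymorphism - neither of which is in Haskell 2010, and both of which an alternative compiler would have to reproduce faithfully to build much of Hackage:

```haskell
{-# LANGUAGE GADTs, RankNTypes #-}
module GhcSurface where

-- A GADT: each constructor refines the type index, and pattern
-- matching recovers that type information.
data Expr a where
  IntE  :: Int  -> Expr Int
  BoolE :: Bool -> Expr Bool
  IfE   :: Expr Bool -> Expr a -> Expr a -> Expr a

eval :: Expr a -> a
eval (IntE n)    = n
eval (BoolE b)   = b
eval (IfE c t e) = if eval c then eval t else eval e

-- A higher-rank type: the function argument must itself be
-- polymorphic, so it can be applied at both Int and Bool.
applyToPair :: (forall x. x -> x) -> (Int, Bool) -> (Int, Bool)
applyToPair f (n, b) = (f n, f b)
```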
These are just my thoughts about our situation - no specific recommendation of a path forward is in this post! That’s up to the various Haskell projects out there to think about and coordinate on.
TL;DR:
- Haskell is formally a standards-defined language, but is now an implementation-defined language in all but name (there is no maintained implementation of the most recent standard).
- Our social organizations and project structuring are based on the assumptions of a standards-defined language, which has costs and benefits. We pay the costs but don’t get most of the benefits anymore.