Call for Sponsors: First-class Liquid Haskell

Dear all,

The Liquid Haskell development team and Tweag are looking for funding to make Liquid Haskell a first-class tool for Haskell development. If you or your organization are impacted by any of the issues presented here, or would like to foster adoption of Liquid Haskell, please, consider funding this effort. Reach out to facundo.dominguez@tweag.io (@facundodomiguez on GitHub) to get involved.

Motivation

The larger a software system is, the harder it is to find bugs in it through testing, and the harder it is to change it without breaking it, which is crucial for long-lived mission-critical systems. By far, the sharpest tool available to support this goal is static type systems. As many have reported [1, 2, 3, 4], static type systems support quite strong guarantees, and in particular, they support fearless refactoring.

Now, Haskell projects that rely on the type system often face a trade-off between the properties that they check at compile time and the difficulty of reading and modifying the code, resulting in many verifiable properties being checked only by testing. Liquid Haskell is a static verification tool that resolves this tension, first by allowing properties to be expressed with little modification to the Haskell program, and second, by offloading much of the verification effort to automated theorem provers (more specifically, SMT solvers). The net effect is that Liquid Haskell makes it easier to check more properties at compile time while keeping code readability maximal.

Liquid Haskell can help speed up parsing in network libraries, it can enforce data access policies in web applications, check the safe use of pointer arithmetic, check data structure invariants, simplify the developer experience when embedding other programming languages in Haskell, and more.

Liquid Haskell has already been used in industry, but we seek to make it easier to adopt. By eliminating the engineering obstacles that make Liquid Haskell hard to use today, Liquid Haskell could become a compelling new reason to use Haskell. If Haskell is good at refactoring large code bases, Liquid Haskell would boost it even farther away on this axis. Imagine a production language that natively supports a gentle slope to increasing levels of static verification. Therefore, the focus of this proposal is on smooth integration with the Haskell compiler (GHC) and the Haskell ecosystem.

Roadmap

The immediate goal is to make Liquid Haskell compatible with the latest GHC 9.x releases, and to make future upgrades cheaper. Currently, if a project uses an unsupported version of GHC, it can’t use Liquid Haskell. Moreover, even if a project can use Liquid Haskell, it needs to assume the cost of migrating it later to a newer version of the compiler.

We then address the usability issues that affect specific cases that we have encountered in our experience, and which would prevent users from making effective use of Liquid Haskell in these cases.

Task Group 1 - Streamline upgrades

Task Group 2 - Better performance, better integration

Task Group 3 - Better user experience

Task Group 4 - Better proof automation

Summary

Call for sponsors

Task Group 1 - Streamline upgrades

Upgrade Liquid Haskell to work with GHC 9.x releases

Liquid Haskell is implemented as a GHC plugin. Hence, when GHC compiles a module, if the plugin is enabled, Liquid Haskell will receive the specifications and the code in the module, and it will try to verify it. This allows Liquid Haskell to reuse much code from GHC, but it also means that Liquid Haskell needs to be made aware of the internal differences between different versions of GHC.

At the moment, Liquid Haskell only supports GHC 9.2.

Specific tasks

We will shorten the upgrade time by better isolating the code that Liquid Haskell reuses from GHC. We estimate that upgrade time can be halved. After this is done, we will update Liquid Haskell to support GHC 9.4 and then GHC 9.6. Full details are discussed in the corresponding github issue.

Deliverables

There will be new releases of Liquid Haskell for GHC 9.4, GHC 9.6, and a GitHub workflow to test Liquid Haskell against pre-releases.

Timeline
Task Complexity Estimate (Weeks)
Isolate the uses of the GHC API Low 2
Upgrade to GHC 9.4 Medium 1
Upgrade to GHC 9.6 Medium 1
Setup GitHub workflow to test against GHC pre-releases Low 0.5
TOTAL 4.5

Drop non-essential dependencies of Liquid Haskell

Liquid Haskell depends on a few packages that have large sets of transitive dependencies that may lag behind the latest GHC. If Liquid Haskell is to move swiftly to new versions of the compiler, these dependencies need to be updated to build with the latest GHC version as well. The fewer dependencies we have, the less work we have to do.

Specific tasks

We will reduce the dependencies of Liquid Haskell. We envision 3 ways of achieving this reduction:

  1. Some dependencies can be replaced with simpler ones. For instance, aeson could be replaced with json.
  2. Some dependencies can be dropped at the expense of changing the code style a bit (e.g. dropping optics). And finally,
  3. Some dependencies can be put behind a feature flag, so the dependency is necessary only if the user asks for a specific feature (e.g. using store to save internal dumps of LH instead of a slower alternative).

Deliverables

A new release of Liquid Haskell with a reduced set of transitive dependencies.

Timeline
Task Complexity Estimate (Weeks)
Replace, drop, and disable LH dependencies Low 2
TOTAL 2

Task Group 2 - Better performance, better integration

Identify and disable expensive checks

The time that Liquid Haskell employs in verification should solely depend on the quantity of specifications that it contains, alongside those from its transitively imported modules. Enabling Liquid Haskell on a project without specifications should, in principle, not require any time overhead.

We have found, however, that enabling Liquid Haskell in some modules without any specifications takes significantly long.

Specific tasks

We will profile the time spent by Liquid Haskell and identify the problematic characteristics of the modules. The aim is to identify the checks and algorithms that are consuming the most time, and propose ways to turn off these algorithms or tune their behavior to spend less upfront time.

Deliverables

A report of the discovered issues and a release of Liquid Haskell with the implemented optimizations.

Timeline
Task Complexity Estimate (Weeks)
Collect modules with long verification times and dive into some of them to diagnose performance problems Medium 2
TOTAL 2

Allow to verify modules when using -haddock

The -haddock flag of GHC has the effect of collecting haddock comments in interface files. The current behavior of Liquid haskell is to skip verification if the flag is enabled. In our experiments, however, reenabling verification in this case seems to work fine.

The reason to skip verification is that the -haddock flag is also used in scenarios where verification is affected, so in the past, the Liquid Haskell developers decided to skip verification for all cases. In particular, verification fails when cabal haddock is invoked.

Specific tasks

Diagnose the issue with cabal haddock further and either fix this issue or arrange to skip verification in a narrower amount of cases.

Deliverables

A fix and tests for Liquid Haskell.

Timeline
Task Complexity Estimate (Weeks)
Implement support for cabal haddock Medium 1
TOTAL 1

Fix crashes on higher-kinded polymorphism

At least the composition operator from Control.Category, which has a more general type than the standard composition from Prelude, causes Liquid Haskell to choke. Here is the issue.

Specific tasks

We will diagnose the type errors that it produces and implement a remediation.

Deliverables

A fix of the problem plus regression tests.

Timeline
Task Complexity Estimate (Weeks)
Fix and test crashes on higher-kinded polymorphism Medium 1
TOTAL 1

Task Group 3 - Better user experience

Allow defining instance specifications away from their definitions

Sometimes it is necessary to reason with a property that is specific to a particular instance of a typeclass. One such example could be the “IsString Text” instance, which is used by the OverloadedStrings language extension to create Text values from literal strings.

We might want to say that fromString produces a Text value of the same length as the input string. In Liquid Haskell speak:

fromString :: xs:String -> { v:Text | tlength v == len xs }

However, this property is only sensible to express for the “IsString Text” instance, not for the IsString class which abstracts over the concrete type to which strings are converted.

Liquid Haskell has a mechanism to give specifications to the methods of typeclass instances. But unfortunately, these specifications seem to be ignored if they are not written in the same module where the corresponding type class instance is defined. This is a problem in practice, because this usage pattern is common. This test of Liquid Haskell shows the feature in action.

Specific tasks

We will implement the ability to add these specifications in downstream modules, like any other assumption and specification about dependencies. In the past, discovering how to expose other specifications away from their definitions has taken a couple of days. However, the typeclass support is rather new in Liquid Haskell, and there might be other unexpected complexities.

Deliverables

A fix and tests integrated in the Liquid Haskell GitHub repository.

Timeline
Task Complexity Estimate (Weeks)
Allow typeclass instance specifications away from their definitions Medium 1
TOTAL 1

Allow reflection of functions away from their definition

When Liquid Haskell verifies a module, sometimes it needs to unfold function calls that appear in specifications. This requires Liquid Haskell to know the defining equations of these functions.

At the moment, Liquid Haskell can be provided these equations by translating the function

from Haskell to the logic language with a reflection mechanism. Reflection, however, needs to happen at the definition site of the function. The definition cannot be reflected if it is in a dependency which is not verified with Liquid Haskell. The workaround is to fork dependencies and configure them for reflection. But this is a tedious overhead in a project with many dependencies. Imagine having to fork the base package as soon as we need to teach LH to unfold the list append (++) function. That would be impractical.

Specific tasks

We will extend Liquid Haskell with a mechanism that allows defining equations to be introduced into the logic away from the original definition site of the functions. Further discussion can be found in this ticket.

Deliverables

Implementation, documentation, and tests for reflecting functions away from their definitions.

Timeline
Task Complexity Estimate (Weeks)
Implement, tests, and documentation Medium 4
TOTAL 4

Make data type reflection automatic

When specifications refer to data constructors or record fields of Haskell Algebraic Data Types (ADTs), these types need to be reflected in the logic. Currently, Liquid Haskell requires datatype definitions to be copied into a special comment. This is tedious, error-prone, and could be done automatically.

The need to use ADTs in specifications arises whenever we want to describe values of the corresponding types. For instance, we might want to say that a list is not empty as xs /= [], or we might want to say that an expression is well-typed as inferType … e == Just ….

Specific tasks

Here is a ticket discussing the issue together with an initial attempt to solve it. We will save the user the trouble of copying these definitions into special comments at least in the cases where no changes are needed to the original definitions.

Deliverables

Implementation, documentation, and tests.

Timeline
Task Complexity Estimate (Weeks)
Implement, tests, and documentation Medium 3
TOTAL 3

Augment the environment of error messages

When Liquid Haskell reports errors, it reports part of the hypotheses in which a goal could not be proved. Although we did improve the information that Liquid Haskell provides, sometimes relevant hypotheses are still omitted in the report, which could help diagnose the failure.

Currently, one has to inspect file dumps in order to retrieve all hypotheses and understand why the goal could not be proved. This issue affects the interaction with Liquid Haskell at every proof obligation that needs to be validated, thus it is a case of a frequent activity that can be simplified.

Specific tasks

We will collect all of the relevant hypotheses in the error report without requiring the user to dive into file dumps.

Deliverables

Implementation, documentation, and tests for listing all relevant hypotheses in error messages.

Timeline
Task Complexity Estimate (Weeks)
Implement, test, and document Medium 2
TOTAL 2

Fix errors about seemingly redundant imports

Sometimes Liquid Haskell fails to verify a module if some seemingly redundant import declaration is missing. The error reported by Liquid Haskell does not yield insights on the cause, which most likely is the consequence of a bug in Liquid Haskell. Occurrences of this bug can

be found in the stitch-lh tests in the Liquid Haskell testsuite.

In a project with multiple modules and transitive imports, this issue makes difficult to guess

the needed imports when maintaining the code.

Specific tasks

This task is about either removing the need for the import declaration or improving the error message.

Deliverables

Diagnose and fix.

Timeline
Task Complexity Estimate (Weeks)
Diagnose and fix Medium 1
TOTAL 1

Task Group 4 - Better proof automation

Allow reflection of functions which call non-reflected functions

When Liquid Haskell verifies a module, sometimes it needs to unfold function calls that appear in specifications. This requires Liquid Haskell to know the defining equations of these functions.

Currently, Liquid Haskell can be provided these equations by translating the functions

from Haskell to the logic language with a reflection mechanism. Liquid Haskell, however, insists that all functions called by a reflected function are reflected as well. This restriction is too strong when the programmer is interested in verifying only part of a program, or when the called functions are considered primitive for the task at hand. Breaking up the verification effort is a crucial ability to control the scope of a project.

This need can arise, for instance, when we need to unfold a function that manipulates Maps from Data.Map. If we already have specifications for the operations of Maps, there would be little interest in having Liquid Haskell reflect their implementation.

Currently, the user can workaround this limitation by using a combination of assume and measure directives. But these directives weren’t designed for this use case, and the idiom comes with usability concerns that we wish to address. One of the difficulties is that the workaround needs to be applied as many times as unreflected functions are called by the function we want to reflect. Another difficulty is that it is possible to apply it wrong, and the many repetitions make it more likely.

Specific work

We will implement this proposal to reflect functions even when they call unreflected functions.

Deliverables

Implementation, documentation, and tests for reflecting functions away from their definitions.

Timeline
Task Complexity Estimate (Weeks)
Implement, test, and document Medium 4
TOTAL 4

Better proof obligations: use sub-formulas when unfolding definitions

Liquid Haskell can unfold function definitions when it builds proof obligations for the underlying theorem prover. It does so by replacing a function call with the right hand side of one of the defining equations for the function. However, when there is more than one such equation, extra information is needed in order to decide which equation to use. This information is inferred from the context in which the function call appears. At the moment, the context includes hypotheses, but it doesn’t incorporate the logical structure surrounding the function call in the formula where it appears. For example, say we need to prove that a call like null xs yields True, with

null :: [a] -> Bool
null [] = True
null (x:xs) = False

If Liquid Haskell is given a hypothesis that says that xs is the empty list [], then it will present the result of unfolding null xs using the first equation to the theorem prover. However, if we want to prove xs=[] => null xs with no hypothesis, Liquid Haskell won’t be able to unfold null because the antecedent xs=[] is not in the context that it uses, and then it doesn’t know which equation of null to use. This limitation can be overcome, as we can usually modify the program being verified to bring the hypothesis we need into the context, but it would be better if we would not have to modify the program at all.

Formulas of the form v => something can appear, for instance, when we want to say that a function returning True offers some guarantees about the input values.

isValid :: i0:Input0 -> i1:Input1 -> { v:Bool | v => isGood i0 }

But the validation is complex enough that we don’t want to characterize the result with an equality v == isGood i0 && isGoodToo i1.

Specific tasks

We will improve Liquid Haskell’s proof capacity, by making it use the information in sub-formulas - like the antecedent in the previous example - when deciding which equations to use when unfolding.

Deliverables

Implementation, documentation, and tests for discovering assumptions in subformulas.

Timeline
Task Complexity Estimate (Weeks)
Implement, test, and document Medium 3
TOTAL 3

Allow composing function specifications

Today, when we give a specification to a function, we have to choose which properties to assert in it. For instance, the following specification talks about the lengths of ByteStrings.

append
  :: a:ByteString
  -> b:ByteString
  -> { v:ByteString | bslen v == bslen a + bslen b }

But some applications of ByteString might require making considerations about the encoding instead. Perhaps,

append
  :: a:ByteString
  -> b:ByteString
  -> { v:ByteString | isUtf8 a && isUtf8 b ⇒ isUtf8 v }

We can combine the two

append
  :: a:ByteString
  -> b:ByteString
  -> { v:ByteString
     | bslen v == bslen a + bslen b && isUtf8 a && isUtf8 b ⇒ isUtf8 v
     }

but this has a couple of disadvantages. First, the specification might start becoming more involved than it needs to be for each particular use case. And second, we need to literally copy the pieces of the former specifications to create the new one.

Specific tasks

We will design and implement a mechanism to compose the specifications, and the composition will be possible to constrain only to the scenarios where it is needed.

Deliverables

Design, implementation, and tests of the composition of specifications.

Timeline
Task Complexity Estimate (Weeks)
Allow composing function specifications Medium 3
TOTAL 3

Summary

Task Group 1 - Streamline upgrades

Currently, Liquid Haskell releases lag behind GHC releases. This issue covers both updating Liquid Haskell to support the latest GHC versions and to shorten the time it takes to release Liquid Haskell after a new release of GHC.

Task Group 2 - Better performance, better integration

Executing Liquid Haskell should not affect performance in non-annotated modules and unexpected blockers should be eliminated.

Task Group 3 - Better user experience

To be used day long pleasantly, a number of improvements to Liquid Haskell must be made.

Task Group 4 - Better proof automation

Whenever possible, manual tweaks should be avoided. Proof automation should be maximized.

Call for sponsors

The maintainers of Liquid Haskell are a small team and the current Liquid Haskell version is the result of their hard work. To get Liquid Haskell usable in a variety of production environments, by Haskell programmers of all levels, some improvements are needed.

As the roadmap above exemplifies, the set of tasks is not overwhelming: with a concerted funding effort, we will be able to make Liquid Haskell a user-friendly powerful tool, augmenting our capacity to prove that Haskell programs meet the highest quality standards!

If you or your organization:

  • are impacted by these issues,
  • have issues related to Liquid Haskell usability or performance,
  • would like to prioritize or accelerate certain work,

Please, consider supporting this effort. Reach out to facundo.dominguez@tweag.io (@facundodomiguez on GitHub) to get involved.

35 Likes

Haskell Foundation ED here. I’ve been watching Liquid Haskell for years now. I think it has a ton of potential to bridge the gap between the kinds of lightweight invariants that Haskell’s type system is so good at making easy to check and the more involved ones used in full formal verification. It has not yet seen the kind of wide adoption that could make it a game-changer in either industry or open-source development, however. This project can remove barriers to adoption and keep the project healthy.

If you have an application where you think Liquid Haskell could bring value, but you’re not yet sure whether it can work for your team, I encourage you to get in touch with Facundo about helping fund the work necessary to tick off those last few checkboxes. I also suspect that if your needs are broadly in alignment with this program but differ in the details, he’d also be interested.

17 Likes

As functional programmers, we love to talk about how functional programming makes it easier to get our program correct, and easier to reason about our programs; but we don’t actually do much reasoning, at least not at scale in a machine-supported way.

That’s not quite true: static type systems are the world’s most widely used formal reasoning tool,
and Haskell’s type system is right up there on the bleeding edge. Theorems (called types) are part of the code; and are verified by the compiler whenever you compile your program.

But even Haskell’s type system has its limits. There are plenty of places in my program when I say “I know this list cannot be empty” or “I know this key will be this finite map, so I don’t need to worry about the Nothing case of the lookup”; but alas the type system is not expressive enough to guarantee these claims. GHC itself is littered with assertions like this, which are dynamically checked when we compile with -DDEBUG.

What I really want is a static verification tool that comes “in the box”

  • that integrates nicely with GHC and with HLS;
  • that handles the Haskell we use in practice (not just a toy subset); and
  • that allows me to state properties of my code that are more advanced than the type system can express, but which are no less foundational to the correct operation of my code.

If we had this, we would have a huge new “sales proposition” for Haskell: a production language with built-in support for formal verification, and a “gentle slope” so that you can move from static types up to Liquid Haskell types without a sharp discontinuity in either concepts or in tooling.

In this space, for Haskell at least, Liquid Haskell is the only game in town.

  • It aspires to all of these goals
  • It is the result of a ten-year research program
  • It already works, and it works on full Haskell because it uses GHC as its front end; and
  • From a technical point of view it is absolutely up to the job.

But making all this work in practice, at scale on real Haskell code, with smooth tooling, is a massive engineering effort. Tackling part of this engineering backlog is what this proposal is all about, and I support it strongly. If we can get to the point where LH is genuinely usable, we should get into a virtuous spiral in which it is rewarding to contribute becuase it makes a good thing better.

Go Liquid Haskell!

32 Likes

While working on GHC recently, I too noticed there are a lot of functions that have hidden preconditions (especially in code generation). I would love to see Liquid Haskell being used to make those explicit.

10 Likes

Not much substantive to add here, other than to echo: Go Liquid Haskell!

Type systems have succeeded as a way to reason about code because they strike a great balance between expressive power and ease-of-use. I think LH-style refinement types meet this same high bar: they are wonderfully expressive and, supported by a powerful solver, easy to use. Let’s show the world that it’s true!

11 Likes

Since they’re comments, I wonder if you could add properties directly into base at little cost. It’ll a lot of head woes or division by zero issues while technically not breaking most libraries dependent on base.

1 Like

That’s a good question. LH does check that the input to head is non-empty and division is not by zero without having to modify base. Packages in general can have specifications written in separate packages. The current behavior is tersely explained here and more context is provided in this post.

4 Likes
3 Likes