Dear all,
The Liquid Haskell development team and Tweag are looking for funding to make Liquid Haskell a first-class tool for Haskell development. If you or your organization are impacted by any of the issues presented here, or would like to foster adoption of Liquid Haskell, please, consider funding this effort. Reach out to facundo.dominguez@tweag.io (@facundodomiguez on GitHub) to get involved.
Motivation
The larger a software system is, the harder it is to find bugs in it through testing, and the harder it is to change it without breaking it, which is crucial for long-lived mission-critical systems. By far, the sharpest tool available to support this goal is static type systems. As many have reported [1, 2, 3, 4], static type systems support quite strong guarantees, and in particular, they support fearless refactoring.
Now, Haskell projects that rely on the type system often face a trade-off between the properties that they check at compile time and the difficulty of reading and modifying the code, resulting in many verifiable properties being checked only by testing. Liquid Haskell is a static verification tool that resolves this tension, first by allowing properties to be expressed with little modification to the Haskell program, and second, by offloading much of the verification effort to automated theorem provers (more specifically, SMT solvers). The net effect is that Liquid Haskell makes it easier to check more properties at compile time while keeping code readability maximal.
Liquid Haskell can help speed up parsing in network libraries, it can enforce data access policies in web applications, check the safe use of pointer arithmetic, check data structure invariants, simplify the developer experience when embedding other programming languages in Haskell, and more.
Liquid Haskell has already been used in industry, but we seek to make it easier to adopt. By eliminating the engineering obstacles that make Liquid Haskell hard to use today, Liquid Haskell could become a compelling new reason to use Haskell. If Haskell is good at refactoring large code bases, Liquid Haskell would boost it even farther away on this axis. Imagine a production language that natively supports a gentle slope to increasing levels of static verification. Therefore, the focus of this proposal is on smooth integration with the Haskell compiler (GHC) and the Haskell ecosystem.
Roadmap
The immediate goal is to make Liquid Haskell compatible with the latest GHC 9.x releases, and to make future upgrades cheaper. Currently, if a project uses an unsupported version of GHC, it can’t use Liquid Haskell. Moreover, even if a project can use Liquid Haskell, it needs to assume the cost of migrating it later to a newer version of the compiler.
We then address the usability issues that affect specific cases that we have encountered in our experience, and which would prevent users from making effective use of Liquid Haskell in these cases.
Task Group 1 - Streamline upgrades
- Upgrade Liquid Haskell to work with GHC 9.x releases
- Drop non-essential dependencies of Liquid Haskell
Task Group 2 - Better performance, better integration
- Identify and disable expensive checks
- Allow to verify modules when using -haddock
- Fix crashes on higher-kinded polymorphism
Task Group 3 - Better user experience
- Allow defining instance specifications away from their definitions
- Allow reflection of functions away from their definition
- Make data type reflection automatic
- Augment the environment of error messages
- Fix errors about seemingly redundant imports
Task Group 4 - Better proof automation
- Allow reflection of functions which call non-reflected functions
- Better proof obligations: use sub-formulas when unfolding definitions
- Allow composing function specifications
- Task Group 1 - Streamline upgrades
- Task Group 2 - Better performance, better integration
- Task Group 3 - Better user experience
- Task Group 4 - Better proof automation
Task Group 1 - Streamline upgrades
Upgrade Liquid Haskell to work with GHC 9.x releases
Liquid Haskell is implemented as a GHC plugin. Hence, when GHC compiles a module, if the plugin is enabled, Liquid Haskell will receive the specifications and the code in the module, and it will try to verify it. This allows Liquid Haskell to reuse much code from GHC, but it also means that Liquid Haskell needs to be made aware of the internal differences between different versions of GHC.
At the moment, Liquid Haskell only supports GHC 9.2.
Specific tasks
We will shorten the upgrade time by better isolating the code that Liquid Haskell reuses from GHC. We estimate that upgrade time can be halved. After this is done, we will update Liquid Haskell to support GHC 9.4 and then GHC 9.6. Full details are discussed in the corresponding github issue.
Deliverables
There will be new releases of Liquid Haskell for GHC 9.4, GHC 9.6, and a GitHub workflow to test Liquid Haskell against pre-releases.
Timeline
Task | Complexity | Estimate (Weeks) |
---|---|---|
Isolate the uses of the GHC API | Low | 2 |
Upgrade to GHC 9.4 | Medium | 1 |
Upgrade to GHC 9.6 | Medium | 1 |
Setup GitHub workflow to test against GHC pre-releases | Low | 0.5 |
TOTAL | 4.5 |
Drop non-essential dependencies of Liquid Haskell
Liquid Haskell depends on a few packages that have large sets of transitive dependencies that may lag behind the latest GHC. If Liquid Haskell is to move swiftly to new versions of the compiler, these dependencies need to be updated to build with the latest GHC version as well. The fewer dependencies we have, the less work we have to do.
Specific tasks
We will reduce the dependencies of Liquid Haskell. We envision 3 ways of achieving this reduction:
- Some dependencies can be replaced with simpler ones. For instance, aeson could be replaced with json.
- Some dependencies can be dropped at the expense of changing the code style a bit (e.g. dropping optics). And finally,
- Some dependencies can be put behind a feature flag, so the dependency is necessary only if the user asks for a specific feature (e.g. using store to save internal dumps of LH instead of a slower alternative).
Deliverables
A new release of Liquid Haskell with a reduced set of transitive dependencies.
Timeline
Task | Complexity | Estimate (Weeks) |
---|---|---|
Replace, drop, and disable LH dependencies | Low | 2 |
TOTAL | 2 |
Task Group 2 - Better performance, better integration
Identify and disable expensive checks
The time that Liquid Haskell employs in verification should solely depend on the quantity of specifications that it contains, alongside those from its transitively imported modules. Enabling Liquid Haskell on a project without specifications should, in principle, not require any time overhead.
We have found, however, that enabling Liquid Haskell in some modules without any specifications takes significantly long.
Specific tasks
We will profile the time spent by Liquid Haskell and identify the problematic characteristics of the modules. The aim is to identify the checks and algorithms that are consuming the most time, and propose ways to turn off these algorithms or tune their behavior to spend less upfront time.
Deliverables
A report of the discovered issues and a release of Liquid Haskell with the implemented optimizations.
Timeline
Task | Complexity | Estimate (Weeks) |
---|---|---|
Collect modules with long verification times and dive into some of them to diagnose performance problems | Medium | 2 |
TOTAL | 2 |
Allow to verify modules when using -haddock
The -haddock flag of GHC has the effect of collecting haddock comments in interface files. The current behavior of Liquid haskell is to skip verification if the flag is enabled. In our experiments, however, reenabling verification in this case seems to work fine.
The reason to skip verification is that the -haddock flag is also used in scenarios where verification is affected, so in the past, the Liquid Haskell developers decided to skip verification for all cases. In particular, verification fails when cabal haddock is invoked.
Specific tasks
Diagnose the issue with cabal haddock further and either fix this issue or arrange to skip verification in a narrower amount of cases.
Deliverables
A fix and tests for Liquid Haskell.
Timeline
Task | Complexity | Estimate (Weeks) |
---|---|---|
Implement support for cabal haddock | Medium | 1 |
TOTAL | 1 |
Fix crashes on higher-kinded polymorphism
At least the composition operator from Control.Category, which has a more general type than the standard composition from Prelude, causes Liquid Haskell to choke. Here is the issue.
Specific tasks
We will diagnose the type errors that it produces and implement a remediation.
Deliverables
A fix of the problem plus regression tests.
Timeline
Task | Complexity | Estimate (Weeks) |
---|---|---|
Fix and test crashes on higher-kinded polymorphism | Medium | 1 |
TOTAL | 1 |
Task Group 3 - Better user experience
Allow defining instance specifications away from their definitions
Sometimes it is necessary to reason with a property that is specific to a particular instance of a typeclass. One such example could be the “IsString Text” instance, which is used by the OverloadedStrings language extension to create Text values from literal strings.
We might want to say that fromString produces a Text value of the same length as the input string. In Liquid Haskell speak:
fromString :: xs:String -> { v:Text | tlength v == len xs }
However, this property is only sensible to express for the “IsString Text” instance, not for the IsString class which abstracts over the concrete type to which strings are converted.
Liquid Haskell has a mechanism to give specifications to the methods of typeclass instances. But unfortunately, these specifications seem to be ignored if they are not written in the same module where the corresponding type class instance is defined. This is a problem in practice, because this usage pattern is common. This test of Liquid Haskell shows the feature in action.
Specific tasks
We will implement the ability to add these specifications in downstream modules, like any other assumption and specification about dependencies. In the past, discovering how to expose other specifications away from their definitions has taken a couple of days. However, the typeclass support is rather new in Liquid Haskell, and there might be other unexpected complexities.
Deliverables
A fix and tests integrated in the Liquid Haskell GitHub repository.
Timeline
Task | Complexity | Estimate (Weeks) |
---|---|---|
Allow typeclass instance specifications away from their definitions | Medium | 1 |
TOTAL | 1 |
Allow reflection of functions away from their definition
When Liquid Haskell verifies a module, sometimes it needs to unfold function calls that appear in specifications. This requires Liquid Haskell to know the defining equations of these functions.
At the moment, Liquid Haskell can be provided these equations by translating the function
from Haskell to the logic language with a reflection mechanism. Reflection, however, needs to happen at the definition site of the function. The definition cannot be reflected if it is in a dependency which is not verified with Liquid Haskell. The workaround is to fork dependencies and configure them for reflection. But this is a tedious overhead in a project with many dependencies. Imagine having to fork the base package as soon as we need to teach LH to unfold the list append (++) function. That would be impractical.
Specific tasks
We will extend Liquid Haskell with a mechanism that allows defining equations to be introduced into the logic away from the original definition site of the functions. Further discussion can be found in this ticket.
Deliverables
Implementation, documentation, and tests for reflecting functions away from their definitions.
Timeline
Task | Complexity | Estimate (Weeks) |
---|---|---|
Implement, tests, and documentation | Medium | 4 |
TOTAL | 4 |
Make data type reflection automatic
When specifications refer to data constructors or record fields of Haskell Algebraic Data Types (ADTs), these types need to be reflected in the logic. Currently, Liquid Haskell requires datatype definitions to be copied into a special comment. This is tedious, error-prone, and could be done automatically.
The need to use ADTs in specifications arises whenever we want to describe values of the corresponding types. For instance, we might want to say that a list is not empty as xs /= []
, or we might want to say that an expression is well-typed as inferType … e == Just …
.
Specific tasks
Here is a ticket discussing the issue together with an initial attempt to solve it. We will save the user the trouble of copying these definitions into special comments at least in the cases where no changes are needed to the original definitions.
Deliverables
Implementation, documentation, and tests.
Timeline
Task | Complexity | Estimate (Weeks) |
---|---|---|
Implement, tests, and documentation | Medium | 3 |
TOTAL | 3 |
Augment the environment of error messages
When Liquid Haskell reports errors, it reports part of the hypotheses in which a goal could not be proved. Although we did improve the information that Liquid Haskell provides, sometimes relevant hypotheses are still omitted in the report, which could help diagnose the failure.
Currently, one has to inspect file dumps in order to retrieve all hypotheses and understand why the goal could not be proved. This issue affects the interaction with Liquid Haskell at every proof obligation that needs to be validated, thus it is a case of a frequent activity that can be simplified.
Specific tasks
We will collect all of the relevant hypotheses in the error report without requiring the user to dive into file dumps.
Deliverables
Implementation, documentation, and tests for listing all relevant hypotheses in error messages.
Timeline
Task | Complexity | Estimate (Weeks) |
---|---|---|
Implement, test, and document | Medium | 2 |
TOTAL | 2 |
Fix errors about seemingly redundant imports
Sometimes Liquid Haskell fails to verify a module if some seemingly redundant import declaration is missing. The error reported by Liquid Haskell does not yield insights on the cause, which most likely is the consequence of a bug in Liquid Haskell. Occurrences of this bug can
be found in the stitch-lh tests in the Liquid Haskell testsuite.
In a project with multiple modules and transitive imports, this issue makes difficult to guess
the needed imports when maintaining the code.
Specific tasks
This task is about either removing the need for the import declaration or improving the error message.
Deliverables
Diagnose and fix.
Timeline
Task | Complexity | Estimate (Weeks) |
---|---|---|
Diagnose and fix | Medium | 1 |
TOTAL | 1 |
Task Group 4 - Better proof automation
Allow reflection of functions which call non-reflected functions
When Liquid Haskell verifies a module, sometimes it needs to unfold function calls that appear in specifications. This requires Liquid Haskell to know the defining equations of these functions.
Currently, Liquid Haskell can be provided these equations by translating the functions
from Haskell to the logic language with a reflection mechanism. Liquid Haskell, however, insists that all functions called by a reflected function are reflected as well. This restriction is too strong when the programmer is interested in verifying only part of a program, or when the called functions are considered primitive for the task at hand. Breaking up the verification effort is a crucial ability to control the scope of a project.
This need can arise, for instance, when we need to unfold a function that manipulates Maps from Data.Map. If we already have specifications for the operations of Maps, there would be little interest in having Liquid Haskell reflect their implementation.
Currently, the user can workaround this limitation by using a combination of assume and measure directives. But these directives weren’t designed for this use case, and the idiom comes with usability concerns that we wish to address. One of the difficulties is that the workaround needs to be applied as many times as unreflected functions are called by the function we want to reflect. Another difficulty is that it is possible to apply it wrong, and the many repetitions make it more likely.
Specific work
We will implement this proposal to reflect functions even when they call unreflected functions.
Deliverables
Implementation, documentation, and tests for reflecting functions away from their definitions.
Timeline
Task | Complexity | Estimate (Weeks) |
---|---|---|
Implement, test, and document | Medium | 4 |
TOTAL | 4 |
Better proof obligations: use sub-formulas when unfolding definitions
Liquid Haskell can unfold function definitions when it builds proof obligations for the underlying theorem prover. It does so by replacing a function call with the right hand side of one of the defining equations for the function. However, when there is more than one such equation, extra information is needed in order to decide which equation to use. This information is inferred from the context in which the function call appears. At the moment, the context includes hypotheses, but it doesn’t incorporate the logical structure surrounding the function call in the formula where it appears. For example, say we need to prove that a call like null xs yields True, with
null :: [a] -> Bool
null [] = True
null (x:xs) = False
If Liquid Haskell is given a hypothesis that says that xs is the empty list [], then it will present the result of unfolding null xs using the first equation to the theorem prover. However, if we want to prove xs=[] => null xs with no hypothesis, Liquid Haskell won’t be able to unfold null because the antecedent xs=[] is not in the context that it uses, and then it doesn’t know which equation of null to use. This limitation can be overcome, as we can usually modify the program being verified to bring the hypothesis we need into the context, but it would be better if we would not have to modify the program at all.
Formulas of the form v => something can appear, for instance, when we want to say that a function returning True offers some guarantees about the input values.
isValid :: i0:Input0 -> i1:Input1 -> { v:Bool | v => isGood i0 }
But the validation is complex enough that we don’t want to characterize the result with an equality v == isGood i0 && isGoodToo i1
.
Specific tasks
We will improve Liquid Haskell’s proof capacity, by making it use the information in sub-formulas - like the antecedent in the previous example - when deciding which equations to use when unfolding.
Deliverables
Implementation, documentation, and tests for discovering assumptions in subformulas.
Timeline
Task | Complexity | Estimate (Weeks) |
---|---|---|
Implement, test, and document | Medium | 3 |
TOTAL | 3 |
Allow composing function specifications
Today, when we give a specification to a function, we have to choose which properties to assert in it. For instance, the following specification talks about the lengths of ByteStrings.
append
:: a:ByteString
-> b:ByteString
-> { v:ByteString | bslen v == bslen a + bslen b }
But some applications of ByteString might require making considerations about the encoding instead. Perhaps,
append
:: a:ByteString
-> b:ByteString
-> { v:ByteString | isUtf8 a && isUtf8 b ⇒ isUtf8 v }
We can combine the two
append
:: a:ByteString
-> b:ByteString
-> { v:ByteString
| bslen v == bslen a + bslen b && isUtf8 a && isUtf8 b ⇒ isUtf8 v
}
but this has a couple of disadvantages. First, the specification might start becoming more involved than it needs to be for each particular use case. And second, we need to literally copy the pieces of the former specifications to create the new one.
Specific tasks
We will design and implement a mechanism to compose the specifications, and the composition will be possible to constrain only to the scenarios where it is needed.
Deliverables
Design, implementation, and tests of the composition of specifications.
Timeline
Task | Complexity | Estimate (Weeks) |
---|---|---|
Allow composing function specifications | Medium | 3 |
TOTAL | 3 |
Summary
Task Group 1 - Streamline upgrades
Currently, Liquid Haskell releases lag behind GHC releases. This issue covers both updating Liquid Haskell to support the latest GHC versions and to shorten the time it takes to release Liquid Haskell after a new release of GHC.
Task Area | Estimate (Weeks) |
---|---|
Upgrade Liquid Haskell to latest release of GHC | 4.5 |
Drop non-essential dependencies | 2 |
TOTAL | 6.5 |
Task Group 2 - Better performance, better integration
Executing Liquid Haskell should not affect performance in non-annotated modules and unexpected blockers should be eliminated.
Task Area | Estimate (Weeks) |
---|---|
Identify and disable expensive checks in Liquid Haskell | 2 |
Allow to verify modules when using -haddock | 1 |
Fix crashes on higher-kinded polymorphism | 1 |
TOTAL | 4 |
Task Group 3 - Better user experience
To be used day long pleasantly, a number of improvements to Liquid Haskell must be made.
Task Group 4 - Better proof automation
Whenever possible, manual tweaks should be avoided. Proof automation should be maximized.
Task Area | Estimate (Weeks) |
---|---|
Allow reflection of functions which call non-reflected functions | 4 |
Better proof obligations: use sub-formulas when unfolding definitions | 3 |
Allow composing function specifications | 3 |
TOTAL | 10 |
Call for sponsors
The maintainers of Liquid Haskell are a small team and the current Liquid Haskell version is the result of their hard work. To get Liquid Haskell usable in a variety of production environments, by Haskell programmers of all levels, some improvements are needed.
As the roadmap above exemplifies, the set of tasks is not overwhelming: with a concerted funding effort, we will be able to make Liquid Haskell a user-friendly powerful tool, augmenting our capacity to prove that Haskell programs meet the highest quality standards!
If you or your organization:
- are impacted by these issues,
- have issues related to Liquid Haskell usability or performance,
- would like to prioritize or accelerate certain work,
Please, consider supporting this effort. Reach out to facundo.dominguez@tweag.io (@facundodomiguez on GitHub) to get involved.