Formally verify Haskell code

I have read that one of the advantages of Haskell is that it makes it easier to formally verify code. Also, part of the standard library seems to have been formally verified.
Suppose I wrote a mergesort in Haskell. How can I formally verify that such functions actually do what I expect them to do? Also, do you have any references to how the formal verification was done in the standard library?
I would like to write a library (even a very simple one) and verify it formally.

6 Likes

LiquidHaskell is a good starting point but there is also agda2hs.
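For a taste of what LiquidHaskell looks like, here is a minimal sketch (the function name is made up for illustration): the specification lives in a {-@ ... @-} comment and is checked automatically by an SMT solver.

{-@ safeDiv :: Int -> {d:Int | d /= 0} -> Int @-}
safeDiv :: Int -> Int -> Int
safeDiv n d = n `div` d

-- LiquidHaskell accepts e.g. `safeDiv 10 2` but rejects `safeDiv 10 0`,
-- because the refinement {d:Int | d /= 0} cannot be established there.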

5 Likes

There’s also good old manual formal verification. The Haskell Unfolder episode about laws has some examples of it around 7:40 and 17:30.
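As a small taste of that style (not the exact example from the episode), here is a pen-and-paper proof of the first functor law for Maybe, written as equational reasoning directly on the defining equations of fmap:

-- fmap for Maybe:
--   fmap _ Nothing  = Nothing
--   fmap f (Just x) = Just (f x)
--
-- Claim: fmap id = id, by case analysis on the argument.
--
--   fmap id Nothing  = Nothing        -- first equation of fmap
--                    = id Nothing     -- definition of id
--
--   fmap id (Just x) = Just (id x)    -- second equation of fmap
--                    = Just x         -- definition of id
--                    = id (Just x)    -- definition of id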

4 Likes

I am very interested in how others think about this too, since I say that to people a lot… but admittedly only with hand-waving arguments such as good property-based testing with QuickCheck and pen-and-paper proofs.
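For what it’s worth, a QuickCheck specification for the mergesort from the original post could look roughly like this (a sketch; msort is assumed to be the function under test, and its body below is only a stand-in):

import Data.List (sort)
import Test.QuickCheck

-- Stand-in for the mergesort you actually want to test.
msort :: Ord a => [a] -> [a]
msort = sort  -- replace with your own implementation

-- The output agrees with the library sort (sorted, same elements).
prop_modelsSort :: [Int] -> Bool
prop_modelsSort xs = msort xs == sort xs

-- The output has the same length as the input.
prop_preservesLength :: [Int] -> Bool
prop_preservesLength xs = length (msort xs) == length xs

main :: IO ()
main = do
  quickCheck prop_modelsSort
  quickCheck prop_preservesLength

Of course, this only checks randomly generated inputs; the pen-and-paper proofs are what would turn such properties into theorems.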

Liquid Haskell seems like a good one, but I wonder how practical it is for large projects at the moment.

The most practical way (for a quite, well, ‘special’ definition of “practical”) is to use Coq to generate Haskell code from your formal proofs.
The site doesn’t work right now, but it should be this: https://coq.inria.fr/refman/addendum/extraction.html

What is meant by “easy to formally verify” is that pure (side effect free) Haskell functions are easier to verify.

If you really want to start learning how to use proof assistants, the best way is AFAIK still to work through Software Foundations, starting with Logical Foundations. Videos: Welcome to Software Foundations in Coq - YouTube

3 Likes

Searching online, one notices that in the field of formal verification, the tools you suggest are not often used. Why?
Instead, the Ada language is widely used in applications where formal verification is important. Why?

I’d say most formal verification material online is focused on deductive program verification: since the standard in industry is imperative, OOP code, you need different approaches to formally verify those kinds of programs. The answers you got in this thread focused on inductive program verification, which is significantly different.

3 Likes

This is because one has to distinguish two very different fields of verification:

  • In the first area of formal verification, an overflowing integer (to take an example) can cause massive damage, kill people or lead to huge financial losses. Think, for example, of the controls in an airplane or car, a digital train traffic control system, low-level cryptographic primitives, the verification of the arithmetic unit of a processor, etc.
  • In the second area of formal verification we are interested in the correctness of algorithms in a more high-level sense. Suppose you have come up with a new algorithm on graphs which computes shortest paths, and now want to show that this algorithm works correctly. But nobody will die if your algorithm accidentally blows the stack because it doesn’t use constant space.

The big difference between the two areas is that in the first one it is essential that you verify the entire toolchain, compilers and runtime systems included. For this reason, the only languages used in that field are very low-level ones: assembly, C and the SPARK subset of Ada. You cannot use a language like Haskell here, because in order to guarantee what those scenarios need you would also have to verify the entire GHC compiler and the GHC runtime system, which is currently impossible and would also go directly against most of Haskell’s other goals. (You would have to throw out almost all optimizations, throw out almost everything which is not Haskell98, massively simplify the RTS and throw out all performance optimizations there too, etc.)
In contrast, for a language like C there is a verified optimizing compiler (called CompCert) which guarantees that it does not introduce bugs during compilation. So what you do in that field is write your code in C (or Ada), annotate it with massive amounts of pre- and post-conditions for every function, generate proof obligations from those annotations, and then discharge those obligations using either automatic tools or proof assistants like Coq or Isabelle. Afterwards you can compile the now-verified C code with CompCert and have a program that is guaranteed not to have integer overflows and, for example, guaranteed to run in constant space.

The second kind of formal verification is what people writing code in Idris, Dependent Haskell, Liquid Haskell, Lean etc. are interested in. And the argument you posted, that it should be easier to verify code in a functional language, is meant to apply to only this area of verification.
The argument is that it is easier to verify purely functional code because you can use equational reasoning and inductive proofs much more easily. My impression is that the usual contrast between “verifying imperative programs” and “verifying functional programs” misses an important distinction: the problem is not imperative code per se, but specifically pointers and aliasing. It is equally difficult to prove that a while-loop is correct and to prove that a recursive function is correct: in both cases you have to identify an invariant, called the “loop invariant” and the “induction formula” respectively. Finding this invariant is the essential “art” in formal verification, the part that cannot be done completely automatically. What really differentiates verifying “imperative” code from “functional” code is when pointers come into play and you have to reason about whether two pointers can alias or not. (Formally, this is the difference between using plain Hoare logic and using separation logic.) Since purely functional code doesn’t have this problem, you have eliminated one of the essential difficulties.
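To make the induction side concrete, here is the kind of small, purely equational inductive proof this enables, using only the defining equations of (++) and length:

-- Definitions used:
--   []     ++ ys = ys
--   (x:xs) ++ ys = x : (xs ++ ys)
--   length []     = 0
--   length (_:xs) = 1 + length xs
--
-- Claim: length (xs ++ ys) = length xs + length ys, by induction on xs.
--
-- Base case, xs = []:
--   length ([] ++ ys) = length ys              -- definition of (++)
--                     = 0 + length ys
--                     = length [] + length ys  -- definition of length
--
-- Inductive case, xs = x:xs', assuming the claim for xs':
--   length ((x:xs') ++ ys)
--     = length (x : (xs' ++ ys))       -- definition of (++)
--     = 1 + length (xs' ++ ys)         -- definition of length
--     = 1 + (length xs' + length ys)   -- induction hypothesis
--     = length (x:xs') + length ys     -- definition of length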
This also explains why some Rust programmers consider Rust to be halfway to purely functional programming: Rust’s rules about pointers guarantee that if you have two mutable references, then they cannot alias. This makes reasoning much, much simpler.

An interesting question is whether we will be able to use high-level languages for formal verification in the first sense in the future. I think something like https://cakeml.org is very interesting. CakeML is a formally verified compiler for Standard ML, so you can write high-level functional code and still have strong guarantees about the compilation to machine code. But projects like this are still at the academic stage, so you will have to wait some time before they become practical in the real world.

18 Likes

Another way to generate verified Haskell: GitHub - agda/agda2hs: Compiling Agda code to readable Haskell

2 Likes

What are the main differences between Liquid Haskell and tools such as Coq, Agda, Idris, etc.?
What can and cannot be done with these tools?
Why should one of these be preferred over the others?
Now that I am starting to learn Liquid Haskell, what limitations might I encounter compared to other tools? And what are Liquid Haskell’s strengths?

If I may also piggyback a Liquid Haskell question here:

If I am creating an eDSL in Haskell, is it reasonable to expect that I could leverage Liquid Haskell to offer formal verification for applications built on the eDSL?

The main difference between full dependent types (Coq, Agda, Idris) and Liquid Haskell is that Liquid Haskell uses an automatic theorem prover, so you don’t have to write proofs yourself. It has some limitations in which kinds of things it can prove, but I think it suffices for most programming-related properties. Their website has some great examples on the front page:

https://ucsd-progsys.github.io/liquidhaskell

One disadvantage of Liquid Haskell is that it can be hard to tell what went wrong when it fails to prove something. With dependent types you can much more easily step through the proof and see what step goes wrong. I think the error messages are one area where Liquid Haskell still has lots of room to improve.

3 Likes

Can you give me an example?

In particular I was referring to the fact that Liquid Haskell is limited to quantifier-free predicates. Niki Vazou writes the following in her thesis Liquid Haskell: Haskell as a Theorem Prover:

Unfortunately, the automatic verification offered by refinements has come at a price. To ensure decidable checking with SMT solvers, the refinements are quantifier-free predicates drawn from a decidable logic. This significantly limits expressiveness by precluding specifications that enable abstraction over the refinements (i.e. invariants). For example, consider the following higher-order for-loop where set i x v returns the vector v updated at index i with the value x.

for :: Int → Int → a → (Int → a → a) → a
for lo hi x body = loop lo x where
  loop i x
    | i < hi = loop (i+1) (body i x)
    | otherwise = x

initUpto :: Vec a → a → Int → Vec a
initUpto a x n = for 0 n a (\i → set i x)

We would like to verify that initUpto returns a vector whose first n elements are equal to x. In a first-order setting, we could achieve the above with a loop invariant that asserted that at the ith iteration, the first i elements of the vector were already initialized to x. However, in our higher-order setting we require a means of abstracting over possible invariants, each of which can depend on the iteration index i. Higher-order logics like Coq and Agda permit such quantification over invariants. Alas, validity in such logics is well outside the realm of decidability, precluding automatic verification.

But the thesis does go on to address this problem using abstract refinements. I don’t know exactly how well that works. But there seems to be documentation about it here: Abstracting Over Refinements - LiquidHaskell Docs

7 Likes

In a nutshell, Liquid Haskell converts a logic expression into something an SMT solver can read and uses that to check whether the expression holds. So it isn’t quite a theorem prover like Agda or Coq, which use other approaches (via MLTT or CoC). So the limitations of Liquid Haskell have to do with the limitations of SMT solvers.

3 Likes

Suppose I wrote a mergesort in Haskell. How can I formally verify that such functions actually do what I expect them to do?

For verifying the correctness of, say, an implementation of mergesort, there are two ways of doing it (either way you start by writing down a specification; a sketch of one follows the list):

  • With pen & paper. This is quite feasible; Graham Hutton’s book gives a nice introduction. However, the drawback is that the resulting proof is not checked by a computer.
  • With a proof assistant. Various options exist, but it’s still at the edge of the possible. I am beginning to think that an alternate history, where people had implemented the “Glasgow Proof Assistant for Haskell” in lockstep with the “Glasgow Haskell Compiler”, would have been awesome. Sadly, that didn’t happen.
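Whichever route you take, the first step is the same: write down what “does what I expect” means for mergesort. A minimal sketch of such a specification in plain Haskell (the names isSorted, isPermutationOf and msort are made up here for illustration):

import Data.List (sort)

-- Every element is <= its successor.
isSorted :: Ord a => [a] -> Bool
isSorted xs = and (zipWith (<=) xs (drop 1 xs))

-- Same elements with the same multiplicities (checked here by sorting both;
-- in a proof assistant you would use a permutation or multiset relation).
isPermutationOf :: Ord a => [a] -> [a] -> Bool
isPermutationOf xs ys = sort xs == sort ys

-- Correctness of a mergesort `msort` then means: for all xs,
--   isSorted (msort xs) && (msort xs `isPermutationOf` xs)
-- A pen-and-paper or proof-assistant proof establishes this for every input,
-- rather than checking it on randomly generated ones.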

I’m currently using agda2hs to verify parts of a blockchain wallet:

GitHub - cardano-foundation/cardano-wallet-agda: Formally verified wallet implementation

I fully intend this code to run in production.

I see agda2hs as “The Missing Proof Assistant for Haskell”. One reason why the alternate history didn’t happen is most likely that it’s not easy to implement a proof assistant for Haskell — it will start to look like agda2hs very soon (but maybe there are some shortcuts…).

2 Likes

I’m surprised nobody has mentioned sbv yet. It’s a library that speaks SMT-LIB, and you can use it to write almost 1:1 Haskell code and verify it in a few ways (bounded model checking, weakest preconditions, etc.): sbv: SMT Based Verification: Symbolic Haskell theorem prover using SMT solving.

Incidentally, it does already contain a symbolic implementation of mergesort: Documentation.SBV.Examples.BitPrecise.MergeSort
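For a first taste that is much smaller than that example, here is a minimal sketch that asks the SMT solver to prove a property of all 8-bit words:

import Data.SBV

-- Doubling equals self-addition for every 8-bit word.
prop :: SWord8 -> SBool
prop x = x + x .== 2 * x

main :: IO ()
main = prove prop >>= print  -- prints Q.E.D. if the solver proves it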

2 Likes