[ANN] E-graphs and equality saturation: hegg 0.1

I’m happy to announce the first release of the library hegg-0.1 on hackage: hegg: Fast equality saturation in Haskell !

hegg stands for haskell e-graphs good, and it’s a library featuring e-graphs (a data structure that efficiently represents a congruence relation over many expressions) and a high level API for equality saturation (an optimization/term-rewriting technique)

It is a pure haskell implementation found on two recent awesome papers: egg: Fast and Extensible Equality Saturation (2020) and Relational E-matching (2021).

I’d love to see what you might come up with leveraging e-graphs and equality saturation (there’s a world of applications – check out the papers and other resources), so do let me know about your efforts and how the library could be improved.

I’ve spent a long while optimizing and making the e-graphs and equality saturation considerably performant. So we are already pretty fast! (according to my symbolic rewriting tests). I’d love to see more involved use cases, because I still see many opportunities for improving performance, and would love to take them when justified by bigger numbers and applications!

I’ll be happy to clarify anything,

Thank you,

Rodrigo

11 Likes

Super cool! The immediate application I’d be interested in is a symbolic maths library like sympy. Do you think that’s a sensible application? (I saw the example of something in this vein, but I’m imagining a more complex system)

2 Likes

Yes, absolutely!

I’ve been entertaining that exact idea for some time now, but have refrained myself from picking it up because of lack of time.

There’s a similar project in Julia (https://github.com/JuliaSymbolics/Metatheory.jl) that leverages e-graphs for a symbolic rewriting library.

I think sympy doesn’t use e-graphs under the hood, but rather a classical rewriting backend, probably with quite specific optimizations.

My attempts so far at it have been very successful modulo user-facing/api considerations. The projects testsuite is in fact a symbolic maths library and without careful considerations (e.g. a very simplistic cost function) it already does surprisingly well – to the point that there’s one simplification it does that (given the rewrite rules I gave it) I can’t even understand[1]!

In summary: Yes! I believe we can leverage e-graphs to have a great symbolic library and would gladly contribute and help anyone who might want to pick that up

[1]: The expression ∫ ln(x) dx is simplified to x*(ln x + 1), given these rewrite rules

1 Like

Are there benchmarks comparing hegg and egg? IIRC egg was written to be very performant. Do you think hegg’s preformance will be similar?

2 Likes

Awesome! In theory I might want to pick that up, although not sure exactly when.

What I’d really love is if the rewrite rules could be more or less exactly the laws of the algebraic objects in question (e.g. simplify \math{x^{-1}x} using the group inverse law, etc), and that OverloadedStrings works well, so that I can write e.g. x +1 -1 + x :: SymbolicExpression and have that simplify to 2*x just using the field rules. I’ll check out the testsuite and examples to learn more

2 Likes

This is a good question.

I haven’t performed rigorous benchmarks comparing egg the high performance rust implementation with hegg, though that would be interesting data.

For the current symbolic math test suit, which considerably overlaps with eggs one, I do have some numbers.

hegg is not as performant as egg at the moment, which is to be expected due to rust’s finer control over memory and mutable data structures while hegg makes solely use of immutable data structures. There is still, however, lots of room for improvement on our end, e.g all the unexplored parallelism. Lastly, egg’s e-matching is still based on an abstract virtual machine while hegg has been implemented from the ground up following relational e-matching.

That said, the most of the test suite’s simpler individual tests take under 0.01 seconds in both. A more comparable example is [1] in my comment above — while egg takes 0.02 seconds, we take 0.10.

TL;DR egg is faster, something like 5x so in a non-trivial example; however, I really believe there’s still unexplored performance improvements with which we could get much closer to egg

Edit: do note that this is the first release, provided there’s interest and examples pushing performance issues further there’s also more motivation on my side to pursue further speedups; and on the other hand that egg is a well established project with over 500 commits :slight_smile:

2 Likes

[egg author here]

Very nice! You even beat us to merging relational e-matching :slight_smile: Re: performance, our current understanding is that “classic” e-matching will be tough to beat for small, simple patterns (which are most patterns!), but I think with some work you can get relational e-matching to parity there.

Feel free to join the e-graphs zulip to discuss!

3 Likes