I’m very excited to announce the first official release of the FlatCV Haskell bindings!
This has been a very long journey that started 9 years ago with a simple question:
Can I build a computer vision app that extracts documents from photos?
My first iteration was a CLI app called Perspectra, implemented with Python and scikit-image. However, I quickly realized that I absolutely do not like Python and that I needed a GUI to fix incorrectly detected document boundaries, as the CV pipeline would never get it 100% right.
And how do you build a desktop app with a GUI? Obviously with Haskell.
(The “desktop app” part is a whole other story I’ll save for another post, but I am making good progress on that front as well: see Perspec. I’ll focus on the CV part for the rest of this post.)
As I didn’t want to use Python any longer, my next instinct was to use ImageMagick, as I had some experience with its features and capabilities. The existing bindings were rather lacking, so I opted to simply call magick as an external process. While this mostly worked, it was always a pain to get it to install and link correctly across platforms, and the performance was surprisingly bad for larger images.
Another obvious choice might be OpenCV, but I had some bad memories of using it at university (maybe it was just the C++ context…), and the Haskell bindings looked rather painful.
So, my next experiment was using Hip, and with the help of @lehins himself and @HanStolpo at ZuriHac, we were able to make it work! (Thanks again!)
However, it was still missing some features that I wanted, like binarization with Otsu’s Method. While it would certainly be possible to implement this in Hip, I (for once) felt that Haskell’s abstractions didn’t really help with the task at hand and only complicated things unnecessarily. A for loop in C, by comparison, is conceptually very simple and already as fast as the Haskell code. Luckily, C is a first-class citizen in Haskell and it’s very easy to bundle some C code and call it via Haskell’s FFI.
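To give a sense of what "a for loop in C" means here, below is a minimal sketch of Otsu's Method for an 8-bit grayscale image. It is only an illustration and not FlatCV's actual code: the function name `otsu_threshold` and the flat `uint8_t` buffer layout are assumptions made for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper for illustration, not part of FlatCV's API.
 * Returns the threshold t that maximizes the between-class variance
 * when pixels <= t are treated as background and pixels > t as foreground. */
uint8_t otsu_threshold(const uint8_t *gray, size_t n) {
  /* Step 1: histogram of all gray values. */
  size_t hist[256] = {0};
  for (size_t i = 0; i < n; i++) {
    hist[gray[i]]++;
  }

  /* Step 2: sum of all intensities, needed for the foreground mean. */
  double sum_all = 0.0;
  for (int t = 0; t < 256; t++) {
    sum_all += (double)t * (double)hist[t];
  }

  /* Step 3: try every threshold and keep the best one. */
  double sum_bg = 0.0;   /* intensity sum of the background class */
  size_t w_bg = 0;       /* pixel count of the background class */
  double best_var = -1.0;
  uint8_t best_t = 0;

  for (int t = 0; t < 256; t++) {
    w_bg += hist[t];
    if (w_bg == 0) continue;  /* no background pixels yet */

    size_t w_fg = n - w_bg;
    if (w_fg == 0) break;     /* no foreground pixels left */

    sum_bg += (double)t * (double)hist[t];
    double mean_bg = sum_bg / (double)w_bg;
    double mean_fg = (sum_all - sum_bg) / (double)w_fg;
    double diff = mean_bg - mean_fg;

    /* Between-class variance, up to a constant factor of 1 / n^2. */
    double var = (double)w_bg * (double)w_fg * diff * diff;
    if (var > best_var) {
      best_var = var;
      best_t = (uint8_t)t;
    }
  }
  return best_t;
}
```

Binarizing the image is then one more loop that compares each pixel against the returned threshold. On the Haskell side, a function like this could be exposed through a `foreign import ccall` declaration.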
Unfortunately, there didn’t seem to be a straightforward C library that I could hook up to, so I started to work on FlatCV. I’m quite happy with the UX of using C for the image manipulation algorithms, and I was able to quickly build a fully functioning version with the necessary Haskell bindings. Just a few days ago, I released version 0.3.0, and by now it has most of the basic operations you would expect from an image manipulation library. I also ported some of the higher-level CV operations, like adaptive binarization and corner detection, that I first implemented in Perspectra.
There are still plenty of opportunities to improve the performance of FlatCV: SIMD, GPU usage, streamed processing, etc. However, as FlatCV isn’t used in a real-time context (e.g., at 60 fps), the performance is already more than sufficient.
Hope you like it, and I’d be interested to know if you have any use cases for FlatCV!