Clashing interests: Haskell and Machine Learning

Hi! I’m currently doing my fourth year of CS and am specializing in machine learning. In my spare time I’d like to reconcile the two interests, but I don’t see any obvious way to do so. Do you have any suggestions for a project that integrates both?

2 Likes

So many!

If you’re not already familiar, take a glance at GitHub - hasktorch/hasktorch: Tensors and neural networks in Haskell. You may also be interested in the work of Myrtle and Clash, not to mention the ad library.

Personally, I think there’s an extremely interesting intersection between type safety and the “combining” of ML quantities. I can’t find the paper at the moment, but a while back there was work on using “unit” information in tensors to build networks “safely”, so that you can’t accidentally multiply, say, integer vectors with small numeric vectors. In general, building rich type safety into the vector operations of ML networks is very interesting, I think, and potentially very fruitful.
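To give a flavour of what I mean (a toy sketch of my own, not the encoding from the paper I’m thinking of): tag each vector with a phantom “unit”, so the compiler rejects accidental mixing and only specific operations are allowed to change the unit.

```haskell
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE KindSignatures #-}

module UnitSketch where

-- Phantom tags describing what a vector "measures".
data Unit = Counts | Logits | Probabilities

-- A vector that carries its unit in its type; the payload is just a list here.
newtype Vec (u :: Unit) = Vec [Double]
  deriving Show

-- Element-wise operations may only combine vectors with the same unit.
addV :: Vec u -> Vec u -> Vec u
addV (Vec xs) (Vec ys) = Vec (zipWith (+) xs ys)

-- Some transformations deliberately change the unit.
softmax :: Vec 'Logits -> Vec 'Probabilities
softmax (Vec xs) = Vec (map (/ total) exps)
  where
    exps  = map exp xs
    total = sum exps

-- addV (Vec [1,2] :: Vec 'Counts) (Vec [3,4] :: Vec 'Logits)  -- rejected at compile time
```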

I guess the question is - what parts of each interest you the most? Maybe if you elaborate we can help you find something that plays in both spaces :slight_smile:

2 Likes

I was vaguely familiar with hasktorch, but I was (maybe unfortunately) exposed to some of Conal Elliott’s teachings, such as this presentation on remodelling machine learning for FP, which made me a bit disheartened about the current ecosystem.

Of course, I’m not bright enough to criticize any efforts in Haskell, so I’ll definitely try to pick up hasktorch. As for what I’ll specifically do, I’m not sure yet! I am super interested in poker, though (my poker evaluator project), so I might do something in that area!

If possible, I’d try to use Conal’s work more as inspiration than as discouragement. I’ve been interested in it too, but I’ve found it hard to find my feet there.

For me, I’ve always found it useful, when stuck, to do something practical, and then go back and revisit the theory when I have a more solid place to stand. Probably different approaches work for other people :slight_smile:

3 Likes

So it’s definitely possible, but you need to gauge your expectations appropriately.

Firstly, I do ML research in Haskell! Here’s a published example: Modelling the neural code in large populations of correlated neurons | eLife

I’m in the middle of a big refactor of my ML library called Goal (GitHub - alex404/goal: The Geometric OptimizAtion Libraries). I’m working on another ML article (hoping to submit by the end of the year), and the library will provide implementations of the theory in the article.

So, on the positive side: yes, it’s possible! My interests in ML are very theoretical, and using the type system to embody the mathematics of what I’m doing has been extremely helpful. I even implemented backpropagation using laziness, by recursing on the forward and backward passes simultaneously. So fun! So elegant!
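To give the flavour in a stripped-down form (a scalar toy of my own here, not how Goal actually structures things): one recursion produces the forward output, the gradient with respect to the input, and the per-layer gradients, while the error signal fed in at the top lazily depends on the output that the very same call produces.

```haskell
-- A "network" of scalar weights; each layer just multiplies by its weight.
type Network = [Double]

-- Forward and backward passes in a single recursion.
propagate :: Network -> Double -> Double -> (Double, Double, [Double])
propagate []       x errOut = (x, errOut, [])
propagate (w : ws) x errOut =
  let y                = w * x                   -- forward step
      (out, dy, grads) = propagate ws y errOut   -- recurse on the remaining layers
      dx               = dy * w                  -- backward step through this layer
      dw               = dy * x                  -- gradient for this layer's weight
   in (out, dx, dw : grads)

-- Tie the knot: the error signal is defined in terms of the output of the
-- same propagate call; laziness makes this terminate because the forward
-- values never depend on the backward ones.
gradients :: Network -> Double -> Double -> [Double]
gradients net x target = grads
  where
    (out, _, grads) = propagate net x errOut
    errOut          = out - target               -- derivative of the half-squared error
```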

On the negative side, if what you want is SOTA performance and maximum engineering, Haskell will likely be in the way. There are numerous good solutions for GPU computing in Haskell, but at the end of the day, the simplicity and vast ecosystem of Python make it really hard to compete with.

Even languages like Julia, which are practically designed for this, have a hard time taking mindshare away from Python. This is exacerbated by advances in LLMs: you can basically tell ChatGPT to throw together a neural network and plots to an arbitrary specification, and you’ll likely have the implementation you seek within a day, compared to the week it would take to develop everything from scratch.

These days I’ve given up on trying to achieve maximum performance + GPUs in Haskell (though I’m pretty sure my implementations are still really well optimized - I still care). I have another article in the pipeline where at a certain point I’ll have to scale up performance, and my plan at that point will just be to throw together the specific algorithm in Python, with my general Haskell framework as a reference.

14 Likes

Super cool of you to chime in! This looks awesome, and I’m intrigued by the algorithm for backpropagation. I’ll definitely take a closer look!

Btw @alex404 's GOAL library is also a great way to learn about information geometry. I tried reading Information Geometry and Its Applications | SpringerLink and found that by working through the typed definitions in the GOAL library, it was a lot easier to understand what was going on. Lots of fun!

2 Likes

Thanks for your kind words @reuben !

And as for lazy backprop, here’s a blog post I wrote about it: Sacha Sokoloski - Goal Tutorials

It’s slightly out of date with my refactoring of the libraries, but the fundamental ideas are the same.

1 Like

A couple of years back I did this experiment, trying to move tensor shape checking to the TH code-generation stage: https://openreview.net/pdf?id=5TCfWXk2waG . It works, but it ended up being neither elegant (there are some limitations in typed TH that prevent certain types from being expressible) nor efficient (the generated programs end up being as large as the data).

Typed TH (and “staged DSLs”) aside, I think the code-generation approach has merit, since it’s so flexible for generating programs with optimizations the user didn’t even have to declare.
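For context, here’s roughly what “tensor shape checking” looks like in its plain type-level form (just a minimal sketch for illustration, not the typed-TH machinery from the paper): the shapes live directly in the types, so a mismatched contraction is a compile-time error.

```haskell
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE KindSignatures #-}

module ShapeSketch where

import Data.List (transpose)
import GHC.TypeNats (Nat)

-- A matrix whose shape lives in its type; the payload is a plain list of rows
-- (no length enforcement in this toy version).
newtype Matrix (m :: Nat) (n :: Nat) = Matrix [[Double]]
  deriving Show

-- Multiplication only type-checks when the inner dimensions agree.
matMul :: Matrix m k -> Matrix k n -> Matrix m n
matMul (Matrix a) (Matrix b) =
  Matrix [ [ sum (zipWith (*) row col) | col <- transpose b ] | row <- a ]

-- A (2x3) times a (3x1) is fine; swapping the arguments is a type error.
ok :: Matrix 2 1
ok = matMul (Matrix [[1,2,3],[4,5,6]] :: Matrix 2 3)
            (Matrix [[1],[0],[2]]     :: Matrix 3 1)
```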