On the topic of Unicode table compression, there is New library: shamochu “Shuffle and merge overlapping chunks” lossless compression and the corresponding library unicode-data: Access Unicode Character Database (UCD).
They claim 90% savings, while @augustss claims 99.7% savings with RLE and LZ. Shamochu, however, also claims to keep O(1) indexing into the compressed data.
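For context, plain RLE trades random access for density: indexing the encoded form means scanning runs. A minimal sketch of the idea (not code from either library):

```haskell
import Data.List (group)

-- Run-length encode: collapse each run of equal values into a (count, value) pair.
-- e.g. rle "aaab" == [(3,'a'),(1,'b')]
rle :: Eq a => [a] -> [(Int, a)]
rle = map (\run -> (length run, head run)) . group

-- Decoding inverts it. Note that indexing into the encoded form requires
-- scanning the runs, i.e. it is O(number of runs), not O(1).
unrle :: [(Int, a)] -> [a]
unrle = concatMap (uncurry replicate)
```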
I love the idea and have several use cases for MicroHs, but in practice I’m blocked by two problems:

- `containers` doesn’t build (the latest version requires `template-haskell` and I don’t know how to select a different version).
- The FFI is too limited. As far as I can see there is currently no way to call MicroHs from another language.
I don’t index into the compressed data. The first time you use a `Data.Char` function outside the ASCII range, the compressed data is inflated into a 1.1 MB bytestring, which is then reused on subsequent accesses.
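That inflate-once-then-share behaviour is just a lazy top-level CAF. A minimal sketch of the pattern, with `compressed` and `inflate` as hypothetical placeholders (not the actual names in MicroHs):

```haskell
import qualified Data.ByteString as BS
import Data.Word (Word8)

-- Hypothetical placeholders standing in for the embedded compressed
-- table and its decoder.
compressed :: BS.ByteString
compressed = BS.pack [0 .. 255]

inflate :: BS.ByteString -> BS.ByteString
inflate = id

-- A top-level CAF: 'inflate' runs the first time 'table' is demanded,
-- and the inflated bytestring is shared by all subsequent lookups.
table :: BS.ByteString
table = inflate compressed

lookupProp :: Int -> Word8
lookupProp i = table `BS.index` i
```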
So you could use `shamochu` to instead inflate it to a ~100 KB transparently compressed representation in memory.
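The trick for keeping O(1) access, as I understand it: split the table into fixed-size chunks, merge overlapping chunks into one blob, and keep a small offsets table from chunk index to blob position. A hedged sketch with made-up names, chunk size, and data (not shamochu’s actual API):

```haskell
import Data.Array (Array, listArray, (!))
import qualified Data.ByteString as BS
import Data.Bits ((.&.), shiftR)
import Data.Word (Word8)

-- Made-up chunk size; a real table picks whatever compresses best.
chunkSize :: Int
chunkSize = 32

-- Dummy offsets: where each logical chunk starts inside the merged blob.
-- Chunk 1 starts at 16, i.e. it overlaps the tail of chunk 0.
offsets :: Array Int Int
offsets = listArray (0, 1) [0, 16]

-- The merged data blob (dummy contents).
merged :: BS.ByteString
merged = BS.pack [0 .. 63]

-- O(1) lookup: one read in the offsets table, one read in the blob.
lookupByte :: Int -> Word8
lookupByte i =
  let chunk = i `shiftR` 5          -- i `div` chunkSize (chunkSize = 32)
      off   = i .&. (chunkSize - 1) -- i `mod` chunkSize
  in merged `BS.index` ((offsets ! chunk) + off)
```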
The `containers` package builds, but you need to use the GitHub version, which has the changes for MicroHs.
I’ve not bothered with `foreign export` yet. If you need it, create a bug report, and maybe I’ll add it.
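For anyone following along, this is the GHC-style declaration under discussion; a minimal example of what `foreign export` looks like there (not yet supported by MicroHs, per the above):

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}
module Adder where

import Foreign.C.Types (CInt)

-- Expose a Haskell function under the C symbol "hs_add", so C code
-- can declare and call it as: int hs_add(int, int);
foreign export ccall hs_add :: CInt -> CInt -> CInt

hs_add :: CInt -> CInt -> CInt
hs_add x y = x + y
```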
It’s a real shame that something as conceptually pure and fundamental as `containers` is using TH. Surely whatever code is being automated there could be generated and checked in?
Very many packages use TH just to derive a `Lift` instance. To make porting packages easier, MicroHs pretends it can do that, but just ignores it.
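A typical instance of that pattern, and often the only TH a package needs (the `Color` type here is just an illustration):

```haskell
{-# LANGUAGE DeriveLift #-}
module Color where

import Language.Haskell.TH.Syntax (Lift)

-- A derived Lift instance lets values of this type be spliced
-- into Template Haskell quotations.
data Color = Red | Green | Blue
  deriving (Show, Lift)
```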
Just a curiosity: the source code for `containers` is bigger than the source code for MicroHs.
Dear Lennart
A student of mine (Olivier Lischer) has been using MicroHs to run Haskell on a Raspberry Pi Pico and has a demonstration of a line-following robot built with it.
You are already aware of this, but I thought I should post it anyway in case anyone else on the thread finds it interesting.
Best regards
Farhad
The latest MicroHs now supports Unicode in the source.
I’m not happy about the change. It added about 50kB to the size of the distributed “binary” and 7% to the time to recompile the compiler. These are unacceptable increases for a feature that is used very little. But I’ve pushed the change while I figure out how to get rid of the excess.
I’d be open to using MicroHs more if it implemented some of the GHC green-threading model and concurrency primitives (`forkIO`, `MVar`); I don’t need everything (like `STM`, etc.). One thing I like about MicroHs is that it can be made retargetable with `zig cc` (a drop-in replacement for `clang`/`gcc`). I think this is a major advantage it has over GHC.
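For reference, the API shape being asked for, as it exists in GHC’s `Control.Concurrent`:

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)

main :: IO ()
main = do
  done <- newEmptyMVar
  _ <- forkIO $ do
    putStrLn "hello from a green thread"
    putMVar done ()   -- signal the main thread
  takeMVar done       -- block until the forked thread is done
```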
I have an old branch that has `forkIO` etc. I should pick it up and merge it. More work is needed on how this would interact with the C world.
That would be very interesting. Would the addition of `MVar` and `IORef` be too much to ask as well?
There’s already `IORef`. Of course there will be `MVar` when I add concurrency.
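`IORef` is the plain mutable cell from `Data.IORef`; for completeness, a tiny example:

```haskell
import Data.IORef (newIORef, readIORef, modifyIORef')

bump :: IO Int
bump = do
  ref <- newIORef (0 :: Int)  -- allocate a mutable cell holding 0
  modifyIORef' ref (+ 1)      -- strict in-place update
  readIORef ref               -- yields 1
```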
containers-0.8 is now released with MicroHs support: Changelog for containers-0.8 | Hackage
I see that `containers-0.8` is not marked as `tested-with: MicroHS`; is this an oversight? Also, I just tried searching Hackage with the query `(tested-with:MicroHS)`, and nothing showed up. What’s the correct query, if any?
`containers-0.8` builds with MicroHs, but it hasn’t been tested with MicroHs, in part because the test libraries and their dependencies would themselves need to run on MicroHs (containers #1120).
Hackage search is only available for the fields documented on the page. `tested-with` is a free-form text field. We could add both free-form cabal-file search and searching based on a hypothetical new structured `tested-with` successor field. A structured field would need to be added to Cabal first, of course. I can help review hackage-server PRs.
The `tested-with` field is not free form: Distribution.FieldGrammar.Newtypes