Executable size

I’m evaluating Haskell for writing simple CLI tools. I have (almost) no experience with Haskell in production.

As for the code, Haskell just rocks. Well, I may be biased, since I love FP, and Haskell syntax is very clean (compared to OCaml or Go). The compiled executable, though, is very big.

I wrote a simple app to download movies from a streaming platform (using yt-dlp). All it does is: (1) check the available audio and video formats, (2) ask the user which formats should be downloaded, (3) download the files, (4) merge them into the final mp4 using ffmpeg. Nothing complex.

The compiled binary, after stripping (and using GHC’s -split-sections option), is 9.4MB. I have written the same app in both Go and OCaml, achieving binary sizes of 1.2MB and 2.8MB, respectively. I have also noticed that the Haskell binary is linked to the libgmp library, while the OCaml binary is not. (I find it strange, since this library shouldn’t be needed at all in such a simple app…)

As for the Go executable, I got rid of the fmt package, which adds a lot to the size, so this 1.2MB is already somewhat size-optimized.

Is there any sane way to limit executable size in Haskell? (Well, I could have “invented” my own ExceptT monad and all the needed lift functions instead of using mtl, but that doesn’t seem like a sane solution…)

Or, perhaps, the size is not a problem at all, since further growth of the source code won’t increase the executable size (too much)?
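
To make the “own ExceptT” idea concrete, here is a minimal sketch of what rolling it by hand could look like. It mirrors the usual definition from transformers, but only the handful of names shown here are defined, and nothing in it has been measured: whether it would shrink the binary at all is an open assumption.

```haskell
module MiniExcept where

-- | A hand-rolled exception transformer: a computation in 'm' that
-- either fails with an 'e' or succeeds with an 'a'.
newtype ExceptT e m a = ExceptT { runExceptT :: m (Either e a) }

instance Functor m => Functor (ExceptT e m) where
  fmap f (ExceptT m) = ExceptT (fmap (fmap f) m)

instance Monad m => Applicative (ExceptT e m) where
  pure = ExceptT . pure . Right
  ExceptT mf <*> ExceptT mx = ExceptT $ do
    ef <- mf
    case ef of
      Left e  -> pure (Left e)        -- short-circuit on the first error
      Right f -> fmap (fmap f) mx

instance Monad m => Monad (ExceptT e m) where
  ExceptT m >>= k = ExceptT $ do
    ea <- m
    case ea of
      Left e  -> pure (Left e)        -- short-circuit on the first error
      Right a -> runExceptT (k a)

-- | Abort the computation with an error.
throwE :: Applicative m => e -> ExceptT e m a
throwE = ExceptT . pure . Left

-- | Run an action of the underlying monad inside 'ExceptT'.
lift :: Functor m => m a -> ExceptT e m a
lift = ExceptT . fmap Right
```

With IO as the underlying monad, a failing download step could be written as `throwE "download failed"`, and `runExceptT` turns the whole pipeline back into an `IO (Either String a)`.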

6 Likes

Unfortunately, executable sizes in Haskell are significant.

I have also noticed that the Haskell binary is linked to the libgmp library

Not sure how OCaml handles big integers, but Haskell uses libgmp for arbitrary-precision integers. It’s tied to base whether you like it or not. In the same way, it links libpthread even if you don’t fork any threads. You can easily get executables of a couple hundred MB with Haskell.

One thing you might try to do is to link executables dynamically, but if you want a static binary, there isn’t much you can do. I guess that in general GHC doesn’t take many steps to reduce sizes, because it isn’t generally an issue for the community.

4 Likes

It won’t change the in-memory footprint (AFAIK), but running upx on a statically compiled binary reduced the size significantly. If I recall correctly, the result was about 10% of the original size.

5 Likes

The GHC user’s guide lists these things:

11.3. Smaller: producing a program that is smaller

Decrease the “go-for-it” threshold for unfolding smallish expressions. Give a -funfolding-use-threshold=0 option for the extreme case. (“Only unfoldings with zero cost should proceed.”) Warning: except in certain specialised cases (like Happy parsers) this is likely to actually increase the size of your program, because unfolding generally enables extra simplifying optimisations to be performed.

Avoid Prelude.Read.

Use strip on your executables.
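
On the “Avoid Prelude.Read” point, a minimal sketch of parsing a number without read, for example a numeric choice typed by the user. The name parseNat is made up for this example, and how much size this actually saves is not measured here.

```haskell
module ParseNat where

import Data.Char (isDigit, ord)
import Data.List (foldl')

-- | Parse a non-negative decimal number without going through 'read'.
-- Returns 'Nothing' for empty input or any non-digit character.
-- Note: very long inputs overflow 'Int' silently.
parseNat :: String -> Maybe Int
parseNat s
  | null s || not (all isDigit s) = Nothing
  | otherwise                     = Just (foldl' step 0 s)
  where
    -- Shift the accumulator one decimal place and add the next digit.
    step acc c = acc * 10 + (ord c - ord '0')
```

Besides not pulling in the Read machinery, this also avoids the runtime exception that read throws on malformed input.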

6 Likes

Which platform and architecture are you targeting?

3 Likes

Thank you for ALL your answers!
I didn’t expect such interest.

As for the platform, my main target is GNU/Linux.

2 Likes

Would “runghc” be an acceptable quasi-scripting alternative for this use case?

1 Like

So you would package an entire GHC installation, which is around 2 GB, with your program?

1 Like

Probably not :) I guess it’s only for local scripting, so I am probably missing the mark here.

FWIW, for comparison, here is the shipped size of a more common scripting language:

$ rpm -q --queryformat "%{NAME} %{SIZE}\n" python3-libs python3
python3-libs 33170288
python3 33316

1 Like

Quasi-scripting? Yes. Rely on runghc? Not sure. What if I need some external dependencies, currently managed with cabal?

For Bash / Python / Golang I don’t need to worry about external dependencies, since they are already in the system and/or standard library. For Haskell (and OCaml), I do. It’s not a problem, as long as I can compile my app on my machine, and then distribute the binary to other machines.

1 Like

Using stack (example: pong2.hs in the haskell-game/tiny-games-hs repository on GitHub):

#!/usr/bin/env -S stack script --resolver lts-20 --package ansi-terminal-game

Using cabal run:

#!/usr/bin/env -S cabal run --index-state=2023-03-05T09:21:17Z
{- cabal:
build-depends: base, ansi-terminal-game ==1.8.1.0
ghc-options:   -threaded
-}
import Terminal.Game;main=playGameS(Game 20(10,10,1,1,10,0)l d e)>>=finish
e(x,y,a,b,z,s)=x<2&&(y<z||y>z+8);l _(x,y,a,b,z,s)e=(x+a,y+b,f 79 x a,
 f 21 y b, min 15$max 2$z+case e of{KeyPress 'w'-> -1;KeyPress 's'->1;_->0},
 s+if x<2then 1 else 0);d r(x,y,dx,dy,z,s)=mergePlanes(blankPlane 80 24)
 [((1,1),box 80 23 '█'),((2,1),blankPlane 79 21),((z,1),box 1 8 '▕'),((y,x),
 box 1 1 '⬤'),((24,60),stringPlane(show s))]
f m x a=if x<3 then 1 else if x>m then -1 else a
finish (_,_,_,_,_,s)=putStrLn$unwords["You scored",show s,"points!\n"]
-- ^10 ------------------------------------------------------------------ 80> --
{- hackage-10-80/pong (gergoerdi)

-}

But I digress from the original post.

3 Likes

Since you mentioned CLI tools in the plural: the busybox scheme of having multiple programs in one executable works perfectly fine in Haskell. Create multiple symlinks to your Haskell executable and then use getProgName to get the name of the symlink that was used to invoke it.
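
A minimal sketch of that scheme, assuming two made-up applets (hello and echo-args) that just return text. Real tools would do their actual work in place of these functions, and the applet table and names are purely illustrative.

```haskell
module Main (main) where

import System.Environment (getArgs, getProgName)
import System.Exit (exitFailure)
import System.IO (hPutStrLn, stderr)

-- | Hypothetical applets, keyed by the name the executable is invoked as.
-- Each one maps its command-line arguments to the text it prints.
applets :: [(String, [String] -> String)]
applets =
  [ ("hello",     const "Hello!")
  , ("echo-args", unwords)
  ]

-- | Pure dispatch: pick the applet by name and run it on the arguments.
runApplet :: String -> [String] -> Maybe String
runApplet name args = fmap ($ args) (lookup name applets)

main :: IO ()
main = do
  name <- getProgName   -- name of the program as it was invoked
  args <- getArgs
  case runApplet name args of
    Just out -> putStrLn out
    Nothing  -> do
      hPutStrLn stderr ("unknown applet: " ++ name)
      exitFailure
```

If the binary were installed as, say, mytool (a placeholder name), then after `ln -s mytool hello` and `ln -s mytool echo-args`, running `./hello` should select the first applet and `./echo-args a b` the second. All applets share one copy of the runtime and libraries, so the fixed size cost is paid only once.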

4 Likes