GHC + Alpine + Docker + static linking < 15 MB

Hi all!

I recently had to revisit this way of deploying binaries and was pleasantly surprised: 15 MB for a completely portable Hello World Haskell container that builds in 30 seconds.

Golfed down the setup to a reasonable few lines (Dockerfile, build.sh and GH Actions file) for your reading pleasure.
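
For readers who just want the gist, here is a minimal sketch of what such a build.sh could look like. This is not the script from the linked setup: the function names `static_flags` and `build_static` are made up for illustration, and it assumes the script runs inside an Alpine container where ghc and binutils are already installed.

```shell
#!/bin/sh
# Hypothetical build.sh sketch, not the original poster's script.

# GHC flags for a fully static executable:
#   -static        link the Haskell libraries statically
#   -optl-static   pass -static to the C linker, so musl libc is linked in too
#   -optl-pthread  pass -pthread to the C linker
static_flags() {
  printf '%s\n' "-O2 -static -optl-static -optl-pthread"
}

# build_static SRC OUT: compile SRC into a static executable OUT, then strip it.
build_static() {
  src=$1
  out=$2
  # Word splitting of the flag string is intentional here.
  ghc $(static_flags) -o "$out" "$src" && strip "$out"
}

# Usage (inside the build container): build_static Main.hs hello
```

With cabal, the equivalent knob would be `--enable-executable-static`; either way, the resulting binary can then be copied into an empty or near-empty final image.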

Enjoy!

You can bring the binary size down substantially by using upx (thanks Hecaté for the tip). The cabal-audit static binaries weigh 4 MB (the build takes longer than 30 seconds though ;))

Great, I had forgotten about UPX. In my PoC above I just strip the binary. Thanks!
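
A rough sketch of the strip-then-UPX step, for anyone who wants to compare sizes. The helper names `size_bytes` and `shrink` are hypothetical, and the UPX step is skipped when upx is not installed.

```shell
# size_bytes FILE: print the size of FILE in bytes.
size_bytes() {
  wc -c < "$1" | tr -d ' '
}

# shrink FILE: strip symbols, then compress with UPX if it is available.
shrink() {
  bin=$1
  before=$(size_bytes "$bin")
  strip "$bin" || return 1
  if command -v upx >/dev/null 2>&1; then
    upx --best "$bin" >/dev/null || return 1
    upx -t "$bin" >/dev/null || return 1   # check that the packed file is intact
  fi
  after=$(size_bytes "$bin")
  echo "$bin: $before -> $after bytes"
}

# Usage: shrink ./hello
```

How much this saves depends on the binary, so it is worth measuring on your own executable instead of trusting a rule of thumb.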

First off, thank you for your educational post.

What does “completely portable” mean here? Docker is just a wrapper for various Linux container features, right?

There is no portability across CPU architectures, and there doesn’t seem to be a stable image format either.

For example

$ docker run -it gcc:5.1 /bin/bash
docker: [DEPRECATION NOTICE] 
Docker Image Format v1 and 
Docker Image manifest version 2, schema 1 support
is disabled by default and will be 
removed in an upcoming release. 
Suggest the author of docker.io/library/gcc:5.1 to upgrade the image to the OCI Format or Docker Image manifest v2, schema 2.
More information at https://docs.docker.com/go/deprecated-image-specs/.

(gcc:5.2 works, for comparison)

You can’t run old CentOS images either, without adjusting your host kernel flags and adding vsyscall=emulate, whatever that means. See Bash in CentOS 5* and 6* images crashes if Docker host system is “too new”.
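
As far as I understand it, vsyscall is a legacy mechanism that maps a page at a fixed address for fast system calls such as gettimeofday. Old glibc versions call into it, and newer host kernels often no longer emulate it unless told to. The `vsyscall=` boot parameter accepts `emulate`, `xonly` or `none`. Here is a small sketch for checking what a host was booted with; the function name `vsyscall_mode` is made up.

```shell
# vsyscall_mode CMDLINE: print the value of the vsyscall= kernel parameter,
# or "default" when it is absent (the kernel's compiled-in default applies).
vsyscall_mode() {
  for arg in $1; do
    case $arg in
      vsyscall=*) printf '%s\n' "${arg#vsyscall=}"; return 0 ;;
    esac
  done
  printf 'default\n'
}

# Usage on a Linux host: vsyscall_mode "$(cat /proc/cmdline)"
```

If this prints `default`, the result depends on how the distribution configured its kernel, so it does not by itself tell you whether old images will run.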

I find reproducibility more interesting than portability. If I can’t reproduce the binary (e.g. because the apt mirrors went down), I can’t port it to a new architecture either.

In my experience, the Docker ecosystem is extremely vulnerable to these kinds of problems, since there is no restriction on network access in containers.

inb4 nix
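
Two partial mitigations that stay within Docker, sketched under assumptions: pin base images by digest, and cut the build off from the network. The helper `unpinned_bases` is hypothetical, and it is deliberately naive: it also flags `FROM scratch` and references to earlier build stages.

```shell
# unpinned_bases DOCKERFILE: print every FROM line that does not pin its
# base image by digest (image@sha256:...). A tag such as alpine:3.19 can be
# re-pointed upstream; a digest always names the same image content.
unpinned_bases() {
  grep -i '^[[:space:]]*FROM[[:space:]]' "$1" | grep -v '@sha256:' || true
}

# Usage: unpinned_bases Dockerfile
#
# The build itself can be run without network access, for example:
#   docker build --network=none -t hello .
# This only works once every dependency is already in the base image or
# the build context, since package downloads will fail.
```

Neither step makes a build reproducible on its own, but together they at least make hidden network dependencies fail loudly.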

Cheers, always a pleasure. 🙂
For others: note that UPX wraps the executable in a “shim” that makes the binary look static, but the links are actually still resolved at run time (IIRC?). So UPX does not spare you from building a fully static binary first; only then should you run UPX on it.
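
One way to act on this is to check the linkage before packing. Below is a minimal sketch that classifies the description printed by `file -b`; the function name `linkage` is made up, and the matched phrases are the ones `file` commonly prints for ELF executables.

```shell
# linkage DESCRIPTION: classify a `file -b` description string.
linkage() {
  case $1 in
    *"statically linked"*|*"static-pie linked"*) echo static ;;
    *"dynamically linked"*) echo dynamic ;;
    *) echo unknown ;;
  esac
}

# Usage: run the check on the binary BEFORE packing it with UPX, e.g.
#   [ "$(linkage "$(file -b ./hello)")" = static ] && upx --best ./hello
```

Checking after packing is not informative, since the packed file's description no longer reflects how the original program was linked.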

The plot thickens! Apparently GHC doesn’t like to statically link libraries that contain TH. Will investigate…

You’re out of luck with any GHC built by Hadrian afaiu

(That’s why the ghc version in nixpkgs’ pkgsStatic is pinned to the last version that is still built with make)

Do you have a commit where you are encountering this issue?

Hi, unfortunately not as this shows up in a proprietary library.

I tried the approach suggested by Ben in #20168: Can't build statically linked executable when Template Haskell is used · Issues · Glasgow Haskell Compiler / GHC · GitLab (starting from a bindist, configuring with DYNAMIC_GHC_PROGRAMS=NO etc.), but GHC on Alpine complains about missing linker symbols and I haven’t managed to pin down the issue. I get errors similar to this one: #24044: GHC 9.6.3: Linux (x86-64): Alpine (Static, GMP bignum implementation): GHCi: unknown symbol `__gmpn_rshift' · Issues · Glasgow Haskell Compiler / GHC · GitLab

There is a proposal to validate statically-built GHC in CI which I think would be useful, especially if the validation is done together with actual Hackage libraries.

Very nice!
I wonder what the minimal size for a Yesod web server would be.
Currently, in my open source SAST engine, I copy the binary generated by cabal into an Ubuntu image, which results in images of around 140 MB. It is dynamically linked, so I hesitated to use Alpine because of the glibc vs. musl libc issues.

musl is perfectly fine for dynamic linking! And if you build on top of Alpine, the base image will be much smaller than a Debian one.

Could you please elaborate on this? I really don’t see the connection.

I don’t know all the details but this is the issue I could find. Maybe it also applies to you.