Possible to compile less of large dependencies?

I’m depending on pandoc, and I’m starting to regret my life choices. Pandoc is of course great, but it takes absolutely forever to compile. I’m using only a tiny subset of pandoc’s full functionality, so it’s frustrating to watch it spend ages compiling a bunch of stuff I’m not going to use.

Does the Haskell ecosystem have anything similar to Rust’s “feature flags”? They give library authors a way to let library consumers choose which parts of the library get compiled.

I would assume the answer is probably no, or I probably would have heard about it. So my second question, assuming that’s the case, is whether this feature has ever been discussed in the context of Haskell. How feasible is it? Is there anything about Haskell that would make it uniquely difficult to implement?

Cabal actually does support configuration flags, and has for a long time. Unfortunately, like many of Cabal’s features, they are hard to discover and hard to use. People will disagree with that statement, but I point to this very post as evidence in my defense.

Configuration flags can certainly be used as “feature flags”, since you can use them to guard the inclusion of extra dependencies and extra exported modules. One example that comes to mind is text (“An efficient packed Unicode text type”). It’s not a great example, since it’s hiding alternate implementations instead of whole extra features, but you get the idea.
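As a rough illustration of the “guarding” idea, a .cabal file might look something like the sketch below. The package name mylib, the flag name extras, and the module names are all hypothetical and not taken from text or any other real package; aeson just stands in for “some heavier dependency”.

```cabal
cabal-version: 3.0
name:          mylib
version:       0.1.0.0
build-type:    Simple

-- Hypothetical flag; on by default so the default build exposes everything.
flag extras
  description: Build the optional Extras module and its heavier dependency
  default:     True
  manual:      True

library
  hs-source-dirs:   src
  default-language: Haskell2010
  exposed-modules:  MyLib.Core
  build-depends:    base >=4.14 && <5

  -- Only compiled, and only pulls in aeson, when the flag is on.
  if flag(extras)
    exposed-modules: MyLib.Extras
    build-depends:   aeson
```

Someone building this package directly could then switch the flag off with something like cabal build --flags="-extras".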

Another, slightly more common option is to provide multiple related packages, so a consumer can pick and choose their features by picking libraries. amazonka (“Comprehensive Amazon Web Services SDK”) is an example.

Now that Cabal supports public sublibraries, the “multiple related packages” option might get more common and usable.
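For anyone who hasn’t seen public sublibraries, here is a minimal sketch of what the split could look like inside a single package. It assumes cabal-version 3.0 or later; mylib, core, extras, and the module names are made up for illustration.

```cabal
cabal-version: 3.0
name:          mylib
version:       0.1.0.0
build-type:    Simple

-- Hypothetical small sublibrary that other packages may depend on directly.
library core
  visibility:       public
  hs-source-dirs:   core
  default-language: Haskell2010
  exposed-modules:  MyLib.Core
  build-depends:    base >=4.14 && <5

-- Hypothetical heavier sublibrary; consumers that only need core can skip it.
library extras
  visibility:       public
  hs-source-dirs:   extras
  default-language: Haskell2010
  exposed-modules:  MyLib.Extras
  build-depends:    base >=4.14 && <5, mylib:core
```

A consumer that only wants the small part would then write build-depends: mylib:core in its own .cabal file.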

So, the answer to your title is, “Sometimes. Maybe more would be better. We need to do a better job documenting the options and benefits to make it more common.”

Supposing I had the will to attack this problem, do you think that the best approach would be to advocate for splitting pandoc into sublibraries?

Edit: Seems like some people are already thinking along these lines

Perhaps. The best advocacy would be to prototype a solution, probably.

You can also check out the password library where we have added flags to disable certain algorithms.

If a subset of modules can be compiled all on their own, it’s pretty easy to implement something like a pandoc-core flag that maybe doesn’t export any Text.Pandoc.Readers or Text.Pandoc.Writers. (just an example, I’ve no idea what the dependency tree of the pandoc library looks like)

Pragmatically, often the best thing is to call the pandoc cli, so you don’t have to keep rebuilding it, which as you say is a heavy cost.

Cabal flags are there and some packages use them but most of the time they are avoided because they complicate dependencies, packaging etc. too much.

Note that if you just need, e.g., Markdown rendering, there are alternatives to pandoc that might be enough.

It’s likely that the root cause of slow compilation of pandoc is excessive inlining in text. I would love someone to deal with it.

This is bad practice, please don’t do that.

library
  hs-source-dirs:
      src
  exposed-modules:
      Data.Password.Validate
  if flag(argon2)
    exposed-modules:
        Data.Password.Argon2
  if flag(bcrypt)
    exposed-modules:
        Data.Password.Bcrypt
  if flag(pbkdf2)
    exposed-modules:
        Data.Password.PBKDF2
  if flag(scrypt)
    exposed-modules:
        Data.Password.Scrypt

What people forget here is that you can’t enable flags of dependencies through the cabal file.

If someone wrote a library depending on Data.Password.Scrypt, which happens to be in your dependency graph, and the user disabled the flag (-scrypt), you’ll get an obscure compilation failure. If your feature flag is off by default, then that actually makes your package broken for Hackage.
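To make the asymmetry concrete: as far as I know, the flags of a dependency can be set from the end user’s project configuration, but a library’s build-depends has no syntax for requiring them. A sketch of the user side, reusing the password and scrypt names from above (the project layout is assumed):

```cabal
-- cabal.project (the end user's project, not a library's .cabal file)
packages: .

-- Turn the scrypt flag of the password dependency off for this project.
constraints: password -scrypt
```

A library elsewhere in the dependency graph can only write build-depends: password, so nothing tells the solver that it actually needs Data.Password.Scrypt to be exposed.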

Exposed API should not depend on flags: Educate (or prevent) users from making flag-dependent API · Issue #8128 · haskell/cabal · GitHub

Oh, you mean in the case where a dependency uses password under the hood, and the user wants to ALSO use password on its own, but wants to disable a module? :thinking: That’s an interesting situation.

We mainly added these because on some platforms one of the algorithms would fail to compile. What would be your advice in that case? Instead of making it possible for someone to use the one library on more platforms, make a separate library for every single algorithm?
I’d like to know what good practice in this case would be.

That is just an example.

The point is: if you make your API flag-dependent, then other packages using your package need a way to express which flags they require from your package.

But there is no such way. That means cabal has insufficient information to do the right thing. That can manifest in different ways.

I don’t know what “fail to compile” means exactly. If you are depending on platform-specific headers/libraries, then that should be in its own separate library, yes. You can factor out the common parts into a password-internals package if there are any.
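A sketch of what such a split could look like, purely for illustration: password-scrypt is a hypothetical package name, and password-internals is the suggested (not necessarily existing) package holding the shared parts.

```cabal
cabal-version: 3.0
name:          password-scrypt
version:       0.1.0.0
build-type:    Simple

-- Hypothetical per-algorithm package: the platform-specific code lives here.
library
  hs-source-dirs:   src
  default-language: Haskell2010
  exposed-modules:  Data.Password.Scrypt
  build-depends:    base >=4.14 && <5, password-internals
```

A downstream library that needs scrypt then simply lists password-scrypt in its build-depends, which is something the solver can see and act on.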

Yeah, it feels icky but it’s obviously the right choice for me.

Yes, build flags are really annoying compared to just depending on a package. For example, I’m currently fighting "The preview server is not enabled in the version of Hakyll" · Issue #1022 · jaspervdj/hakyll · GitHub

In that case, I take back what I said about configuration flags being usable as “feature flags”.

cargo for Haskell would be great.

cabal is definitely coming along well, but it’s not yet at the same level of ease of use.

I think cabal is pretty close to cargo already. It’s actually better in several ways. Feels like the gaps currently are just defaults and interface. The core is at parity between Nix builds and freeze files.

Rust doesn’t have an integrated bytecode repl for instance…so it definitely has some huge flaws imo. You don’t get that sort of thing overnight like you can with a CLI change.

There are a large number of minor papercuts:

  • No cabal new, only init means I have to do a separate mkdir myself
  • cabal init makes me go through a wizard by default, whereas cargo new just creates the project and I can change stuff when I want to. I don’t want to have to look up the --non-interactive/-n flag in order to specify it, it should be the default behaviour.
  • The wizard makes me choose from a bunch of options that I don’t care about in the common case. It has sensible defaults, so it should just use them. E.g. I don’t think people generally care about specifying the version of their dependency file format. They want the latest stable one. (As an aside, it’s not clear to me why Cabal 3.4 is not the default.) If you specify --libandexe you have to hit enter like 20 times. This would be fine if the wizard was opt-in instead of opt-out.
  • cabal files are a custom format, as opposed to cargo’s use of TOML. Cargo.toml files generally seem much simpler to me than an equivalent foo.cabal. In particular it’s nice to not have to indent dependencies.
  • having to specify other-modules is annoying and seems like pointless busywork
  • cargo puts your built executable in ./target/debug/example-exe-name whereas cabal buries it deep in the bowels of dist-newstyle: ./dist-newstyle/build/x86_64-linux/ghc-9.4.8/plaintextify-0.1.0.0/x/ptfy/build/ptfy/ptfy
  • passing flags to executables through cabal run doesn’t seem to work correctly
⮞ cabal run --help|head -3
Run an executable.

Usage: cabal run [TARGET] [FLAGS] [-- EXECUTABLE_FLAGS]
⮞ ./dist-newstyle/build/x86_64-linux/ghc-9.4.8/plaintextify-0.1.0.0/x/ptfy/build/ptfy/ptfy --help
Plaintextify

Usage: <... snip ...>

⮞ cabal run -- --help
Error: cabal: Unknown target '--help'.
There is no component '--help'.
The project has no package '--help'.

I bet I could find a bunch more if I kept noting them down while actually working on stuff, as opposed to just opening a project up to make this point.

I want to be very clear, I think the cabal team has been doing a fantastic job improving cabal, particularly recently. Rust has a much larger userbase, and accordingly there are more resources poured into making it nice. That’s difficult to compete with. I appreciate that open source maintainers are always operating under resource constraints and generally doing it out of the goodness of their heart. I don’t mean all this as criticism of anybody, just potential areas of improvement.

You have to specify what to run:

cabal run ptfy -- --help

Yeah, but cabal run without specifying the exe successfully runs the default one (good!), so you ought to be able to pass it arguments.

Edit: Looks like there’s an open issue
