Internal modules

A lot of Haskell packages expose internal modules (e.g. containers, text, bytestring). Do you know any languages where it is also common to intentionally expose internal modules (and I don’t mean languages where everything is automatically exposed due to not being able to make private modules)? I don’t remember seeing this in any other language, so I’m curious why Haskell seems to be the only language doing this.

I think the main reason (correct me if I am wrong) is for testing purposes.
Because if something is not exposed there is no other way to test it.
Maybe other languages offer an escape hatch that Haskell doesn’t.

I never got this argument. If it’s internal, then it’s invisible to users and testing it directly isn’t important: you’re going to have to test the public API anyway, which is going to exercise the same internal routines.

One reason is that Haskell type classes are fairly unusual in being “open” (extra instances can be defined later) rather than “closed” (statically defined in a single library). As such, we sometimes offer .Internal modules that expose the internals of a type without guaranteeing that its assumed invariants hold (e.g., the order of items in a list, no duplicates, etc.), and put the onus on the person using the internals. This means the library can offer a safe/blessed API, but doesn’t completely stop people in their tracks if they want to extend it or combine it with other libraries.
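
As a rough illustration of that trade-off, here is a minimal single-file sketch. Every name in it (SortedList, UnsafeSortedList, mapMonotonic, and so on) is hypothetical and not taken from any particular library; in a real package the "internal" and "public" halves would live in separate modules, with only the public one hiding the constructor.

```haskell
import Data.List (sort)
import qualified Data.List as List

-- "Internal" half: the constructor is visible.
-- Invariant (not enforced by the type): the wrapped list is sorted ascending.
newtype SortedList a = UnsafeSortedList [a]
  deriving (Eq, Show)

-- "Public" half: functions that always preserve the invariant.
fromList :: Ord a => [a] -> SortedList a
fromList = UnsafeSortedList . sort

toList :: SortedList a -> [a]
toList (UnsafeSortedList xs) = xs

insert :: Ord a => a -> SortedList a -> SortedList a
insert x (UnsafeSortedList xs) = UnsafeSortedList (List.insert x xs)

-- Relies on the invariant: the smallest element is assumed to be the head.
minimum' :: SortedList a -> Maybe a
minimum' (UnsafeSortedList [])      = Nothing
minimum' (UnsafeSortedList (x : _)) = Just x

-- Downstream extension written against the internals: a map that skips
-- re-sorting. It is only correct when f is monotonic; nothing checks this,
-- so the onus is on the caller.
mapMonotonic :: (a -> b) -> SortedList a -> SortedList b
mapMonotonic f (UnsafeSortedList xs) = UnsafeSortedList (map f xs)
```

Loaded into GHCi, mapMonotonic (+ 1) keeps the list sorted, whereas mapMonotonic negate silently breaks the invariant and minimum' then returns the wrong element. That is the kind of risk an .Internal module hands over to its users.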

They might not be important, but some internal functions need unit testing and there is no easy way to test them individually if they are not exposed. I’ve done it a few times, but I didn’t invent this practice.
I don’t remember where I first saw it.

Right, this is mostly the bit that I disagree with. The important internal functions will be exercised by the public API at some point. The only moderately convincing argument is when the only uses of the public API are all performance intensive/speak to IO, at which point the test setup/execution becomes a dominating factor. But at this point I’d rather find a good public interface to the “internal” code and bless it into a proper sub-API, which again, can be tested without needing to expose internals.

Code organisation depends on lots of things. The main production app I am working on is about managing boxes in a warehouse, and it involves finding how many boxes of a given size you can fit on a shelf.

The main “public” function is something along the lines of Box -> Shelf -> [(Orientation, Offset)], and it’s built from simple functions like howMany :: Box -> Orientation -> Shelf -> Int, bestOrientations :: Box -> Shelf -> [(Orientation, Int)], etc.

The whole thing is trickier than it looks, and it’s much easier to test each small function individually than to test the main one. Having a sub-API is just overkill; they are just local functions.
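
To make the shape of those helpers concrete, here is a minimal sketch. Only the two type signatures come from the description above; the 2-D Box and Shelf records, the two-value Orientation, the oriented helper and the simple grid-packing rule are all assumptions made for illustration, not the real application's logic.

```haskell
import Data.List (sortOn)
import Data.Ord (Down (..))

-- Hypothetical, simplified 2-D model (assumes positive box dimensions).
data Box = Box { boxWidth :: Int, boxHeight :: Int }
  deriving (Eq, Show)

data Shelf = Shelf { shelfWidth :: Int, shelfHeight :: Int }
  deriving (Eq, Show)

data Orientation = Upright | Rotated
  deriving (Eq, Show, Enum, Bounded)

-- Width and height of a box once placed in the given orientation.
oriented :: Box -> Orientation -> (Int, Int)
oriented (Box w h) Upright = (w, h)
oriented (Box w h) Rotated = (h, w)

-- Number of boxes that fit on the shelf when packed in a plain grid.
howMany :: Box -> Orientation -> Shelf -> Int
howMany box o (Shelf sw sh) =
  let (w, h) = oriented box o
  in (sw `div` w) * (sh `div` h)

-- Orientations ranked by capacity, best first; orientations in which
-- nothing fits are dropped.
bestOrientations :: Box -> Shelf -> [(Orientation, Int)]
bestOrientations box shelf =
  sortOn (Down . snd)
    [ (o, n)
    | o <- [minBound .. maxBound]
    , let n = howMany box o shelf
    , n > 0
    ]
```

Each helper can be checked on its own with tiny inputs, for example that howMany (Box 2 3) Upright (Shelf 6 3) is 3, without having to construct a full placement of offsets through the main function.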

See this post from 2013

The usual convention is to split your module into public and private parts, i.e.

I’m not saying I agree with it but some people do.

Sorry, I wasn’t trying to say that I think everyone who does this is wrong. Disagree is probably the wrong term to use! It’s just not my preferred approach. I understand why people do it, but I think it just starts to add much more commitment to the internal API. When you test right at the external boundaries of a system, you give yourself much more freedom to internally move things around.

That said, I think I’ve probably dominated this conversation enough now and will give it back to @konsumlamm so they can hear about other reasons for this .Internal pattern :)

I know this is a small data point, but Rust has a very similar system (traits), yet I don’t see Rust libraries commonly exposing their internals (sure, some unsafe functions to access internals, but not exposing the full internals).

The Rust module system is also more flexible and lets you define submodules without changing your file structure. Tests leverage this, allowing you to unit test internals without actually exposing them.

I believe this can be achieved with Cabal as well, by defining two libraries, one that exports everything (tests are written against it) and one that only exposes the public API. I haven’t really seen this in the wild though.
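
A minimal sketch of what such a setup might look like, using a private internal library (supported since Cabal 2.0; the `pkg:sublib` dependency syntax needs cabal-version 3.0). The package name mypkg, the modules MyPkg and MyPkg.Internal, and the directory layout are placeholders, not taken from any real package.

```cabal
cabal-version:      3.0
name:               mypkg
version:            0.1.0.0
build-type:         Simple

-- Internal library: exposes everything, but is private to this package.
library mypkg-internal
  exposed-modules:  MyPkg.Internal
  hs-source-dirs:   internal
  build-depends:    base
  default-language: Haskell2010

-- Public library: only the blessed API is visible to downstream users.
library
  exposed-modules:  MyPkg
  hs-source-dirs:   src
  build-depends:    base, mypkg:mypkg-internal
  default-language: Haskell2010

-- The tests depend on the internal library directly.
test-suite mypkg-test
  type:             exitcode-stdio-1.0
  main-is:          Main.hs
  hs-source-dirs:   test
  build-depends:    base, mypkg:mypkg-internal
  default-language: Haskell2010
```

Because the internal library's visibility is private by default, other packages cannot depend on it, so the internals stay testable without becoming part of the public API.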

Another common trick is to add the relevant modules to the test suite as other-modules. The test suite will use the modules directly without going through the library’s public API.
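
Roughly, that could look like the following fragment of a .cabal file. The names (MyPkg, MyPkg.Internal, mypkg-test) and directories are placeholders, and the sketch assumes the test suite does not also depend on the library itself.

```cabal
library
  exposed-modules:  MyPkg
  other-modules:    MyPkg.Internal
  hs-source-dirs:   src
  build-depends:    base
  default-language: Haskell2010

-- The test suite compiles MyPkg.Internal itself from the shared source
-- directory instead of importing it from the library.
test-suite mypkg-test
  type:             exitcode-stdio-1.0
  main-is:          Main.hs
  hs-source-dirs:   test, src
  other-modules:    MyPkg.Internal
  build-depends:    base
  default-language: Haskell2010
```

The trade-off is that the shared modules are compiled twice, once for the library and once for the test suite.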

Sure, but it’s still easier to get correct behaviour more quickly when you test at finer levels of granularity, just as a matter of programming ergonomics. You could even delete the tests after you’d written the function.
