Hehe, I feel like a few things I said before are being responded to.
Unifying ByteString, Text, and Vector
I am absolutely for deduplicating Vector, ByteString, and Text as described.
@snoyman says this is orthogonal to breaking up base, and I agree. The first step can be done just by making the other packages depend on vector. I think we agree there.
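To make that first step concrete, here is a rough sketch of the dependency direction, not the actual layout of any of these packages. The `Vector` type below is a list-backed stand-in I made up so the snippet only needs base; in reality it would be a type imported from vector. The names `vLength`, `pack`, and `bsLength` are likewise just for illustration.

```haskell
import Data.Word (Word8)

-- Stand-in for the array type the real vector package would export.
-- It is backed by a list only to keep this sketch self-contained.
newtype Vector a = Vector [a]
  deriving (Eq, Show)

-- An operation implemented once, on the shared type.
vLength :: Vector a -> Int
vLength (Vector xs) = length xs

-- What bytestring could define once it depends on the shared type:
-- a newtype wrapper instead of its own array representation.
newtype ByteString = ByteString (Vector Word8)
  deriving (Eq, Show)

-- Build a ByteString from a list of bytes (hypothetical helper).
pack :: [Word8] -> ByteString
pack = ByteString . Vector

-- ByteString operations delegate to the shared implementation.
bsLength :: ByteString -> Int
bsLength (ByteString v) = vLength v
```

The point is only that the wrapper package owns the newtype and its API, while the representation and the core operations live in one place.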
Unify unboxed and storable vectors
I don’t know enough about the stuff in this section, but it generally sounds nice.
Automatic Boxed vs Unpacked selection
I think the best tool we have for this is runtime reps, but those are egregiously underused. To make them ergonomic we need huge changes to libraries (see https://github.com/ekmett/unboxed), but I also think we need a “monomorphizing”/“templating” quantifier just like Rust or (dare I say) the upcoming Go generics. (Or C++ templates, but no SFINAE and other jankiness that comes from the C++ designers arriving at templating from macro systems rather than type theory quantifiers.)
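For anyone who hasn't played with runtime reps, here is a minimal sketch of what I mean, assuming a reasonably recent GHC. The class `Add` and the values `boxedSum` and `unboxedSum` are names I invented for this post; this is not the API of ekmett/unboxed, just the same general idea of a class whose type parameter can have any runtime representation.

```haskell
{-# LANGUAGE MagicHash, PolyKinds, KindSignatures #-}

import GHC.Exts (TYPE, Int (I#), Int#, (+#))

-- Hypothetical class: `a` may have any runtime representation `r`,
-- so instances can be boxed or unboxed types.
class Add (a :: TYPE r) where
  add :: a -> a -> a

-- Instance for the ordinary boxed, lifted Int.
instance Add Int where
  add = (+)

-- Instance for the unboxed machine integer.
instance Add Int# where
  add x y = x +# y

-- Same method name, boxed arguments.
boxedSum :: Int
boxedSum = add 1 2

-- Same method name, unboxed arguments; I# re-boxes the result.
unboxedSum :: Int
unboxedSum = I# (add 1# 2#)
```

Each call site here picks one concrete representation. The ergonomic problem is writing new functions that stay polymorphic over `r`, since GHC won't let you bind a variable whose representation isn't known, and that is where I think a monomorphizing quantifier would help.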
Put it in base!
I just don’t get why. I am still definitely for breaking up base, and against putting this stuff in a base that hasn’t been broken up.
Right now, Vector is often not used. I think a large part of that is the so-many-vector-types problem I’m trying to solve here.
I agree, and you do solve that! People don’t have an issue using Map, so I don’t think pulling stuff out of base is an issue. If you are worried about List being just too accessible, maybe let’s make List more external rather than making the unpacked ones more internal.
Another is that it takes so darn long for vector to compile. People don’t want to depend on it.
This I really don’t buy. If we can distribute a prebuilt base, we can distribute other prebuilt libraries. Full stop.
As it stands today, most of the I/O functions included in base work on Strings. We almost all agree that we shouldn’t be using String, but its usage persists because it is the only base-approved type. Let’s fix that.
A split base also fixes that: the I/O functions would no longer be forced to sit at the bottom of the dependency graph, and would thus be free to use better types.
With a solid Vector type in base, we can then begin to develop sets of functionality around that type in external libraries. People can iterate on those designs, and then we can consider standardizing, either in base, in a CLC-approved library, or in a new split-base world.
But we can experiment just as much while keeping things among vector, bytestring, and text. If we need to make more cross-cutting changes atomically, let’s put those 3 packages in a monorepo, but leave GHC off the critical path.
Adoption
I hadn’t realized that this post initially came off as a massive breaking change in the language and library ecosystem. So let me clarify. I believe that the changes above can be made immediately with zero breakage (though plenty of effort).
Agreed, the internal representations of vector, bytestring, and text are not stable (as far as I know), so we should be able to de-dup without issue.