Samsort: A sorting library

samsort is a lightweight sorting library that operates on GHC MutableArray#s. There are no dependencies outside of base.

Candidate release:
https://hackage.haskell.org/package/samsort-0.1.0.0/candidate

I put it together since I could not find any other library in this niche.
vector-algorithms is the closest and has good implementations of sorting algorithms, but as the name indicates it depends and operates on vectors.

The library is not released yet, since I might tweak the internals some more, but I welcome any feedback in the meantime.

Update

I’ve decided to add another function to sort ints, and I’ve changed both functions to work in ST. This is equivalent to before, but it should be a little more convenient to use this way.

There is now a HOWTO showing how the library can be used.

The benchmark suite has grown larger and covers sort functions from samsort, vector-algorithms, primitive-sort applied to different element types in different distributions. It’s good to see that samsort is the fastest in almost all cases, and comes in second when it’s not :smiley:

18 Likes

Hello, would you mind adding an example usage, for example that demonstrates an array creation, call to samsort and show the sorted result?

The arrays provided with GHC look great, but their documentation doesn’t look very user friendly, like how to get and handle a State#?

2 Likes

I imagine the documentation isn’t great because this is a primitive type provided by GHC and folks are not expected to work with it directly. Instead, various libraries provide nicer interfaces to arrays while using MutableArray# as the internal representation.

Given the above, I think this is quite reasonable. I will add some examples on how to use the library to sort, say, a Vector from vector.

1 Like

Now released! samsort: A stable adaptive mergesort implementation

4 Likes

Very nice! A native, in-place sort is definitely nice to have. But the interface is a bit too low level for most users… have you considered the base-native Array interface? GHC.Arr

1 Like

In terms of interface, a functional one would make the most sense. So the caller provides the comparison and the swap. Go’s sort package is like this sort package - sort - Go Packages It’s more driven by a lack of polymorphism, but it’s still a good way to go about making the algorithm work with primitive arrays, boxed arrays, vectors, and any arbitrary user data structure.

1 Like

Thanks! There are a couple of wrinkles with sorting GHC.Arr's Array/STArray. One is that these arrays can be multidimensional, due to being polymorphic in the index. It is odd that the index would be ignored, instead of having good support for sorting multidimensional arrays (such as sort along one axis). The second is that these arrays are not commonly used (as far as I know). One would more likely have their data in a list, or maybe a Vector. If they have to convert to another structure anyway, it might as well be a MutableArray#.

What I would like is if some of the stuff from primitive moves into base. MutableArray and MutablePrimArray would be at the right level for this, IMO.

1 Like

I have to disagree, I don’t think this is a good model. Having to define a newtype and an instance every time one wants to sort something is tedious. It is tedious enough that Go decided to add the sort.Slice function at one point, even at the expense of type-safety.

It doesn’t have to be a type class though. You can implement the algorithm taking the interface functions as inputs. No newtypes needed.

That said, idt what you’ve done here is bad at all :blush: I was just suggesting a way to have sorting be data structure agnostic since it came up.