I wrote a tiny bit faster implementation of Data.List.inits that replaces the underlying purely functional queue with a mutable data structure and careful uses of unsafePerformIO. Can we use linear types instead of unsafePerformIO to encapsulate this as a purely functional data structure?
Implementing inits with a queue
The high-level structure of inits :: [a] -> [[a]] is the same between the standard library one and my version: scanl the input list with a queue as an accumulator, snoc-ing (pushing) elements to it. Then every prefix of the input list can be obtained by repeated uncons-ing (popping) from one of the accumulated queues.
-- pointful
inits :: [a] -> [[a]]
inits xs = toList empty : go empty xs
  where
    go q [] = []
    go q (x : xs) = toList (snoc q x) : go (snoc q x) xs

-- pointfree
inits :: [a] -> [[a]]
inits = map toList . scanl' snoc empty

-- Queue API
empty :: Q a
snoc :: Q a -> a -> Q a
toList :: Q a -> [a]  -- = unfoldr uncons, where uncons :: Q a -> Maybe (a, Q a)
This must be a real-time queue—uncons should run in non-amortized constant time—so that the output can be consumed lazily (notably map head . tail . inits :: [a] -> [a] should take linear time). This must be a persistent queue, because the same queue will be passed as an argument to both snoc (to construct the next prefix) and uncons (to produce the current prefix).
Implementing a persistent queue
The standard persistent queue is the banker’s queue described in Okasaki’s book. For inits, we also know that the only elements that will ever be pushed to the queue are the elements of the list. This allows a more efficient representation using a mutable (and resizable) array to hold the elements.
Every queue is a slice of the array: snoc appends an element at the end of the array, and uncons reads the first element of the slice. This is not a general-purpose queue: we are assuming that the queue produced by a snoc is itself snoc-ed at most once, and that a queue produced by uncons is never snoc-ed. Under those assumptions, the data structure really does implement a purely functional interface: every queue is an immutable slice of the array, and snoc only ever writes to a position that no previously created slice can reach.
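To make the idea concrete, here is a minimal sketch of such a slice-based queue (my illustration, not the actual implementation): it uses a fixed-capacity buffer instead of a resizable one, and its placement of unsafePerformIO is only indicative of where the effects would go.

import Data.Array.IO (IOArray, newArray_, readArray, writeArray)
import System.IO.Unsafe (unsafePerformIO)

-- Every queue is an immutable slice [qFront, qEnd) of one shared mutable array.
data Q a = Q
  { qArr   :: IOArray Int a  -- shared buffer
  , qFront :: !Int           -- first index of this slice
  , qEnd   :: !Int           -- one past the last index of this slice
  }

capacity :: Int
capacity = 1024  -- assumption: large enough for the inputs of this sketch

-- Allocating the buffer is an effect, so the empty queue is produced in IO;
-- each call to inits below gets its own buffer.
emptyIO :: IO (Q a)
emptyIO = do
  arr <- newArray_ (0, capacity - 1)
  pure (Q arr 0 0)

-- Writes at qEnd, an index that no previously created slice can reach.
snoc :: Q a -> a -> Q a
snoc (Q arr front end) x = unsafePerformIO $ do
  writeArray arr end x
  pure (Q arr front (end + 1))
{-# NOINLINE snoc #-}

-- Reads the slice lazily, one unsafePerformIO per element.
toList :: Q a -> [a]
toList (Q arr front end)
  | front >= end = []
  | otherwise    =
      unsafePerformIO (readArray arr front) : toList (Q arr (front + 1) end)
{-# NOINLINE toList #-}

inits :: [a] -> [[a]]
inits xs = unsafePerformIO $ do
  q0 <- emptyIO
  pure (toList q0 : go q0 xs)
  where
    go _ []        = []
    go q (x : xs') = let q' = snoc q x in toList q' : go q' xs'

Under the usage pattern of inits, each array cell is written at most once (by the unique snoc that extends the slice past it), which is what makes this interface observationally pure.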
Can we give pure (non-IO) types to the operations of that queue, and implement inits with it? Linearity is necessary to encode the usage assumptions, but is it sufficient?
Even ignoring the specifics of the implementation, I can’t imagine any linear types for a queue expressive enough to implement inits. The problem can be seen in the above definition of inits (which may be too naive to account for linearity): how can we reconcile the linearity of q with the recursive call go (snoc q x) under an unrestricted :? (Careful readers will also note the double occurrence of q, but that can easily be solved by merging toList and snoc as a single function.)
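For illustration, here is what the merged operation's linear signature might look like (hypothetical names and multiplicities, with placeholder bodies; not from the post):

{-# LANGUAGE LinearTypes #-}

-- A hypothetical linear API for the queue. Merging toList and snoc into one
-- operation deals with the double occurrence of q mentioned above; the open
-- question is how go could thread the queue linearly through (:).
data Q a = Q  -- placeholder representation

snocToList :: Q a %1 -> a -> ([a], Q a)
snocToList q _x = ([], q)  -- placeholder body, only the type matters here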
benchmarking initsA
time 268.8 ms (259.4 ms .. 281.3 ms)
0.998 R² (0.994 R² .. 1.000 R²)
mean 281.6 ms (274.4 ms .. 288.0 ms)
std dev 8.417 ms (6.315 ms .. 10.18 ms)
variance introduced by outliers: 16% (moderately inflated)
benchmarking initsQ
time 305.0 ms (293.4 ms .. 319.2 ms)
0.999 R² (0.998 R² .. 1.000 R²)
mean 308.5 ms (303.5 ms .. 315.1 ms)
std dev 7.385 ms (4.581 ms .. 11.01 ms)
variance introduced by outliers: 16% (moderately inflated)
Just out of curiosity - why not use runST and ST s a instead of unsafe... I/O entities?
The technical reason is that runST doesn’t let you write lazy algorithms. When you force a runST thunk, all of the allocations, reads and writes in it happen at once. Moreover, anything that accesses a mutable array or other resource allocated in ST s must mention the s parameter in its type, and when you want a thunk of type [a] to access such a resource, there’s no room for that s parameter.
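To illustrate the first point, here is a small sketch (my illustration, not code from the thread): with the strict ST monad, an attempt to build a list lazily from a mutable counter diverges as soon as any part of the result is demanded.

import Control.Monad.ST (ST, runST)
import Data.STRef (STRef, newSTRef, readSTRef, writeSTRef)

-- With strict ST the whole computation runs before runST can return,
-- so even `head countUpStrict` diverges: there is no way to interleave
-- the reads and writes with consumption of the result.
countUpStrict :: [Integer]
countUpStrict = runST (newSTRef 0 >>= go)
  where
    go :: STRef s Integer -> ST s [Integer]
    go ref = do
      n <- readSTRef ref
      writeSTRef ref (n + 1)
      rest <- go ref
      pure (n : rest)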
The moral reason is that I want to write algorithms declaratively.
On my machine (aarch64), initsA is faster than initsQ or initsT, with both criterion and tasty-bench:
$ cabal run inits.hs
benchmarking initsA
time 175.6 ms (174.8 ms .. 176.9 ms)
1.000 R² (1.000 R² .. 1.000 R²)
mean 174.4 ms (173.9 ms .. 175.1 ms)
std dev 825.8 μs (663.4 μs .. 1.072 ms)
variance introduced by outliers: 12% (moderately inflated)
benchmarking initsQ
time 209.8 ms (206.6 ms .. 212.2 ms)
1.000 R² (1.000 R² .. 1.000 R²)
mean 211.9 ms (210.7 ms .. 212.5 ms)
std dev 1.274 ms (671.9 μs .. 2.050 ms)
variance introduced by outliers: 14% (moderately inflated)
benchmarking initsT
time 201.9 ms (200.4 ms .. 205.3 ms)
1.000 R² (0.999 R² .. 1.000 R²)
mean 201.5 ms (201.0 ms .. 202.8 ms)
std dev 1.045 ms (174.6 μs .. 1.506 ms)
variance introduced by outliers: 14% (moderately inflated)
$ cabal run inits.hs
All
  initsA: OK
    176 ms ± 17 ms
  initsQ: OK
    212 ms ± 19 ms
  initsT: OK
    205 ms ± 16 ms
My mistake: from your response here, I thought you were looking for a solution that wasn’t (as) lazy. The original (lazier) version of ST s a is still available - it’s just that most prefer the strict version to avoid space leaks…
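For reference, the lazy variant lives in Control.Monad.ST.Lazy (with Data.STRef.Lazy for references); as a sketch, the same recursion as in the strict example above then does run incrementally:

import Control.Monad.ST.Lazy (ST, runST)
import Data.STRef.Lazy (STRef, newSTRef, readSTRef, writeSTRef)

-- In the lazy ST monad, reads and writes are performed only as the result
-- list is consumed, so `take 5 countUpLazy == [0,1,2,3,4]`.
countUpLazy :: [Integer]
countUpLazy = runST (newSTRef 0 >>= go)
  where
    go :: STRef s Integer -> ST s [Integer]
    go ref = do
      n <- readSTRef ref
      writeSTRef ref (n + 1)
      rest <- go ref
      pure (n : rest)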
The moral reason is that I want to write algorithms declaratively.
But (in this definition) using unsafe... entities - this could be of some use: