Edit: I also posted the question on StackOverflow (since there is a large audience) with some edits/ameliorations on the question. Here is the link:
I still leave the nonedited question below for further references but you should go see the one on StackOverflow. Also, you can delete this post if it is better to only have the question in one place.
Hello,
I am currently reading about functional data structures and algorithms and I tried different quick sort implementations. I noticed, however, that there is a lot of variation in their performances.
Here are some selected ones that I want to discuss (all the programs have been compiled for test, and the list to be sorted was the (worst case already sorted (which implies that the time taken is in O(n^2))) [1 . . 20000] list):

qs1 [] = [] qs1 (pivot : rest) = qs1 lower ++ [pivot] ++ qs1 upper where lower = [ x  x < rest, x <= pivot ] upper = [ x  x < rest, x > pivot ]
This is the classic sort we see when we learn the language. Here, the
rest
list is traversed twice to filter first the elements less or equal than the pivot, then the elements greater than the pivot. The time taken was 6.45 sec. 
qs2 [] = [] qs2 (pivot : rest) = qs2 lower ++ [pivot] ++ qs2 upper where (lower, upper) = foldr (\x (l, u) > if x <= pivot then (x : l, u) else (l, x : u)) ([], []) rest
I was surprised by this one. It is the same as above, except that it only traverses the
rest
list once. It did worse than the first with a total of 8.11 sec. I tried thepartition
function from Data.List library, but it only did worse (much worse), with a catastrophic 10.99 sec. I checked the implementation, and yet it is almost the same as my lambda function, although it relies on an utility function:partition :: (a > Bool) > [a] > ([a],[a]) partition p xs = foldr (select p) ([],[]) xs select :: (a > Bool) > a > ([a], [a]) > ([a], [a]) select p x ~(ts,fs)  p x = (x:ts,fs)  otherwise = (ts, x:fs)
(Taken from https://hackage.haskell.org/package/base4.14.0.0/docs/src/Data.OldList.html#partition).

qs3 [] s = s qs3 [x ] s = x : s qs3 (pivot : rest) s = qs3 lower (pivot : (qs3 upper s)) where lower = [ x  x < rest, x <= pivot ] upper = [ x  x < rest, x > pivot ]
In this one, there are 2 novelties. Fist, the removal of append
++
in favour of cons:
. Second, it is a tail recursion function, so it should (in principle) be faster. However, it is merely better than the first one, with a time of 6.42 sec. In fact, due to variation from an execution to the other, it probably is the same. 
qs4 [] s = s qs4 [x ] s = x : s qs4 (pivot : rest) s = qs4 lower (pivot : (qs4 upper s)) where (lower, upper) = foldr (\x (l, u) > if x <= pivot then (x : l, u) else (l, x : u)) ([], []) rest
Again, this is the same as above, except that I replaced the 2 list traversal by a
foldr
, and again, the time taken is increased: 8.02 sec. 
split pivot [] lower upper s = qs5 lower (pivot : qs5 upper s) split pivot (h : t) lower upper s  h <= pivot = split pivot t (h : lower) upper s  otherwise = split pivot t lower (h : upper) s qs5 [] s = s qs5 (pivot : rest) s = split pivot rest [] [] s
This one is the fastest. I recorded a stunning time of 2.82 seconds, which is almost 4 times faster than the āslowestā. It bounces between 2 functions that calls each other.
split
is a recursive function that separates the elements of therest
list sent by theqs5
functions, which it returns to when it is done partitioning.
Conclusion: What the hell is going on here? I am puzzled by all the hidden subtleties done during the compilation that makes my expectations concerning the performance of the program go wrong. I humbly thank anyone who could help me untangle the pieces of this jigsaw by pointing out what is going on under the hood.