Perhaps a silly question, but I was wondering why, given Haskell’s lazy nature, runtime system and other nice features for concurrency, functions like map
or filter
and others aren’t concurrent by default?
Hm, with map
,
map f (x : xs) = f x : map f xs
so with laziness, doesn’t f x
need to be evaluated first when the lists is used?
Since map
, filter
et al are just functions:
map :: (a -> b) -> [a] -> [b]
filter :: (a -> Bool) -> [a] -> [a]
(therefore no outside interactions) I’m assuming by “concurrent” you meant “parallel” - breaking up one large task into smaller tasks that can run simultaneously and separately.
For list functions, one immediate problem is working out how to break up the list - unlike e.g. a balanced tree, it’s “middle” isn’t readily accessible. Then there’s the matter of ordering - did you want:
map Data.Char.toUpper "parallel" == "PARALLEL"
or:
map Data.Char.toUpper "parallel" == "PLALRELL"
…keeping the ordering from the input list/s is a frequent requirement:
\ xs -> zip xs (map f xs) -- these should produce the same
\ xs -> [(x, f x) | x <- xs] -- result for the same argument
Being speculative, parallelism (particularly for languages with non-strict semantics like Haskell) presents the risk of doing unnecessary work unless it’s used with care:
take 10 $ map (2^) [0..] -- this could take a while...
These are just a few of the reasons why list functions (mostly the generic ones) are sequential by default in Haskell.
Right, this answers my question
I think I must have been dreaming when writing this, and forgot completely how concurrency, thread spawning etc is IO bound
I was wondering many a things about automatic parallelism in compilation. I’ll keep thinking
Correct me if I’m wrong: I don’t think f x
would have to be evaluated first, iirc (:)
isn’t strict on the first argument.
You are not alone:
For Haskell, that is correct. This provides some more information about that choice:
https://help.luddy.indiana.edu/techreports/TRNNN.cgi?trnum=TR44
(…no actual paper, unfortunately.)
Rewriting map
as:
map f [] = []
map f ((:) x xs) = (:) (f x) (map f xs)
helps to make it clear that (f x)
, or for that matter (map f xs)
don’t need to be evaluated before the result (:) (f x) (map f xs)
is returned by map
:
I was wondering many a things about automatic parallelism in compilation. I’ll keep thinking
That was the idea behind “parallel graph reduction”. Look up the (old) papers
Great stuff! Exactly the sort of thing I was looking for. I’ll read on
You also might want to look for Data Parallel Haskell (DPH), see this Haskell wiki page for a start. Don’t get your hopes up too much though: the latest real activity on that page was in 2012.
Oh, right. f x
portion itself does not need to be evaluated.
Instead, the list needs to retain the order, which might require a few thunks to designate the structure I guess.
…actually, there are two: https://legacy.cs.indiana.edu/ftp/techreports/old_TR44.pdf - it’s faint, but it’s page-for-page. Being scanned directly, there’s no way a quick web search was going to find either of them: thank you.
By the way, if you want a parallel version of map
, you can use the parMap
function in the parallel
library.
https://hackage.haskell.org/package/parallel-3.2.2.0/docs/Control-Parallel-Strategies.html#v:parMap