Why isn’t map and friends concurrent by default?

Perhaps a silly question, but I was wondering why, given Haskell’s lazy nature, runtime system and other nice features for concurrency, functions like map or filter and others aren’t concurrent by default?

:slight_smile:

4 Likes

Hm, with map,
map f (x : xs) = f x : map f xs
so with laziness, doesn’t f x need to be evaluated first when the lists is used?

Since map, filter et al are just functions:

map :: (a -> b) -> [a] -> [b]
filter :: (a -> Bool) -> [a] -> [a]

(therefore no outside interactions) I’m assuming by “concurrent” you meant “parallel” - breaking up one large task into smaller tasks that can run simultaneously and separately.

For list functions, one immediate problem is working out how to break up the list - unlike e.g. a balanced tree, it’s “middle” isn’t readily accessible. Then there’s the matter of ordering - did you want:

map Data.Char.toUpper "parallel" == "PARALLEL"

or:

map Data.Char.toUpper "parallel" == "PLALRELL"

…keeping the ordering from the input list/s is a frequent requirement:

\ xs -> zip xs (map f xs)     -- these should produce the same
\ xs -> [(x, f x) | x <- xs]  -- result for the same argument

Being speculative, parallelism (particularly for languages with non-strict semantics like Haskell) presents the risk of doing unnecessary work unless it’s used with care:

take 10 $ map (2^) [0..]  -- this could take a while...

These are just a few of the reasons why list functions (mostly the generic ones) are sequential by default in Haskell.

1 Like

Right, this answers my question :slight_smile:

I think I must have been dreaming when writing this, and forgot completely how concurrency, thread spawning etc is IO bound

I was wondering many a things about automatic parallelism in compilation. I’ll keep thinking :slight_smile:

Correct me if I’m wrong: I don’t think f x would have to be evaluated first, iirc (:) isn’t strict on the first argument.

You are not alone:

For Haskell, that is correct. This provides some more information about that choice:

https://help.luddy.indiana.edu/techreports/TRNNN.cgi?trnum=TR44

(…no actual paper, unfortunately.)


Rewriting map as:

map f []         = []
map f ((:) x xs) = (:) (f x) (map f xs)

helps to make it clear that (f x), or for that matter (map f xs) don’t need to be evaluated before the result (:) (f x) (map f xs) is returned by map:

1 Like

I was wondering many a things about automatic parallelism in compilation. I’ll keep thinking :slight_smile:

That was the idea behind “parallel graph reduction”. Look up the (old) papers :slight_smile:

2 Likes

Great stuff! Exactly the sort of thing I was looking for. I’ll read on

You also might want to look for Data Parallel Haskell (DPH), see this Haskell wiki page for a start. Don’t get your hopes up too much though: the latest real activity on that page was in 2012.

Oh, right. f x portion itself does not need to be evaluated.
Instead, the list needs to retain the order, which might require a few thunks to designate the structure I guess.

Luckily, IU does have a scanned PDF up here: https://legacy.cs.indiana.edu/ftp/techreports/TR44.pdf

…actually, there are two: https://legacy.cs.indiana.edu/ftp/techreports/old_TR44.pdf - it’s faint, but it’s page-for-page. Being scanned directly, there’s no way a quick web search was going to find either of them: thank you.

1 Like

By the way, if you want a parallel version of map, you can use the parMap function in the parallel library.

https://hackage.haskell.org/package/parallel-3.2.2.0/docs/Control-Parallel-Strategies.html#v:parMap

1 Like