[RFC] Infinite lists

(I’ve been sitting on this for a week or so, hoping to write proper documentation, but this does not seem happening, so) I’d like to seek some community feedback on a small library for infinite lists:

https://hackage.haskell.org/package/infinite-list-0.1/candidate

What do you think about goals and design decisions?

7 Likes

I’m curious what the use cases are for certainly-infinite lists.

1 Like

This work has been motivated by our discussion at Add {-# WARNING #-} to Data.List.{head,tail} · Issue #87 · haskell/core-libraries-committee · GitHub. To avoid partial Data.List.head in

head $ filter (`notElem` hashes) $ map showt [0::Int ..]

one can use Stream package, where S.head is total:

import qualified Data.Stream as S
S.head $ S.filter (`notElem` hashes) $ S.map showt $ S.iterate (+1) (0 :: Int)

This has two disadvantages:

  • S.iterate (+1) (0 :: Int) isn’t neat. My package offers (0...) syntax instead.
  • There is no fusion happening, allocating thunks of S.Stream on every step of pipeline. My package provides foldr / build fusion and eliminates all intermediaries.
2 Likes

It’s hard to look at the code on mobile, because:

  1. the github repo doesn’t exist
  2. the Internal module where the type is defined is not exposed (thus can’t see it on hackage)
1 Like

The API looks great to me and I’m happy to see such package existing! :tada:
The (from...) syntax is a clever usage of postfix operators :sunglasses:

I have a few ideas and a few potential feature requests. But that can be discussed separately. This is already a huge amount of work for the first release! The Data.List API is huge and just walking through the entire API and extracting functions that work for infinite lists is a great amount of deal :exploding_head:

I personally worry sometimes about functions that can work with infinite lists because they can accidentally hang. For example, the library provides the following function:

find :: (a -> Bool) -> Infinite a -> a 

Which is semantically correct, but practically someone accidentally can write the following code which can hang:

ghci> import Data.Time.Clock.POSIX
ghci> now <- getPOSIXTime 
ghci> find (> now + 1) (now...)  -- or something in this spirit

which is wrong for other reasons, but still, would be nice to avoid introducing new footguns.

I’m thinking about having a functions that accept a limit: how many entries do you look at before stopping? But for now, a warning in the documentation should be enough :relieved:


As for the original code like this:

head $ filter (`notElem` hashes) $ map showt [0::Int ..]

Sorry, this is just bonkers code to me and simply suboptimal. If someone passes [0::Int ..] as hashes, this function hangs. Because I’m afraid of dealing with infinite lists, I would simply write this code differently.

Though, I understand the sentiment of having a drop-in replacement and sometimes it would be nice to work with infinite streams so I’d love to have package like infinite-list in the ecosystem :relieved:
It’s also very interesting from the perspective of learning and understanding laziness in Haskell! Not to mention, that it’s really curious to look at the code to learn something new :eyes:

2 Likes

The Data.List.notElem function will always get stuck if you pass in an infinite list (and the element is not in the list). If you are really worried about that, then I’d just change it to Data.Vector.notElem which is guaranteed to get a finite vector as input. In fact, we can’t really know if the notElem in the given code was from Prelude or Data.Vector.

@hasufell there is nothing fancy inside of Data.List.Infinite.Internal, just

data Infinite a = a :< Infinite a

@ChShersh thanks for the review! I’m happy to hear about potential feature requests; it’s easier for me to make everything in one go.

I’m happy to hear about potential feature requests; it’s easier for me to make everything in one go.

I have only two feature suggestions at the moment:

  1. Have an interleave function to interleave two infinite lists, like this:

    interleave :: Infinite a -> Infinite a -> Infinite a
    interleave (x :< xs) (y :< ys) = x :< y :< interleave xs ys
    

    An alternative implementation could be more polymorphic:

    interleave :: Infinite a -> Infinite b -> Infinite (Either a b)
    interleave (x :< xs) (y :< ys) = Left x :< Right y :< interleave xs ys
    

    I’m happy with everything as I can easily implement one through another but interleaving infinite lists is something I would want quite often.

  2. I wonder what is the trade-off for using Word instead of Natural for indexing functions like this one and could this trade-off be documented?

    (!!) :: Infinite a -> Word -> a
    

    I understand that returning a number higher that maxBound :: Word is impossible practically but all other functions that work with lists use Int and not Word so it’ll introduce an extra conversion layer.

  3. The behaviour of something like ((maxBound :: Int)...) is quite unfortunate. Would it be possible to implement some functions that cycle over the entire range of numbers? I imagine having a function like:

    cycleFromAscending  :: (Enum a, Bounded a) => a -> Infinite a
    cycleFromDescending :: (Enum a, Bounded a) => a -> Infinite a
    

    that properly handles the maxBound and minBound cases.

That’s all from me :slightly_smiling_face: Otherwise, it looks great! And I also really appreciate that this package is lightweight! Can’t wait to find a usage for it in my projects :slightly_smiling_face:

I would perhaps document the Applicative instance saying that it’s zip-like, unlike the default one for regular lists.

The Semigroup one is also non-obvious. What if we had a “pointwise mappend” instead? Monoid m => Monoid (Infinite m).

@danidiaz that’s interesting points, thanks. I think I’d rather remove instance Semigroup altogether: the current one with (<>) = const is quite pointless, and instance Applicative already provides zip-like semantics.

@danidiaz thinking of ZipList-like behaviour of instance Applicative Infinite, I realised that on contrary to ZipList we can have a lawful instance Monad Infinite, unlocking list comprehensions syntax under {-# LANGUAGE MonadComprehensions #-} pragma.

2 Likes

@ChShersh thanks again!

  1. Done, excellent point, I always regret that there is no interleave in Data.List.

  2. Using Word instead of Natural is a better trade-off performance-wise. Conversion between Int and Word is almost free, because on hardware level this is a no-op. And recursive worker with a strict Word gets unpacked to Word#, so there are no additional heap allocations. Both of this does not hold for Natural: conversions to and from Int require some work, and it cannot easily be unpacked to an unlifted and unboxed value. Given that from practical point of view indices over 2^64 will just hang almost infinitely, I think Word is a better choice in this case.

  3. That’s a good point, but I’m not sure about extending this API. Morally (...) is purposed for cases when a list is “practically” infinite: [(0::Int)..] or [(42::Int)..] are fine, even while in theory both a finite, but [maxBound..] is really an edge case, for which I decided to cycle instead of throwing an error.

    As a consumer I’d probably prefer to use an explicit iterate call instead of hypothetical cycleFrom{A,De}scending, making semantics transparent without refering to documentation. Say, iterate (\n -> if n == maxBound then minBound else n + 1) from.

    I’d love to have (-..) upto = [upto, upto - 1 ..], but upto - 1 explodes for upto = 0 :: Natural, and adding Bounded constraint to be able to compare upto == minBound prohibits (-..) on Integer.

    One possibility is to add

     (....) :: (a, a) -> Infinite a
     (....) (from, then) = unsafeCycle $ enumFromThen from then
    

    so that (1,3).... generates an infinite list equal to [1,3..], and (maxBound, maxBound-1).... gives a descending (practically) infinite sequence. Isn’t this too much of syntactic sugar?..

1 Like

The monad instance bind operator results in:

xs >>= f
  = f (xs !! 0) !! 0
 :< f (xs !! 1) !! 1
 ...

Why not fill the front corner with a triangle:

xs >>= f
  = f (xs !! 0) !! 0 -- sum 0
 :< f (xs !! 0) !! 1 -- sum 1
 :< f (xs !! 1) !! 0 -- sum 1
 :< f (xs !! 0) !! 2 -- sum 2
 :< f (xs !! 1) !! 1 -- sum 2
 :< f (xs !! 2) !! 0 -- sum 2
 :< f (xs !! 0) !! 3 -- sum 3
 ...

In this way every element in the result will eventually show up.

@1chb then the Applicative and Monad instances wouldn’t match up. E.g. Control.Monad.ap = (<*>) would fail to hold.

1 Like

But couldn’t the Applicative instance be modified in the same way?

@1chb does your instance Monad Infinite satisfy the associativity law?

Yeah, I guess you can of course always get an applicative from a monad. A better argument would be that the law fmap f x = pure f <*> x would fail to hold (if my mental math is correct).

@Bodigrim I guess not:

xy z xyz s    x yz xyz s
00 0 000 0 == 0 00 000 0
01 1 001 1 == 1 01 001 1
10 2 010 1 /= 2 10 100 1
02 3 002 2 /= 3 02 010 1
11 4 011 2 /= 4 11 101 2
.... 100 1 /= .... 200 2
.... 003 3 /= .... 002 2
.... 012 3 /= .... 110 2
.... 101 2 /= .... 201 3
.... 020 2 /= .... 300 3
.... 004 4 /= .... 011 2
.... 013 4 /= .... 102 3
.... 102 3 /= .... 210 3
.... 021 3 /= .... 301 4
.... 110 2 /= .... 400 4

I am not so fond of the name foldr1 in this library. I personally would prefer it be called foldr, because it is the standard fold for this data structure.

I guess this name was chosen because you don’t have to give a base case just like foldr1 on lists, but for me the 1 suffix also signifies partiality and the restricted type a -> a -> a for the first argument. Neither is the case for this fold.

(I mean, this fold over infinite lists is also partial, but it at least has some partial applications (no pun intended) which result in functions that are not partial.)

With regards to partiality, it’s time to change your mental model, the total foldr1 :: Foldable1 t => (a -> a -> a) -> t a -> a is coming soon in GHC 9.6.

I don’t really know what is worse: foldr :: (a -> b -> b) -> Infinite a -> b, reducing the number of arguments, or foldr1 :: (a -> b -> b) -> Infinite a -> b, which is a strictly more polymorphic than Foldable.foldr1. I chose the second option, but now I see that others find it perplexing. I’m certainly not up to the first one: my poor brain will look for a base argument of foldr forever. Maybe a completely different name is needed? But foldrInfinite is too mouthful… I’m open to suggestions.

1 Like