There is a new release of interval-patterns available.
This library aims to make working with intervals ergonomic. All types of extremum are supported, with special pattern synonyms for fine-grained control without sacrificing conciseness or legibility. For example:
>>> union (2 :||: 5) (5 :<>: 7)
One (2 :|>: 7)
>>> union (2 :||: 4) (5 :<>: 7)
Two (2 :||: 4) (5 :<>: 7)
>>> difference Whole (3 :<>: 4)
Just (Two (Bottom :|-|: Levitate 3) (Levitate 4 :|-|: Top))
>>> symmetricDifference (1 :<>: 4) (2 :||: 5)
Just (Two (1 :<>: 2) (4 :||: 5))
It also correctly implements several monoids made from sets of intervals. Notably, the (computable) Borel sets are expressible, and form a monoid under union. There is also a notion similar to Map (Interval k) v, but overlapping intervals are combined using v’s Semigroup instance, effectively modelling Layers of e.g. paint over a surface area.
Do give it a try if you have problems to solve using intervals.
I wanted to include the intervals in my Adjacency type in order to maintain the invariant that the arguments of the constructor of Adjacency a are in increasing order. This allows implementing interval operations in terms of the adjacency:
difference ::
forall x. (Ord x) => Interval x -> Interval x -> Maybe (OneOrTwo (Interval x))
difference i1 i2 = case adjacency i1 i2 of
Before i _ -> Just (One i)
Meets i _ _ -> Just (One i)
Overlaps i _ _ -> Just (One i)
Starts{} -> Nothing
During{} -> Nothing
Finishes i _ -> Just (One i)
Identical{} -> Nothing
FinishedBy{} -> Nothing
Contains i _ k -> Just (Two i k)
StartedBy _ j -> Just (One j)
OverlappedBy _ _ k -> Just (One k)
MetBy _ _ k -> Just (One k)
After _ j -> Just (One j)
Very nice, especially symmetricDifference. How would you do a convex hull union? Like unionHull (2 :||: 4) (5 :<>: 7) = 2 :|>: 7?
I do this a lot using numhask-space: Numerical spaces. but I cop out with a convex hull union and have often thought of extending towards a cleaner approach like this.
The convex hull is more involved to calculate, but straightforward to use. interval, lower, and upper work with the internals to create the result.
hull :: (Ord x) => Interval x -> Interval x -> Interval x
hull i1 i2 = case adjacency i1 i2 of
Before i j -> interval (lower i) (upper j)
Meets i _ k -> interval (lower i) (upper k)
Overlaps i _ k -> interval (lower i) (upper k)
Starts i j -> interval (lower i) (upper j)
During i _ k -> interval (lower i) (upper k)
Finishes i j -> interval (lower i) (upper j)
Identical i -> i
FinishedBy i j -> interval (lower i) (upper j)
Contains i _ k -> interval (lower i) (upper k)
StartedBy i j -> interval (lower i) (upper j)
OverlappedBy i _ k -> interval (lower i) (upper k)
MetBy i _ k -> interval (lower i) (upper k)
After i j -> interval (lower i) (upper j)
Internals
-- | Get the @('lower', 'upper')@ bounds of an 'Interval'.
bounds :: Interval x -> (SomeBound (Levitated x), SomeBound (Levitated x))
bounds = \case
l :<-->: u -> (SomeBound l, SomeBound u)
l :<--|: u -> (SomeBound l, SomeBound u)
l :|-->: u -> (SomeBound l, SomeBound u)
l :|--|: u -> (SomeBound l, SomeBound u)
lower = fst . bounds
upper = snd . bounds
-- | Given 'SomeBound's, try to make an interval.
interval ::
(Ord x) =>
SomeBound (Levitated x) ->
SomeBound (Levitated x) ->
Interval x
interval (SomeBound b1) (SomeBound b2) = case (b1, b2) of
(Min l, Sup u) -> l :|->: u
(Min l, Max u) -> l :|-|: u
(Inf l, Sup u) -> l :<->: u
(Inf l, Max u) -> l :<-|: u
(Sup u, Min l) -> l :|->: u
(Sup u, Inf l) -> l :<->: u
(Max u, Min l) -> l :|-|: u
(Max u, Inf l) -> l :<-|: u
_ -> error "cannot make an interval with the given bounds"
Congrats! Intervals are an often neglected aspect of many data analysis tasks, and hard to get right due to the many different relationships and cases. I am the author of closed-intervals which neglects the bounds details because in some applications only proper overlap is what matters.
Some questions:
What is the problem that Data.Calendar solves? What is the complexity of creating and querying calendars?
How is an instance like
instance (Ord x) => Ord (Interval x)
supposed to work lawfully, when intervals don’t form a total order? How does Map (Interval x) in Layers behave, then? Are any overlapping intervals equal? Please provide documentation and examples!
Here is a suggestion to structure Adjacency using modal logic, which can be applied to any sets. Given any relation r :: a -> b -> Bool between elements, define relation liftings
forall = flip all -- supremum in the lattice Down Bool
exists = flip any -- infimum in the lattice Down Bool
hoare :: (a -> b -> Bool) -> Set a -> Set b -> Bool
hoare r a b = forall a (\x -> exists b (\y -> r x y))
smyth :: (a -> b -> Bool) -> Set a -> Set b -> Bool
smyth r a b = forall b (\y -> exists a (\x -> r x y))
egliMilner :: (a -> b -> Bool) -> Set a -> Set b -> Bool
egliMilner r a b = smyth r a b && hoare r a b
For example, isSubsetOf is modelled by smyth (==) and Before is modelled by egliMilner (<).
The Hausdorff distance is also a lifting similar in structure to egliMilner, actually subsumes it:
dHaus :: (Ord distance) => (a -> a -> distance) -> Set a -> Set a -> distance
dHaus d a b = max (smyth' a b) (hoare' a b) where
smyth' a b = supremum b (\y -> infimum a (\x -> d x y))
hoare' a b = supremum a (\x -> infimum b (\y -> d x y))
egliMilner r = dHaus (\x y -> Down (r x y))
Thanks for the interest and for sharing closed-intervals!
To answer your questions:
Data.Calendar is meant for performing calculations based on the number of overlapping events are occurring. For example, calculating the share of a client account that is split between different sales representatives over time.
I give the implementation
instance (Ord x) => Ord (Interval x) where
compare :: (Ord x) => Interval x -> Interval x -> Ordering
compare i1 i2 = on compare lower i1 i2 <> on compare upper i1 i2
which is indeed EQ on Identical intervals. This ordering allows sorting in such a way that algorithms can take advantage of the adjacency of subsequent intervals, which I do in the internals of Layers. The nestingsAsc function combines overlapping intervals in such a way that the resulting list is disjoint and also sorted according to this order.
I have indeed defined the function
hausdorff :: (Ord x, Num x) => Interval x -> Interval x -> Maybe x
hausdorff i1 i2 = case adjacency i1 i2 of
Before (_ :---: a) (b :---: _) -> levMaybe $ liftA2 (-) b a
After (_ :---: a) (b :---: _) -> levMaybe $ liftA2 (-) b a
_ -> Just 0
where
levMaybe = foldLevitated Nothing Just Nothing
It seems simpler to me, this way, than translating hoare and smyth into the language of Intervals that this library uses.
Of course there must be a shortcut for intervals over general sets. I only mentioned it because in some situations, users might not wish to distinguish between all constructors of Adjacency, but instead question a modality, such as “Is there a point within x that is less than/greater than all points in y?” that is witnessed by more than one constructor.
So compare i1 i1 == EQ if and only if adjacency i1 i2 == Identical i1?, Which of the other 12 cases of Adjacency count as LT and which as GT, then? Does the Ord instance satisfy reflexivity, transitivity and antisymmetry? I see no tests regarding that.
But you do have to take into account that the behaviour of Data.Map (used in Layers) is undefined if used with a type that has a non-lawful Ord instance. Nasty bugs can arise, such as
inserts unexpectedly dropping elements
look-ups not turning up elements that are present
In closed-intervals I implemented an interval tree, which decorates each internal node with the convex hull of the sub-tree. You might also want to take a look at sortByRight and and NonNestedSeq which is a simple search structure that works correctly given a certain nesting assumption.
I propose all authors of interval packages should get together and see if we can re-factor the Haskell interval ecosystem into thoroughly tested parts that can be used across several packages. For example:
One common package defining interval bounds (inclusive, exclusive, infinity)
One package defining an interval class or a hierarchy thereof. Packages such as intervals or time with essentially monomorphic types implement the class.
One package defining efficient and correct algorithms on top of the interval class, akin to what parser-combinators does for applicative parsers. A performant implementation of interval trees would be nice.
Perhaps a package or classes dedicated solely to time intervals is warranted. For example, I often work with time intervals of statically known length and therefore have a Data.Interval.Time module that is written in the spirit of Data.Fixed. This could become a part of @mchav 's dataframe effort, since data that is annotated with time types in reality often refers to intervals of time. Examples from the wild:
(timestamp, value) might describe that value is valid from timestamp (inclusively) until the next time stamp in the series (exclusively), if it exists, or infinity otherwise.
(timestamp, aggregation) might indicate that the integral/average/maximum/minimum of a time-varying value over an interval (timestamp,timestamp+implicitLength) resp. (timestamp-implicitLength, timestamp) is equal to aggregation. Parsers should therefore decode the timestamp value into a type that reflects the implicit interval extent.
Some types in the time library describe intervals of LocalTime, eg. Month and the associated DayPeriod class.
All three examples result in collections of intervals that are non-properly overlapping, which makes sorting and storing them easier.
The residue each author remains responsible for will then be special data structures for storing collections of intervals and/or algorithms for interval types that have additional properties, such as calendaric events or intervals of numeric values.
A non-lawful Ord instance would probably be useless for Intervals anyway, considering they perform comparisons when using the constructor patterns.
The ordering of Interval x is in fact the lexicographic ordering on pairs of x, with the addition that Min is less than Inf and Sup is less than Max (c.f. compareBounds). Testing this property to follow.
Good point: The interval-using community should discuss what uses there are for “non-normalized” intervals, as you call them, and how libraries should treat them. Guarding every function against lower x > upper x can lead to complicated interfaces, silent bugs or run-time crashes, depending on the choice of the library author.
In what circumstances is it suitable to keep computing with such an interval and what operations still turn up sensible results? For example, concerning overlap queries any such interval could be regarded as either the empty set (see Python ranges) or as the disjoint union of two rays. It seems that @mixphix calculates
So (4 :||: 0) and (0 :||: 4) are treated identical. But this is only because the pattern :||:swaps bounds if they are not in order. I tried to mentally trace the execution in the source code of adjacency for the un-swapped bounds and determined the (LT,GT) case to match, because
It might be worth mentioning in the documentation that all exported patterns will swap bounds if necessary.
Can you please explain what the open-open constructor pattern :<>: does in case the bounds are identical? The source reads almost as if 0 :<>: 0 yields a closed singleton interval, while I would expect that
That’s correct, if the lower and upper bounds are equal then each constructor will emit a point interval. There is no such thing as an “empty” interval, only an empty Borel set. I will update the documentation to include these points of interest.