Tips on code sample?

Don’t know if asking for advice on the below code is too big an ask or not but thought I would ask. Even a tip from a quick glance or comment on one function would be appreciated.

This is a modified challenge from a book. The book uses semigroup and monoid and I would guess that is the best way but as a challenge to myself I wrote it without. The book also tended to pull values out of their types and put them back in. I read it is best to not do that so I tried to avoid.


{-
    Find matching dates in 2 timeseries and show difference in values
    Uses `Int` as proxy for date.
    Usage:
    Load into GHCi and then
    > printIt (findDiffs file2 file3)
    (11,201.6), (11,201.6) | 0.0
    (12,201.5), (12,199.5) | 2.0
    (13,201.5), (13,197.5) | 4.0
    (14,203.5), (14,197.5) | 6.0
-}

{-# OPTIONS_GHC -Wno-unrecognised-pragmas #-}
{-# HLINT ignore "Use newtype instead of data" #-}
import Data.List (intersectBy)

file1 :: [(Int,Double)]
file1 = [ (1,200.1), (2,199.5), (3,199.4)
        , (4,198.9), (5,199.0), (6,200.2)
        , (9,200.3),(10,201.2), (12, 202.9)]

file2 :: [(Int,Double)]
file2 = [(11,201.6), (12,201.5), (13, 201.5)
        ,(14, 203.5), (15, 204.9), (16, 207.1)
        ,(18, 210.5), (20, 208.8)]

file3 :: [(Int,Double)]
file3 = [(10, 201.2),(11, 201.6),(12, 199.5)
        ,(13, 197.5),(14, 197.5),(17, 210.5)
        ,(24, 215.1),(25, 218.7)]

file4 :: [(Int,Double)]
file4 = [(26, 219.8), (27,220.5),(28,223.8)
        ,(29, 222.8), (30, 223.8),(31, 221.7)
        ,(32, 222.3), (33, 220.8),(34, 219.4)
        ,(35, 220.1),(36, 220.6)]

data Time a = Time (Int,Double)
data Times a = Times [Time a]

fileToTimes :: [(Int,Double)] -> Times a
fileToTimes file = Times (map Time file)

instance Show (Time a) where
  show (Time (t,v)) = "(" ++ show t ++ ", " ++ show v ++ ")"

instance Show (Times a) where
  show (Times time) = concatMap (\(Time t) -> show t) time

compareDates :: Time a1 -> Time a2 -> Ordering
compareDates (Time t1) (Time t2)
        | date1 > date2 = GT
        | date1 < date2 = LT
        | otherwise = EQ
  where date1 = fst t1
        date2 = fst t2

areDatesEqual :: Time a1 -> Time a2 -> Bool
areDatesEqual time1 time2 = compareDates time1 time2 == EQ

makeDiffs :: Times a -> Times a -> [(Time a, Time a)]
makeDiffs (Times time1) (Times time2) = zippedSets
  where matchedDates = intersectBy areDatesEqual time1 time2
        dates = map (\(Time t) -> fst t) matchedDates
        set1 = filter (\(Time t) -> fst t `elem` dates) time1
        set2 = filter (\(Time t) -> fst t `elem` dates) time2
        zippedSets = zip set1 set2

-- Creates one row such as: "(1,2.5), (2,3.5) | -1.0"
makeLine :: Time a -> Time a -> String
makeLine (Time time1) (Time time2) = strFinal
  where strTime1 = show time1
        strTime2 = show time2
        diff = snd time1 - snd time2
        strFinal = concat [strTime1, ", ", strTime2, " | ", show diff, "\n"]

printIt :: [String] -> IO ()
printIt x = putStrLn (concat x)

findDiffs :: [(Int, Double)] -> [(Int, Double)] -> [String]
findDiffs f1 f2 = lines
  where series1 = fileToTimes f1
        series2 = fileToTimes f2
        diffs = makeDiffs series1 series2
        lines = map (\diff -> makeLine (fst diff) (snd diff)) diffs

Some miscellaneous notes:

data Times a = Times [Time a]

I really do not fancy this, in real life you would need to rederive Functor, etc. instances without gaining any additional clarity from the types. [Time a] in signatures is fine.

{-# HLINT ignore "Use newtype instead of data" #-}

“Use newtype” is actually a good hint, don’t see the need to suppress it.

compareDates :: Time a1 -> Time a2 -> Ordering
areDatesEqual :: Time a1 -> Time a2 -> Bool

These are fine but then why not writing Eq and Ord instances for Time? Then you can avoid intersectBy. (edit: Eq instance comparing dates is not useful)

dates = map (\(Time t) -> fst t) matchedDates

If you declared the accessor, like

data Time a = Time { tClock :: Int,
                     tValue :: a}

you could have used those, otherwise you could have pattern matched on t (i.e. \(Time (t,_) -> t).

You kee repeating (\(Time t) -> fst t, means that could have been abstracted.

In makeDiffs I suspect you could have used (for better clarity) intersectBy :: (a -> a -> Bool) -> [a] -> [a] -> [a] or similar.

lines shadows a common function from Prelude.

addendum: Time a has a useless type parameter.

2 Likes
import qualified Data.Function as F
import qualified Data.List as L

-------------------------------------------------------------------------------
-- Input

file2 :: [(Int,Double)]
file2 = [(11,201.6), (12,201.5), (13, 201.5)
        ,(14, 203.5), (15, 204.9), (16, 207.1)
        ,(18, 210.5), (20, 208.8)]

file3 :: [(Int,Double)]
file3 = [(10, 201.2),(11, 201.6),(12, 199.5)
        ,(13, 197.5),(14, 197.5),(17, 210.5)
        ,(24, 215.1),(25, 218.7)]

-------------------------------------------------------------------------------
-- Types

data Time a = Time { tClock :: Int,
                     tVal :: a }
    deriving (Eq, Ord)

ptimes :: [(Int, Double)] -> [Time Double]
ptimes f = map (uncurry Time) f

instance Show a => Show (Time a) where
  show t = "(" ++ show (tClock t) ++ ", " ++ show (tVal t) ++ ")"

instance Functor Time where
    fmap f (Time c v) = Time c (f v)

-------------------------------------------------------------------------------
-- Program

main :: IO ()
main = do
        let ds = diffs (ptimes file2) (ptimes file3)
            ss = map pdiff ds
        putStrLn (unlines ss)

-- ordered list of `Time` having the sale tClock
diffs :: [Time Double] -> [Time Double] -> [Time (Double, Double)]
diffs ts1 ts2 =
        let ts = L.sort (ts1 ++ ts2)
            gs = L.groupBy (F.on (==) tClock) ts
            fs = filter ((==2) . length) gs
        in map df fs
    where
          df :: [Time Double] -> Time (Double, Double)
          df [t1, t2] = Time (tClock t1) (tVal t2, tVal t1)

pdiff :: Time (Double, Double) -> String
pdiff td = show (fmap fst td) ++ ", " ++ show (fmap snd td) ++
             " | " ++ show (tupdiff $ tVal td)
    where
          tupdiff (i1, i2) = i1 - i2

My stab at it, though I suspect there is a clearer way to approach this.

1 Like

To basically repeat a point from above, you could have:

data Time a = Time (Int,Double)

or better

newtype Time = Time (Int, Double)

The a parameter is not being used.

Also, you have a Show instance for Times, but I’d be inclined to comment that out, and just defined

type Times = [Time a]

If you do this, Haskell will automatically derive a Show instance, because in general Show a => Show [a].

1 Like

I’ve made two passes at simplifying the example. One more broadly to remove some redundancies. The other is more aggressive about inlining definitions, deviating from your initial style to a bit of a more dense approach around tuples.

I’ve added my own comments and removed the original ones to avoid confusion.

-- removing type parameter `a`
data Time = Time (Int,Double)
-- alias Times 
type Times = [Time]

fileToTimes :: [(Int,Double)] -> Times
                 -- remove Times constructor
fileToTimes file = map Time file

instance Show Time where
  show (Time (t,v)) = "(" ++ show t ++ ", " ++ show v ++ ")"

-- Show Times is equivalent with the previous Show Times definition; and no
-- longer necessary.


-- deriving `Eq` for Time, thus `areDatesEqual` and `comparaDates`
-- functions are no longer needed. (Ord was only used to check equality)
instance Eq Time where
  (Time (date1,_)) == (Time (date2,_)) = date1 == date2


makeDiffs :: Times -> Times -> [(Time, Time)]
makeDiffs time1 time2 = zippedSets
                       -- replaced with `==`
  where matchedDates = intersectBy (==) time1 time2
              -- use more pattern matching to get rid of `fst`
        dates = map (\(Time (date,_)) -> date) matchedDates
        set1 = filter (\(Time (date,_)) -> date `elem` dates) time1
        set2 = filter (\(Time (date,_)) -> date `elem` dates) time2
        zippedSets = zip set1 set2


makeLine :: Time -> Time -> String
         -- previously it wasn't using the Show Time instance because
         -- time1/time2 were pointing at the tuple (Int,Double) directly
makeLine time1@(Time (_,snd1)) time2@(Time (_,snd2)) = strFinal
  where strTime1 = show time1
        strTime2 = show time2
             -- got rid of snd calls
        diff = snd1 - snd2
        strFinal = concat [strTime1, ", ", strTime2, " | ", show diff, "\n"]

printIt :: [String] -> IO ()
printIt x = putStrLn (concat x)

findDiffs :: [(Int, Double)] -> [(Int, Double)] -> [String]
findDiffs f1 f2 = lines
        -- inline series1/series2 definitions
  where diffs = makeDiffs (fileToTimes f1) (fileToTimes f2)
                  -- uncurry makeLine == \(a,b) -> makeLine a b
        lines = map (uncurry makeLine) diffs

And the further changed code.

-- alias Time
type Time = (Int,Double)
type Times = [Time]

-- fileToTimes function no longer necessary, equals `id` function
-- when aliases are used

-- No longer defining a Show Time instance

-- No longer defined Eq Time instance


makeDiffs :: Times -> Times -> [(Time, Time)]
                    -- inline `zipSets`
makeDiffs time1 time2 = zip (membersMatchedDates time1) (membersMatchedDates time2)
  where matchedDates = intersectBy (\(date1,_) (date2,_) -> date1 == date2) time1 time2
              -- simply pattern matching. (Time (date,_)) -> (date,_)
                        -- reuse filter operation across both times
                                                  -- inline `dates` definition
        membersMatchedDates time_in = filter (\(date,_) -> date `elem` (map fst matchedDates)) time_in


makeLine :: Time -> Time -> String
              -- simply pattern matching
  -- time1/time2 labels are no longer needed as we unpack the tuples
makeLine (fst1,snd1) (fst2,snd2) = strFinal
  where strTime1 = concat ["(", show fst1, ", ", show snd1, ")"]
        strTime2 = concat ["(", show fst2, ", ", show snd2, ")"]
                                                            -- inline diff calculation
        strFinal = concat [strTime1, ", ", strTime2, " | ", show (snd1 - snd2), "\n"]

printIt :: [String] -> IO ()
printIt x = putStrLn (concat x)

findDiffs :: [(Int, Double)] -> [(Int, Double)] -> [String]
                                 -- inline calls
findDiffs f1 f2 = map (uncurry makeLine) $ makeDiffs f1 f2
            -- with aliases fileToTimes x = id x thus calls removed
1 Like

@f-a:

I’m not familiar with [Time Double] syntax and haven’t been able to discover what it is.

[a] is a list of as; [Time Double] is a list of Time Double; Double is the parameter of Time a (just like you can have Maybe a, Maybe Int, Maybe Char, etc.).

Can I ask you again which book/resource are you using to learn Haskell? Having a solid understanding of algebraic data types and type parameters is very very important. Do you feel you understand why the a in your Time a type is reduntant?

1 Like

I didn’t see why the a in Time a wasn’t used until I went to refresh my mind on functor (since you implemented an instance of it) and saw the text, “parameterized types (types that take another type as an argument)”, which Time doesn’t take another type as a parameter so that explains it (as far as I can tell).

My understanding of algebraic data types and type parameters is far from “firm”. I haven’t learned all of them and I have to keep reviewing them as they haven’t stuck in my mind yet.

As far as resources:

  • Get Programming with Haskell - I’m working through carefully
  • Haskell Programming from First Principles - I got lost at one point and decided to try something else. Hope to go back to it.
  • Haskell in Depth - I was doing this with a reading group but it was a bit over my head so I was putting in crazy amounts of time in order to contribute to the group. More time than I had so I’m saving it for later.
  • Lots of Hoogle

Those are the major ones. I’m anticipating you suggesting I learn algebraic data types better first - do you have a suggestion?

1 Like

Either of GPWH or HPFFP will do fine, and if you need some more material a good post on ADTs is this one by Gints Dreimanis.

Keep learning and keep asking question, you will become proficient in no time!

1 Like

To clarify what [Time Double] is, it’s the type that a value like [Time (3, 5.4), Time (2, 1.2)] would have in your code. Does that make sense?

In general, you can think of a type as the space where a value lives. So “blah” lives in the String space, or equivalently, has type String. ["blah", "blah"] lives in the space of lists of strings, which in Haskell we conventionally write as [String].

You could imagine a convention in which we wrote the type of lists of strings as List(String), but this isn’t the Haskell convention.

It’s worth noting something potentially confusing, by the way. ["blah", "blah"] is a value, but [String] is a type. So square brackets around a value indicate a list containing that value, and square brackets around a type a indicate the type of lists containing as. It’s a bit of a “pun”, because usually the names for types and values are kept separate (like Just being a value and Maybe being a type).

1 Like

Wow, lots of good stuff - thank you all! I’m still working through it.