Doing `tail -f` in Haskell

A Haskell web application I’m working on needs to stream various log files to the browser (see demo below). I’ve been successfully using tail -f together with stm-based broadcasting to subscribers, and figured it was worth releasing as a new library:

Demo in an htmx app

Future Improvements

  • Switch from tail -f to reading the log file natively in Haskell
    • I did try this to begin with, but it led to blocking elsewhere and I haven’t had the opportunity to thoroughly investigate it yet (but I suspect it is related to file handles).
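For reference, here is a minimal sketch of what a native reader could look like. It is not the library's code and it is not claimed to fix the blocking mentioned above: it simply polls, reopening the file on each poll and reading whatever lies past the last known byte offset. The names readAppended and followFile, and the polling delay, are made up for illustration.

```haskell
import Control.Concurrent (threadDelay)
import qualified Data.ByteString as BS
import System.IO

-- | Read whatever was appended to the file since the given byte offset.
-- Returns the new bytes and the offset to resume from.
-- (Hypothetical helper, not part of any library.)
readAppended :: FilePath -> Integer -> IO (BS.ByteString, Integer)
readAppended path offset =
  withBinaryFile path ReadMode $ \h -> do
    size <- hFileSize h
    -- If the file shrank (truncated or rotated), start over from the top.
    let start = if size < offset then 0 else offset
    hSeek h AbsoluteSeek start
    bytes <- BS.hGet h (fromIntegral (size - start))
    pure (bytes, start + fromIntegral (BS.length bytes))

-- | Poll the file forever, passing each batch of new bytes to the handler.
-- Starts at offset 0, so the existing contents are delivered first.
followFile :: Int -> FilePath -> (BS.ByteString -> IO ()) -> IO a
followFile delayMicros path onBytes = go 0
  where
    go offset = do
      (bytes, offset') <- readAppended path offset
      if BS.null bytes
        then threadDelay delayMicros  -- nothing new yet, wait and retry
        else onBytes bytes
      go offset'
```

Something like `followFile 200000 "app.log" BS.putStr` would echo the file as it grows. A batch can end in the middle of a line, so a real consumer would still need to split on newlines and carry over the trailing partial line.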
12 Likes

There’s also the tailfile-hinotify package, which may or may not be bitrotten. It could have a new version that ditched some unnecessary dependencies on streaming libraries and added some new functionality, like tailing several files at once.

1 Like

A simple tail-forever with multiple consumers can be written like this with streamly-process (a two-consumer example: stdout and an output file):

{-# LANGUAGE QuasiQuotes #-}

import Data.Function ((&))
import Streamly.Unicode.String (str)
import qualified Streamly.Console.Stdio as Stdio
import qualified Streamly.Data.Stream as Stream
import qualified Streamly.Data.Fold as Fold
import qualified Streamly.FileSystem.File as File
import qualified Streamly.System.Command as Command

main = do
       Command.toChunks [str|tail -f "input.txt"|]
     & Stream.fold (Fold.tee Stdio.writeChunks (File.writeChunks "output.txt"))

You could potentially have concurrent consumers as well using concurrent folds.

4 Likes

@harendra I’ve been meaning to check out streamly. How would that code look if you were to support ‘recent N logs’ for new subscribers? That is, if a new subscriber joins while the log file is being updated, they get the previous N lines along with the new lines.

System.Tail maintains a ring buffer for this purpose.
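To make the requirement concrete, here is a rough sketch of the general idea using stm and containers. It is not System.Tail's actual implementation; the Broadcaster type and the newBroadcaster, publish and subscribe names are hypothetical, and a Data.Sequence stands in for a real ring buffer.

```haskell
import Control.Concurrent.STM
import Data.Foldable (toList)
import Data.Sequence (Seq, (|>))
import qualified Data.Sequence as Seq

-- | Hypothetical broadcaster: remembers the last N items and fans out new ones.
data Broadcaster a = Broadcaster
  { capacity :: Int            -- how many recent items to keep
  , recent   :: TVar (Seq a)   -- the "ring buffer" of recent items
  , channel  :: TChan a        -- write-only broadcast channel
  }

newBroadcaster :: Int -> IO (Broadcaster a)
newBroadcaster n =
  Broadcaster n <$> newTVarIO Seq.empty <*> newBroadcastTChanIO

-- | Record an item in the recent window and send it to all subscribers.
publish :: Broadcaster a -> a -> STM ()
publish b x = do
  modifyTVar' (recent b) $ \s ->
    let grown = s |> x
    in Seq.drop (Seq.length grown - capacity b) grown  -- keep the last N
  writeTChan (channel b) x

-- | Return the backlog plus a channel that yields everything published
-- afterwards. Both are obtained in one transaction, so no item is
-- missed or delivered twice.
subscribe :: Broadcaster a -> STM ([a], TChan a)
subscribe b = do
  backlog <- readTVar (recent b)
  live <- dupTChan (channel b)
  pure (toList backlog, live)
```

A subscriber would call `atomically (subscribe b)`, send the backlog to the browser, and then loop on `atomically (readTChan live)`.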

I’m also curious how streamly would look when used in the larger application (we heavily use stm in process supervision as well), but I guess that’s a topic for a separate thread.

It won’t be as simple as the above, but it isn’t too complicated either. I’ll refer to the master branch API here, which is to be released in a few days’ time. Streamly has a RingArray type. A sketch of the program: the tail output is streamed and then scanned so that each line is also written to the ring array, and the output stream carries both the ring array (old lines) and the new lines as an array. From there, there are two options: flatten the RingArray and the new lines into the output stream when a new consumer joins, or output both and let the consumer choose to use the old and/or new lines at any time.
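The shape of that scan can be illustrated without Streamly at all. The sketch below deliberately uses plain lists and Data.Sequence in place of Streamly streams and RingArray, so none of it is Streamly API; withHistory and flatten are hypothetical names.

```haskell
import Data.Foldable (toList)
import Data.List (mapAccumL)
import Data.Sequence (Seq, (|>))
import qualified Data.Sequence as Seq

-- | Pair every incoming line with the window of up to n lines that
-- preceded it. This corresponds to the second option: emit both the old
-- and the new lines and let the consumer decide.
withHistory :: Int -> [a] -> [(Seq a, a)]
withHistory n = snd . mapAccumL step Seq.empty
  where
    step window x = (push window x, (window, x))
    push window x =
      let grown = window |> x
      in Seq.drop (Seq.length grown - n) grown  -- keep the last n

-- | What a consumer joining at the head of the paired stream would see
-- under the first option: the backlog once, then only the new lines.
flatten :: [(Seq a, a)] -> [a]
flatten [] = []
flatten pairs@((window, _) : _) = toList window ++ map snd pairs
```

A consumer that joins after three lines of `withHistory 2` output would apply `flatten` to the remainder of the stream and see the two most recent old lines followed by everything new.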

Streamly has facilities to dynamically add new concurrent scans or folds to an existing stream, which is essentially how new consumers subscribe. See the Streamly.Data.Scanl.Prelude and Streamly.Data.Fold.Prelude modules on the master branch.

2 Likes