(Not) the Haskell Foundation DevOps Weekly Log, 2022-06-03

Hello again! Unfortunately this post is a stub, since I was delirious with the flu for most of this week.

Not all was lost. My MR for capturing eventlog data is now ready for merge, and I started looking at the code that populates the performance metric database. Next week, I’ll be extending that code to process the new eventlog data. Once that pipeline is done, I’ll be turning my focus to CI stability issues.

In other news, I (finally) signed up for ZuriHac, so maybe I’ll see you there in a week!


A long-outstanding issue with performance metrics is that GHC does not test boot / core libraries. Would it be possible to integrate their benchmarks into the performance dashboard?


I have no idea what the eventlog does, if you feel like providing an explanation for outsiders. (Lately, whenever I hear the words “event” or “log”, my brain instantly thinks “why aren’t traces and spans used?”)


The eventlog is an option in the runtime system that enables logging of runtime events, such as thread creation and garbage collection. Furthermore, you can manually log things yourself. There are tools to process these logs and visualize timings and memory usage statistics.
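
As a rough illustration of the "log things yourself" part, here is a minimal sketch using `traceEventIO` and `traceMarkerIO` from `Debug.Trace` in `base`. The `sumSquares` workload and the message strings are made up for the example; only the two tracing functions come from the standard library.

```haskell
import Control.Exception (evaluate)
import Debug.Trace (traceEventIO, traceMarkerIO)

-- Hypothetical workload, here only so the runtime has something to do.
sumSquares :: Int -> Int
sumSquares n = sum [x * x | x <- [1 .. n]]

main :: IO ()
main = do
  -- Markers are user events that visualization tools can show as labelled points.
  traceMarkerIO "workload start"
  traceEventIO "computing sumSquares"
  -- evaluate forces the result here, so the work happens between the two events.
  total <- evaluate (sumSquares 100000)
  traceEventIO ("sumSquares finished: " ++ show total)
  traceMarkerIO "workload end"
  print total
```

To actually get a log, something like `ghc -eventlog -rtsopts Main.hs` followed by `./Main +RTS -l -RTS` should write `Main.eventlog` next to the executable (the `-eventlog` link flag is only needed on GHC versions before 9.4). Without `+RTS -l` the tracing calls do nothing, so the program still runs normally.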

See also:


The eventlog is a stream of data printed to a location of your choice—usually a file—by the GHC runtime. It’s binary data that can be read by a zoo of different tools that are more-or-less useful for different performance-improvement tasks.

A few other links:

ThreadScope features in Simon Marlow’s book Parallel and Concurrent Programming in Haskell, which is probably where I first heard about the eventlog.

The main use case for the eventlog seems to be evaluating the behavior of parallel programs, but it also dumps data about GC events. In particular, it outputs three types of data about heap size. Unfortunately, I can’t find any good explanation about what those data are! The little blurbs I’ve found aren’t very revelatory. My notes so far:

  • 15. Eventlog encodings — Glasgow Haskell Compiler 9.2.2 User's Guide
    • Heap size, blocks size, and live:
      • A HEAP_SIZE event will be emitted giving the current size of the heap, in
        bytes, calculated by how many megablocks are allocated.
      • A BLOCKS_SIZE event will be emitted giving the current size of the heap, in
        bytes, calculated by how many blocks are allocated.
      • In the case of a major collection, a HEAP_LIVE event will be emitted
        describing the current size of the live on-heap data.
  • What is a megablock?
    • 11. Hints — Glasgow Haskell Compiler 9.2.2 User's Guide
    • Memory is allocated firstly in the unit of megablocks which is then further divided into blocks. Block-level fragmentation is how much unused space within the allocated megablocks there is. In a fragmented heap there will be many megablocks which are only partially full.
    • rts/include/rts/Constants.h:
        /* The size of a block (2^BLOCK_SHIFT bytes) */
        #define BLOCK_SHIFT  12
      
        /* The size of a megablock (2^MBLOCK_SHIFT bytes) */
        #define MBLOCK_SHIFT   20
      
    • so maybe size in megablocks = amount actually requested from the OS, size in blocks = amount actually allocated to data, live size = amount in blocks of live data?
    • rts/gen_event_types.py:
      
        #EventType(49, 'HEAP_ALLOCATED',   [CapsetId, Word64],             'Total heap memory ever allocated'),
        #EventType(50, 'HEAP_SIZE',        [CapsetId, Word64],             'Current heap size (number of allocated mblocks)'),
        #EventType(51, 'HEAP_LIVE',        [CapsetId, Word64],             'Current heap live data'),
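
To make the numbers in these notes concrete, here is a small sketch that works out the block and megablock sizes from the two shift constants quoted above, and then applies my guessed reading of the three heap events to one sample. The `HeapSample` record, its field names, and the sample values are hypothetical; nothing here reads a real eventlog, and the interpretation itself is still only a guess.

```haskell
import Data.Bits (shiftL)

-- Shift values copied from the constants quoted in the notes above.
blockShift, mblockShift :: Int
blockShift = 12
mblockShift = 20

-- Sizes in bytes: 2^BLOCK_SHIFT and 2^MBLOCK_SHIFT.
blockSize, mblockSize :: Integer
blockSize = 1 `shiftL` blockShift
mblockSize = 1 `shiftL` mblockShift

-- Raw ratio of the two sizes. This ignores any space inside a megablock
-- that the RTS may reserve for its own bookkeeping.
blocksPerMblock :: Integer
blocksPerMblock = mblockSize `div` blockSize

-- Hypothetical record holding one reading of each event, in bytes.
data HeapSample = HeapSample
  { sampleHeapSize   :: Integer  -- from HEAP_SIZE
  , sampleBlocksSize :: Integer  -- from BLOCKS_SIZE
  , sampleHeapLive   :: Integer  -- from HEAP_LIVE
  }

-- Under the guessed reading: bytes held in megablocks but not handed out as blocks.
unusedInMblocks :: HeapSample -> Integer
unusedInMblocks s = sampleHeapSize s - sampleBlocksSize s

-- Under the guessed reading: bytes in allocated blocks that are not live data.
notLiveInBlocks :: HeapSample -> Integer
notLiveInBlocks s = sampleBlocksSize s - sampleHeapLive s

main :: IO ()
main = do
  print blockSize        -- 4096
  print mblockSize       -- 1048576
  print blocksPerMblock  -- 256
  -- Made-up sample: 3 megablocks held, 500 blocks in use, 1500000 bytes live.
  let s = HeapSample (3 * mblockSize) (500 * blockSize) 1500000
  print (unusedInMblocks s)  -- 1097728
  print (notLiveInBlocks s)  -- 548000
```

So a block is 4 KiB and a megablock is 1 MiB. If the guess holds, a large gap between HEAP_SIZE and BLOCKS_SIZE would point at the megablock-level fragmentation the Hints page describes.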
      

To be honest, I think the eventlog is woefully under-documented and under-utilized. :slight_smile:


I’ll add it to the list :slight_smile: