[Survey] Which low-level details would you like to see documented?

Hi everyone,

The machine never stops. I am looking for things to document in the base or core libraries.

I recently contributed this diagram of the memory overhead of a ByteArray:

Is there any other data structure for which you would like readily accessible information about its memory representation? This is also the opportunity for People Who Know to make sure the knowledge is transmitted, and for contributors of such MRs to learn more.

For those who will attend European Haskell Hackathons like MuniHac and ZuriHac, it’s also the opportunity to pair program and be guided through the development workflow of GHC & Haddock.

Cheers,
Hécate

10 Likes

I’d much rather there be a single concise source on the general internal representation of things within the language, so that every primitive (including ByteArray#) can be documented relative to that. You don’t need images everywhere that way; a simple “MutableByteArray# is a pointer to a block of memory together with its size” already delivers 90% of everything I care about.

Same regarding laziness/levity.

This would also create a downstream effect where I don’t have to bother with complex explanations in libraries, I can just point to a known piece of documentation for terminology explanations.

2 Likes

I’d like to nominate:

@jaror thanks! You don’t feel like the Implementation sub-header of List fulfils its role?

Okay so you’re more in favour of a centralised source for these explanations. Fair enough, I have something planned for the wiki to host such a page. Any data structure that you’d like to shortlist which isn’t the same shape as something already documented?

1 Like

None I can think of: all other primitives I use follow the same general structure, low-level arrays are the only special ones.

2 Likes

For a while I’ve wanted to write a (sub)section in the user guide about what heap profiles mean, which requires explaining a few things about how heap objects are represented at run time and how user code is optimised (e.g. unpacking fields). And in particular what built-in types mean, e.g. ARR_WORDS. Part of this would be integrating “The many arrays of GHC” (bgamari.github.com) into the user guide.
See: #23976: Document meaning of primitive arrays in profiles (GHC issue tracker, GitLab)

I’d be happy to pair on this, or have someone else pick it up.
(I’ll be at ZuriHac, no plans for MuniHac yet)

6 Likes

See also

2 Likes

Yeah, it would be nice for GHC to self-document the memory representation.

1 Like

I actually had opened a ticket about Array#, I’ll add SmallArray# to it.

1 Like

Ah, I didn’t see that collapsed section with a quick glance at the documentation. It might still be useful to mention there that each box in the diagram is one word, so the total memory overhead of a fully (spine-)evaluated list of n elements is 3n + 1 words.
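
To make that arithmetic concrete, here is a minimal sketch that just evaluates the 3n + 1 figure quoted above and converts it to bytes for the current platform. The names `listSpineWords` and `wordBytes` are made up for illustration, and the formula is taken from this post rather than measured.

```haskell
module Main (main) where

import Data.Bits (finiteBitSize)

-- Overhead, in words, of a fully spine-evaluated list of n elements,
-- using the 3n + 1 figure quoted in the post (each cons cell is three
-- one-word boxes). Hypothetical helper, not part of base.
listSpineWords :: Int -> Int
listSpineWords n = 3 * n + 1

-- Bytes per machine word on the current platform (8 on a 64-bit GHC).
wordBytes :: Int
wordBytes = finiteBitSize (0 :: Int) `div` 8

main :: IO ()
main = do
  let n = 1000
  putStrLn ("words: " ++ show (listSpineWords n))
  putStrLn ("bytes: " ++ show (listSpineWords n * wordBytes))
```

For n = 1000 this reports 3001 words; the byte count depends on the platform's word size.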

Can I let you open a ticket for this?

Done: #24855: Mention exact word counts in implementation documentation of List (GHC issue tracker, GitLab)

1 Like

The mutable references and their boxed wrappers:

  • MutVar#/IORef/STRef
  • MVar#/MVar
  • TVar#/TVar

And perhaps add info pointing to MutableByteArray# as a more efficient alternative to MutVar# for unboxed types.
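
As a rough illustration of that last point, here is a sketch of an unboxed Int counter stored in a one-element MutableByteArray#, using only the primops exported by GHC.Exts. The names `Counter`, `newCounter`, `incrCounter` and `readCounter` are hypothetical; in real code one would more likely reach for a library wrapper than write the primops by hand. The point is that the Int lives directly in the array's payload, whereas a MutVar# holds a pointer to a separately allocated boxed value.

```haskell
{-# LANGUAGE MagicHash, UnboxedTuples #-}
module Main (main) where

import Foreign.Storable (sizeOf)
import GHC.Exts
import GHC.IO (IO (..))

-- Hypothetical wrapper: a mutable Int stored unboxed in a byte array.
data Counter = Counter (MutableByteArray# RealWorld)

-- Allocate enough bytes for one Int and write the initial value at index 0.
newCounter :: Int -> IO Counter
newCounter (I# initial) = IO (\s0 ->
  case sizeOf (0 :: Int) of
    I# bytes -> case newByteArray# bytes s0 of
      (# s1, mba #) -> case writeIntArray# mba 0# initial s1 of
        s2 -> (# s2, Counter mba #))

-- Read the raw Int#, add one, write it back. Not atomic.
incrCounter :: Counter -> IO ()
incrCounter (Counter mba) = IO (\s0 ->
  case readIntArray# mba 0# s0 of
    (# s1, n #) -> case writeIntArray# mba 0# (n +# 1#) s1 of
      s2 -> (# s2, () #))

-- Box the value only when handing it back to the caller.
readCounter :: Counter -> IO Int
readCounter (Counter mba) = IO (\s0 ->
  case readIntArray# mba 0# s0 of
    (# s1, n #) -> (# s1, I# n #))

main :: IO ()
main = do
  c <- newCounter 0
  mapM_ (\_ -> incrCounter c) [1 .. 1000 :: Int]
  readCounter c >>= print
```

This sketch is single-threaded only; concurrent updates would need one of the atomic byte-array primops instead of the plain read and write.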

3 Likes