Long function names for real-world code

TL;DR: I’m wondering if someone could comment on the strategy of choosing long variable/function names.

As a hobby, I compose music in classical styles, and I also have an interest in using software written in Haskell to help me compose. This may not be the most natural application for Haskell, as it involves complex algorithms that need a lot of practical configuration, prototyping, and experimentation. Maybe a scripting language would suit it better? But Haskell is fun!

For readability, I’m experimenting with longer function and variable names. I’m also making functions as short as I can, which requires more helper functions, which need a clear name that distinguishes them from other similar helper functions.

The Haskell I see in books and exercises is often very compact, with short function names, and every bit of code is elegant. But I find that short function names in a music application lead to confusion later.

But I’d like to know what others think, as super long function names can be unwieldy, too. Here’s an example. (Cac is a state/exception monad which contains a pseudorandom generator state. This function is meant to pick specific compositional elements, such as notes or rhythms, from a set of choices that have been evaluated for fitness, with some randomness in the specific pick. A ShowItem is a kind of “report” that can be arranged in a hierarchy; I have other code to manipulate these reports, used here for getting feedback on what the algorithm is doing.)

chooseRandom :: forall a. ChoosingConfig -> [(a,FitnessResult)] -> Cac (a,ShowItem)
chooseRandom config choices = do
  (topScore,sorted) <- sortFitnessResultsByScore choices
  let groups = groupChoicesByScore sorted
      bestChoices = takeUpToNFromGroups (view numChoices config) groups
      scoreIsCloseToTop (_,s) = fitnessResultScore s >= topScore - view maxTopDeltaFitness config
      topChoices = takeWhile scoreIsCloseToTop bestChoices
      report = constructShowItemFromChoices "chooseRandom" topChoices
  finalChoice <- randomChooseFromList (map fst topChoices)
  return (finalChoice,report)
5 Likes

This applies to all languages.
The good thing about no names is the effort you save: naming things is hard.
The bad thing is coming back later.
Sometimes things do not need a name. Math has symbols which have succinct and unmistakable meaning.

((one `plus` two `times` three) `times` four) `toPowerOf` oneThousandTwoHundredSeventeen

((1 + 2 * 3) * 4) ^ 1217

The good thing about long names is that you don’t have to relearn and decipher wingdings.
The bad thing is reading filterAfterFlatmappingInCombinationWithCacGeneratedFitnessMonoid takeUpToNFromGroupsAfterRandominglyShufflingTheResultAfterDeletingTopFitnessScoresWithinDelta. Just like regular text, no spaces or commas make it exhausting. Functions with 20 arguments are a never ending sentence.

TLDR; Be concise.

3 Likes

At work we mostly follow German Naming Convention, which I’m pretty happy with.

9 Likes

Looks good to me.

Nitpick about sortFitnessResultsByScore: perhaps it should be called sortOnFitnessResultScore to match the “on” convention?

About view maxTopDeltaFitness config and fitnessResultScore s. Are these fields? Perhaps using OverloadedRecordDot could let you write things like config.maxTopDeltaFitness and s.fitnessResultScore.
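To make the suggestion concrete, here is a minimal sketch of the predicate from the original snippet written with OverloadedRecordDot (GHC 9.2 or later). The record definitions are assumptions, since the thread does not show them; in particular, the original uses `view`, so the real fields may be lenses rather than plain record fields.

```haskell
{-# LANGUAGE OverloadedRecordDot #-}

-- Hypothetical record shapes; the original post does not show the real ones.
data ChoosingConfig = ChoosingConfig
  { numChoices         :: Int
  , maxTopDeltaFitness :: Double
  }

newtype FitnessResult = FitnessResult
  { fitnessResultScore :: Double
  }

-- Same comparison as scoreIsCloseToTop in the original snippet,
-- with topScore and config passed in explicitly and fields read via dot syntax.
scoreIsCloseToTop :: ChoosingConfig -> Double -> (a, FitnessResult) -> Bool
scoreIsCloseToTop config topScore (_, s) =
  s.fitnessResultScore >= topScore - config.maxTopDeltaFitness
```

Note that the dot must be written without surrounding spaces; `config . maxTopDeltaFitness` would still be parsed as function composition.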

About

A ShowItem is a kind of “report”

Perhaps the type could be directly called Report.

About constructShowItemFromChoices. If ShowItem happens to live in its own ShowItem module, I would rather write ShowItem.fromChoices, perhaps even unqualified if there’s no ambiguity.

3 Likes

I would probably do shorter names, but only a bit shorter. Some of them suggest they could be factored out a bit, e.g. sortFitnessResultsByScore might be List.sortOn (scoreOf . snd) choices or groupChoicesByScore might be List.groupOn scoreOf choices. And I use qualified names, so e.g. randomChooseFromList might be Random.choose.
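As a rough sketch of that factoring using only base: Data.List in base has sortOn but no groupOn, so the grouping below is spelled with groupBy and `on`. The FitnessResult type and the scoreOf accessor are stand-ins, not the original definitions, and the sort order (highest score first) is an assumption.

```haskell
import Data.Function (on)
import Data.List (groupBy, sortOn)
import Data.Ord (Down (..))

-- Stand-in for the thread's FitnessResult; only the score matters here.
newtype FitnessResult = FitnessResult { scoreOf :: Double }

-- Highest score first; sortOn is stable, so ties keep their input order.
sortByScore :: [(a, FitnessResult)] -> [(a, FitnessResult)]
sortByScore = sortOn (Down . scoreOf . snd)

-- groupBy only merges adjacent elements, so this expects input
-- that is already sorted by score.
groupByScore :: [(a, FitnessResult)] -> [[(a, FitnessResult)]]
groupByScore = groupBy ((==) `on` (scoreOf . snd))
```

With helpers this small, the long names in the original mostly describe what `sortOn` and `groupBy` already say at the call site.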

I carry around a Lists module with a bunch of generic utilities. Data.List is ok but it’s missing a lot, e.g. its groupBy is pretty inconvenient. Grouping and sorting I find very common. Same for Maps and Sets but really it’s all about the lists!

BTW I am also working on a music composition related project in Haskell, but probably in a completely different way. One thing is that in addition to the implementation language, Haskell, there are also score languages, which are either Haskell EDSLs or entirely separate languages. Score languages heavily emphasize concision, because they occupy some point in between highly structured code and “just data”; it’s the configuration problem all over again. So e.g. tr 1d 7t for trill at one diatonic step with 7 oscillations per score time unit (as opposed to real time second) and things like shape and direction inherited dynamically, or in Haskell, tri_ dim (su (p5.p6.p7)) for triple separated by dim stroke of double speed 5 6 and 7 patterns. That is sort of the opposite of the verbosity which can be helpful in “programming” style code.

Since you mentioned randomness, I also have a specific approach, where it’s seeded by the “call stack” within the score, so you don’t get the thing where you add one random call and everything downstream changes (that also forces a threaded data dependency across the whole score, which is bad for parallelism). And then, if you get a variation you like, you can pin it in place by directly overriding the seed to whatever it is, and it will remain the same even when moved.
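One way to picture the idea, purely as an illustration and not the actual implementation: derive each seed from the path of calls leading to a spot in the score, so a seed depends only on where the call sits, not on how many random calls ran before it. All names below (CallPath, seedFor, resolveSeed) are made up, and the FNV-1a-style hash is just a convenient stand-in.

```haskell
import Data.Bits (xor)
import Data.Char (ord)
import Data.List (foldl')
import Data.Maybe (fromMaybe)
import Data.Word (Word64)

-- A position in the score, outermost call first,
-- e.g. ["piece", "verse", "trill"]. Hypothetical representation.
type CallPath = [String]

-- FNV-1a-style hash over the characters of the path. The seed depends
-- only on the path itself, so adding a sibling call elsewhere in the
-- score leaves this seed unchanged.
seedFor :: CallPath -> Word64
seedFor = foldl' step 14695981039346656037 . concatMap (++ "/")
  where
    step h c = (h `xor` fromIntegral (ord c)) * 1099511628211

-- Pinning a variation: an explicit seed wins over the path-derived one,
-- so the result stays the same even if the call moves to another path.
resolveSeed :: Maybe Word64 -> CallPath -> Word64
resolveSeed override path = fromMaybe (seedFor path) override
```

Because every seed is a pure function of its path, independent parts of the score can be evaluated without threading a generator through all of them.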

4 Likes

Please keep doing what you are doing. Names are statically checked documentation, and «programs must be written for people to read» (page xxii). Sincerely, reading code like this makes my day brighter.

5 Likes

Is “Cac” a meaningful name?

“Cac” is one of a very few names used abundantly all over the code. It stands for Computer Assisted Composition. Seems okay to me to make a name like this an acronym.

Report sounds good. However fromChoices living in ShowItem.hs would create a cyclic dependence. I’m not sure if that’s allowed.

I didn’t know about OverloadedRecordDot. That is very cool. A bit of object oriented syntax.

I think the main issue I see is…

  1. for novices in the code/domain, the information will still not be enough to grok the “algorithm”
  2. for experts, it’s probably too noisy already

Write code for experts, write documentation for novices.

3 Likes

Well, the domain is experimental, often temporary algorithms for making music according to my own experiments, and who I’m writing for is me six months from now. I’ve tried shorter names in the past, and it definitely takes a lot of time to get up to speed when I come back after 6 months. However, it could still be made clearer, potentially, without longer names, via refactoring.

When reading other people’s code, I found the following combination a good trade-off between succinctness and clarity:

  • qualified imports
  • concise names in imported modules (like @elaforge suggested)
  • Haddock-generated source annotation that gives you the type and module as tool-tip and hyperlinks

You then have the freedom to pick a namespace for the qualified import that best explains your usage, e.g. when using GHC.Exts.sortWith and nothing else from that module, you could say

import qualified GHC.Exts as SortList
... SortList.sortWith ...

Some library authors take this to the extreme and name all their types T and all their classes C, which only makes sense if everything is imported qualified and resides in its own module.

Finally, I’d like to promote provenience, which solves the problem of documenting which intermediate functions were called to construct a value. It is a monad transformer that automatically constructs a data-flow graph at run-time. Your large let-blocks would become monadic do-blocks with explanations sprinkled in.

5 Likes

I aim for code that is as easy to parse at a glance as possible given the context. People are good at tracking context without needing to think about it, so it’s a good thing to rely on.

Riffing off your original example: if sortFitnessResultsByScore is going to be used all over the code outside its original context, I’d lean towards a long name like you’ve chosen. But if it’s going to be used in, say, the part of the code that is all about fitness results, I’d probably call it sortByScore. And if it’s a helper in a where clause, I might just call it sort.
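A small sketch of the where-clause end of that spectrum. The FitnessResult type, its score field, and the bestChoices function are invented for the example; the point is only that the enclosing function supplies the context the short name leaves out.

```haskell
import Data.List (sortOn)
import Data.Ord (Down (..))

-- Stand-in type for illustration only.
newtype FitnessResult = FitnessResult { score :: Double }

-- The function is entirely about fitness results, so the local helper
-- can simply be called `sort`: nothing else in scope competes for the name.
bestChoices :: Int -> [(a, FitnessResult)] -> [a]
bestChoices n choices = map fst (take n (sort choices))
  where
    sort = sortOn (Down . score . snd)  -- highest score first
```

If the module also imported `sort` from Data.List unqualified, the local binding would shadow it, which is one reason to keep such imports qualified or explicit.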

I also generally think of the type of a function as implicitly part of its context. Unless I had a module with a bunch of randomChooseFrom* functions that needed to be disambiguated, I would call randomChooseFromList either randomChoose or even just choose. The fact that the function is choosing from a list is not core to its semantics—it’s an implementation detail—and it’s clear from the type. The fact that it is random is key to its semantics, but it’s also in the type and, generally, clear from context.

The upside with shorter names is not that they’re easier to type but that they make the code easier to skim or to visually scan. You read code more often than you write it, but you skim code even more often than you read it! And as long as you align your naming choices with context (hard to demonstrate on a small self-contained example, unfortunately!), the code becomes somewhat easier to follow even on a detailed reading.

So, looking at your code, I would make most of the names shorter, but I would make Cac longer (or, at least, more explicit) since it’s a type that is presumably used widely throughout the code. (Then again, if it’s used super widely, the cost of learning what the acronym is used for gets amortized pretty quickly!)

5 Likes

When reading other people’s code, I found the following combination a good trade-off between succinctness and clarity:

  • qualified imports
  • concise names in imported modules (like @elaforge suggested)
  • Haddock-generated source annotation that gives you the type and module as tool-tip and hyperlinks

100% agree with this advice. Writing modules with the assumption they will be imported qualified made a big improvement to my Haskell code.

It’s a great tool for reflecting context in the way I was talking about: in the context of the module itself, you use shorter names because it’s clear what the module is about; outside the module, you might have less context, so you would import the module qualified, with the module name providing context to the short identifier. It’s a win-win.

1 Like

Thanks, everyone! This is some good stuff. I’m going to refactor my code to use qualified imports and shorter names, like you suggest. You’ve inspired me in other ways to refactor the code and realize where names can be shortened when context is not needed.

1 Like

I also suggest not thinking too hard about your code.

The realization that you could have done something better/different often comes years after a project… when you look back and find your own decisions questionable.

That’s fine.

1 Like

@hasufell Good idea, but also this is a purely fun project which is a good testing ground for experimentation and learning. Productivity is saved for those parts of my life I get paid for. :wink:

EDIT: also this project is just beginning so refactoring will take an hour or two at most.

2 Likes