Challenge: FeatureFlag pure functions

Follow up to: Weird behavior with bracket and global IORef

We had the below requirements at work and wrote a solution that works decently well. It’s recently started bothering me again, so I’m taking another stab at it. The challenge for you is to do better. A bonus challenge: is there a relatively small change to GHC that we could make someday that would improve this?

For simplicity, assume that this is a greenfield project, so you can architect the entire codebase around this design. The requirements are:

  1. Every change to the code must be feature flagged when first released. This includes bug fixes and refactorings.
  2. Feature flags are fetched at runtime from the database
    • If a refactor breaks prod, you can turn it off immediately without waiting for a rebuild
  3. Unit tests should exercise all branches
    • e.g. disabling a feature flag reverts to the old behavior successfully

The solution we came up with:

  • Fetch flags in main, store in global IORef
  • getFlagIO reads from global IORef
  • getFlag reads from a NeedsFlags constraint which contains the flags
    • We used reflection, but you could also imagine using implicit parameters here
  • Add the NeedsFlags constraint up the call stack to the nearest IO function, which uses a passFlags function to read the global IORef and pass it in the constraint

Solutions we rejected:

  • unsafePerformIO in pure functions - GHC optimizes these out so we cant toggle the flag in unit tests
    • would a hypothetical noupdate function help this? (see linked discussion)
  • pass flags as first arg to every function in the code base - too manual
  • make every function monadic (e.g. ReaderT) - too verbose
  • cpp macros - cant modify at runtime
2 Likes

I want a better solution.
Hence there is a better solution.

Your challenge seems to resemble the “interesting” use of GHC’s feature flags:

…which is slowly being remediated.

Otherwise:

…perhaps something there can be reused for your purposes.

What do you mean by “too manual” here? Surely every single component doesn’t need to know about every single feature flag, and you’re free to transform arguments at will. At some point down below you must have the function that does the wanted thing with some immutable configuration, the entire task is just piping your pile of feature flags to that destination.

I guess you’d want IORefs if you want to hide all that wiring, but I don’t know why you’d want to do that.

So then any “change” to the codebase would involve creating new small functions and branching between old and new definition parts using feature flags.

“immediate” here can mean one of two things:

  • Any external function invoked after the change uses the new set of feature flags, every running one is unaffected;

  • Running functions are affected as well.

I’m thinking in terms of the first one, since I assume the second one could result in weird behaviors in certain side cases.

1 Like

I am forced to do this at work and it’s the most horrible and wrong thing to do.

In half a years time, you’ll have 20 “feature” flags and zero knowledge about how they interact with each other.

You think you can flip one of them in production if a “feature” causes a problem? Guess what, you’re now running an entirely new application that has never been tested.

Despite that, it also causes bad code structure and actively sabotages refactoring.

This is how you end up with an unmaintainable codebase.

6 Likes

@atravers Sadly, yes, it’s bears similarities to DynFlags 🫤 Luckily it’s not quite at that scale though. Most of the flags are bools that fix bugs and should basically only ever be true. But theoretically, yes, exhaustively testing the whole of the application should be tesing 2^N configurations.

The implicit config paper just looks like what the reflection library does. Is there something I’m missing?

@BurningWitness We have a map from flag to value, so we could just add that map as the first argument to every function, even if the function doesn’t use it yet. By “too manual”, I simply mean it’s verbose and annoying to manually thread through everywhere. I did say to assume a greenfield project, but in our actual usecase, it would’ve been infeasible to add an argument to every function in our existing codebase. But I guess theoretically, in a greenfield project, having a convention that every single function should have the first argument be the flags is not terrible.

Unrelated, I wonder how upgrading dependencies work at my work…

And yes, assume that “flipping a flag” only happens if the execution failed and is rerun. Flags are immutable for the life of the program in prod.

@hasufell I’m trying to keep my personal feelings aside. (But I completely agree :upside_down_face:)

If you mean this library:

…it implements the ideas in that paper. (“However, the API has been streamlined to improve performance.” ) So if you’re already using reflection, you’re probably not missing anything from the paper.


Then what purpose does the IORef serve?

pass flags as first arg to every function in the code base - too manual

so we could just add that map as the first argument to every function, even if the function doesn’t use it yet

When starting the application, near main, I would partially apply all functions with the feature flags that are relevant to them. Callers of those functions would receive the already partially applied functions as parameters.

Granted, this would still have a whole lot of parameter-passing. But receiving your direct dependencies as parameters is less dispiriting than receiving some feature flag value that you don’t use directly and it’s only required by the implementation of some dependency far below.

The initial “wiring” of the functions could be done manually, or we could make use of some dependency injection helper library like “registry”.

Edit: about DynFlags in GHC. IIRC, the problem with DynFlags is not really the flags as such, it’s that it’s a monolithic entity containing all the flags that is passed around everywhere, making decoupling difficult because each component is privy to configuration details of other components, and tempted to rely on them, resulting in a big ball of mud. Properly disaggregating DynFlags wouldn’t mean an end to the flags.

1 Like

So … ? There’s a patch of code that in the previous release wasn’t controlled by any flag; it turns out to have a bug; you modify for two things: fix the bug and add a new flag, fetch it, inspect it. The code you insert to fetch that flag: that’s new code – better have a flag for that! (After all, it’ll need to do IO; that might interact nastily with IO for the application proper accessing the database.)

Infinite regress.

Wishful thinking: my programming team are incompetent [**]; please invent magic to prevent them delivering bugs.

[**] Also the test teams are incompetent at detecting whether an allegedly-fixed bug is actually fixed. Also the application support team are incompetent at accurately describing bugs they/the users have detected.

2 Likes

Scrive has documented their way of doing this: Managing change with Rollout Flags


When adding a new rollout flag, do consider if other rollout flags can be removed.

That’s basically the most important thing is, as @hasufell highlights it. Don’t let them proliferate, ensure that they all have an expiration date or that they are all discussed regularly (≠ frequently).

4 Likes

To contrast to GHC’s feature flags …

  • GHC is a compiler, not a user-facing application;
  • GHC is explicitly a tool for experimenting with language features: some features are deemed stable/blesséd; some are deemed bleeding-edge;
  • GHC’s feature flags are at a chunky level, not per-bug-fix;
  • Each flag is introduced as part of the feature development;
  • There is a huge testbase of sourcecode which GHC gave recorded behaviour as at the prior release (and doesn’t ‘know’ about the new feature) – that’s tested to make sure its behaviour is unchanged across releases.

Never the less, I’ve seen quite a bit of discussion to the effect:

  • (Nearly) every new language feature comes with specific syntax;
  • If your sourcecode doesn’t ‘know’ about the feature, it won’t use that syntax, it should behave as before/regression test for that;
  • If your sourcecode does use that syntax, there’ll be a system of warnings: you can choose for the compiler to ignore the warning, or report it and carry on, or treat as error – which’ll be the behaviour when the feature is first released.

On the bright side, it’s an excuse to post this classic.

@atravers how would you do it without IORef? Even if the flags are immutable, you need a way to read them from deep in the application. The top-level global IORef lets you read the flags from anywhere, as long as you’re in IO

@AntC2 Yes, obviously there are some exceptions (you can’t hide changes to the flags framework itself behind a flag, or hide removing a flag behind a flag). But that doesnt mean you shouldn’t flag anything. Protecting 90% of your changes is better than not at all. Even if I think it’s overkill, doesn’t mean there isn’t a certain logic behind it.

I guess one compiler change that would make this easier is to take implicit params even further:

  1. Allow specifying an implicit param as global, meaning it’s automatically added to all functions that’s inferred as needing it
  2. Allow specifying a default param to set to the global implicit param if it gets to main without being eliminated

This way, main would fetch the flags, pass it as an implicit param to the entrypoint, then everything under that entrypoint can read the flags by adding the implicit param, without needing to mark every function along the way. Tests could then set flags when running a function that needs it. Not saying this is a good idea, but it seems general enough.

I may be stating the obvious, or conversely I may be completely misunderstanding, but isn’t the point that the flags are immutable once they’re set. They are set at application start time, so they can’t be pure values.

1 Like

So isn’t the usual “pile-driver” chosen for this situation something starting with unsafe, which does the initialisation at application start time? (Or did I “miss a memo” somewhere?)

Oh sure, you’re saying why not

flags :: Flags
flags = unsafePerformIO $ do
  conn <- initDBConn
  fetchFlags conn

?

Two issues:

  1. Flags are immutable in prod, but not tests, we want to be able to test a given function (pure or IO) with a flag both enabled and disabled
  2. This would initialize the conn twice: once here and once in main. I don’t see a way around that, without storing the conn in another global variable. Which wont work if the db conn needs runtime args to set the host of the db. Or in our case where the conn is passed to main (we have a slightly different usecase than an executable running a main function)

EDIT: just to be clear, this is the current logic:

flagsRef :: IORef Flags
flagsRef = unsafePerformIO $ newIORef defaultFlags

getFlagIO :: Flag -> IO Bool
getFlagIO flag = fmap (lookup flag) . readIORef flagsRef

main = do
  conn <- initDBConn
  writeIORef flagsRef =<< fetchFlags conn

I think you still might be able to do this:

flags :: Flags
flags = unsafePerformIO $ readIORef flagsRef

As long as you only force that after the writeIORef flagsRef, you should be fine.

Oh interesting! Yes, that would work, but then how would tests work? If all the code does the equivalent of

lookup flag flags

The tests won’t have a way to modify those flags

At this point the noupdate function would help, because then you could make it so that the flags are retrieved every time you use them instead of just once.

Perhaps you could already do something like that, let me experiment…

Edit: yes, this seems to work:

{-# OPTIONS_GHC -fno-cse -fno-full-laziness #-}

import Control.Concurrent
import Data.IORef
import System.IO.Unsafe
import Data.Kind

type Flag = ()
type Flags = Bool

flagsRef :: IORef Flags
flagsRef = unsafePerformIO $ newIORef False
{-# NOINLINE flagsRef #-}

getFlag :: Flag -> Bool
getFlag flag = unsafePerformIO $ lookup flag <$> readIORef flagsRef where
  lookup :: Flag -> Flags -> Bool
  lookup () x = x
{-# NOINLINE getFlag #-}

main = do
  writeIORef flagsRef True
  print (getFlag ())
  writeIORef flagsRef False
  print (getFlag ())

Pay attention to these points:

  • Define getFlag in a module with -fno-full-laziness.
  • Use -fno-cse if you want the same getFlag () to mean different things in the same function/module. In your case, I don’t think you’ll need it if you instead use -fignore-interface-pragmas in your tests, that way your code cannot be inlined into your tests.
  • Make sure flagsRef and getFlag have a NOINLINE pragma.

Wouldn’t laziness mean that the subsequent calls to flags reuse the result of the first call?


At a glance…it seems to resemble much of the initial example code from Weird behavior with bracket and global IORef - what’s different?

But more generally - if you really, really think that “top-level mutability” (even “once off” ) is the solution, then look no further than: