Hi! I’m calling Haskell/Nix(OS) users to suggest feature ideas for a CLI tool I’m writing to build and deploy distributed fleets of NixOS machines. The provisional name is fleet, but I’ve just noticed that there are existing tools with the same name, so I’m open to suggestions.
In my home I currently have 6 machines running NixOS (workstation, laptop, router, and 3 Raspberry Pis running a distributed sound system). I love the idea of being able to configure them all in one Nix repository, such that they can reference each other, share code, and access common variables etc. I tried various existing deployment tools (NixOps, Colmena, Bento and Comin, to name a few) but found them all sub-optimal in one way or another, at least for my needs.
For example:
- NixOps is in “low-maintenance mode”;
- Comin works on a pull-based model, which I didn’t find very “nix-y”;
- Bento also works on a pull-based model, and furthermore is written in Bash, which I find slightly unnerving;
- Colmena doesn’t support health checks.
Also, none of the above are written in Haskell.
I ended up settling on Colmena, which I believe is the best overall, but still I dreamed of better. So, let’s do it. The idea is to combine all the best features of all the best deployment tools, together with beautiful code written in the best language of them all.
From the Nix side, the design will be very similar to the other tools, effectively “fleet-wide configuration and metadata”, “normal NixOS configuration for each node”, and “misc fun deployment stuff”. The last of these is the part for which I’m asking for suggestions.
I have a list of features in mind, including but not limited to:
- configurable health checks, implemented by a daemon running on the nodes (written in Haskell of course);
- deployment hooks, i.e. arbitrary commands or scripts run on the nodes before/after deployment;
- secrets, i.e. sensitive files copied to the nodes without ending up in the nix store on the node or the deployment client;
- minimising the number of times the user is prompted for a sudo password, by creating a system user on the node with password-less sudo for Nix commands;
- automatic configuration of SSH access between nodes, including generation of key pairs;
- configurable delegation of NixOS builds to specific nodes in or out of the fleet.
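To make the health-check and deployment-hook items a bit more concrete, here is a rough Haskell sketch of how they might be modelled on the client side. Everything in it is an assumption for the sake of discussion: the names (HealthCheck, NodePlan, Step, steps, failedChecks), the example commands, and the ordering of post-hooks before health checks are all hypothetical, not the current design.

```haskell
-- Sketch only: every type and function name here is hypothetical,
-- not part of the actual tool.

-- | A health check: a label plus a shell command expected to exit with 0.
data HealthCheck = HealthCheck
  { checkName    :: String
  , checkCommand :: String
  } deriving (Show, Eq)

-- | Per-node deployment settings: hooks and health checks.
data NodePlan = NodePlan
  { preHooks  :: [String]      -- commands run on the node before deployment
  , postHooks :: [String]      -- commands run on the node after deployment
  , checks    :: [HealthCheck] -- run after the post-hooks (ordering is an assumption)
  } deriving (Show, Eq)

-- | One step of a deployment, in execution order.
data Step
  = RunHook String
  | Activate               -- switch the node to the new NixOS configuration
  | RunCheck HealthCheck
  deriving (Show, Eq)

-- | Flatten a plan into the ordered list of steps to perform on a node.
steps :: NodePlan -> [Step]
steps plan =
  map RunHook (preHooks plan)
    ++ [Activate]
    ++ map RunHook (postHooks plan)
    ++ map RunCheck (checks plan)

-- | Result of one check: Nothing means it passed, Just reason means it failed.
type CheckResult = (HealthCheck, Maybe String)

-- | Names of the checks that failed; an empty list means the node is healthy.
failedChecks :: [CheckResult] -> [String]
failedChecks results = [checkName c | (c, Just _) <- results]

-- | Example plan with made-up commands for one of the sound-system nodes.
examplePlan :: NodePlan
examplePlan = NodePlan
  { preHooks  = ["systemctl stop mpd"]
  , postHooks = ["systemctl start mpd"]
  , checks    = [HealthCheck "ssh" "systemctl is-active sshd"]
  }
```

Loading this in GHCi and evaluating steps examplePlan gives the four steps in order: the pre-hook, Activate, the post-hook, then the ssh check. Keeping the plan as plain data means the same value could be printed for a dry run or handed to whatever actually runs the commands on the node.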
I currently have a minimum viable product implementing a few of these features, and I’m working on the rest. I’m aware of the pitfalls of trying to cram too many ideas into one tool, but I feel there’s still some wiggle-room for a few other killer features.
So, if you manage a fleet of NixOS machines and always wished for a way to do a particular thing, or generally you have an idea for the design of this deployment tool, please let me know below! Also if there are any existing tools with great features not on the list above, that’d be useful information too.
If anyone’s interested, I’ll elaborate further on the current design, or post status updates (or health checks!).
Thanks!