So I'm just a few weeks late to the party here. It turns out that between work, the 3.24 Alpine release, and "camp counselling" I left myself very little time for actually working in my legacylab. Fortunately, I think I can get things moving in the right direction at a decent clip once the core work is done.
Now my lab idea requires a hypervisor, and the period-appropriate answer is to reach for Hyper-V on Windows Server 2008 R2, but I honestly don't want to do that yet. It's a good base for what I'm trying to do, but it's not the quick start I really need right now. Instead I'm going to use an Alpine Linux based incus hypervisor, which I'll configure with salt. There are two lines of reasoning here. First, the intent of llsc (for me) is to learn something I'm only partially exposed to professionally, ergo building on top of systems I know wicked well makes a ton of sense. Second, I want to spend the least amount of time administering the underlying hypervisor and the most amount of time tinkering with old Windows.
So let's dive into what that looks like.
Why so salty?
Well, SaltStack is an objectively good tool for a job like this. I've used it professionally for over 6 years now, I maintain it for Alpine, and I contribute to it directly. On one hand, why look elsewhere? On the other, it's a capabilities thing. Salt provides a simple abstraction over system configuration that allows me to spin different types of systems up at scale. Treating my entire homelab as this fungible thing I can destroy and recreate from scratch at any time is just wildly helpful. For example, I pulled one of the nodes from my CI/CD cluster out to repurpose for this job. Once llsc is over I'll wipe it and redeploy it into service using salt as well. Different states, maybe different names to denote a shift.
The configuration
So at the surface salt states are just YAML with a bit of Jinja templating sprinkled in. Under the hood there's this whole event system that can make the control plane react to things crossing it, or orchestrate complex interdependent changes spanning an entire fleet, but honestly I need almost none of that here. All this particular job wants is a handful of state files that declare my desired configuration. Could be a single file really, but I like to organize my states by purpose so future me knows exactly where to look.
~/Development/salo/salt/common/incus|>> tree
.
├── config.sls <- Deploys service configurations
├── defaults.yaml <- Defines default variable values
├── init.sls <- Entrypoint that sequences each state in order
├── map.jinja <- Default, local var, pillar merge handling
├── packages.sls <- Installs packages
├── services.sls <- Starts and enables services
└── templates
    └── preseed.yaml.jinja <- Incus preseed config template
2 directories, 7 files
It looks like a lot on the surface but it really isn't much more than what you'd cram into a shell script to stand the same thing up. This is going to get a little long since these are the full state files I use to deploy Incus, but it feels worth sharing since I don't have them public anywhere else, and honestly the interesting bit isn't any single file. It's that taken all together they refuse to lie to you. More on that in a second.
So when I'm deploying a new box I either have cloud-init bootstrap salt-minion onto it at deploy time so it joins my master automatically, or I install it myself. Once it's joined to the salt-master I literally only have to run a couple of commands and then I've got a brand new hypervisor. My running joke is that all of this effort exists to defend against all of the coffee runs I make per day. That literally looks like this.
salt-key -a praa <- Accept the minion
salt praa saltutil.sync_all <- Sync salt extensions
salt praa state.apply common/incus <- Apply state configuration
Accept the key, sync the extensions, apply the state, and that's the whole ritual. Everything below is just what makes those three lines enough.
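For the cloud-init route mentioned above, the bootstrap can be as small as a cloud-config fragment using cloud-init's salt_minion module. This is only a sketch, not the config from this lab: the master hostname is a placeholder, and the minion ID just reuses praa from the commands above.

```yaml
#cloud-config
# Sketch only: salt.example.lan is a placeholder for the salt-master's address.
salt_minion:
  conf:
    master: salt.example.lan
    id: praa
```

With something like that in place the minion should show up as a pending key on the master, and the three commands above take it from there.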
First up is our entrypoint into the incus state, init.sls. This one doesn't really do anything itself, it just sets the running order.
include:
  - common.incus.packages
  - common.incus.services
  - common.incus.config
When salt reads an include like this it pulls in the content of each listed state file and compiles everything into a single run. Absent any requisites saying otherwise, the states execute in the order they were pulled in: packages first, because nothing else functions without them, then the services tied to those packages, then the config that assumes both of those already happened. The order isn't decoration, it's kind of the whole point, and you'll see it matter here in a minute.
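For what it's worth, ordering doesn't have to rest on include order alone. Salt's require requisite can also point at a whole SLS file rather than a single state ID. A rough sketch of that idea, not taken from the states in this post, with a hypothetical state ID and a trimmed-down service list:

```yaml
# Sketch: a services file that depends on the entire packages SLS.
# The required SLS has to be included for the sls requisite to resolve.
include:
  - common.incus.packages

example-services:            # hypothetical ID, for illustration only
  service.running:
    - names:
      - sshd
      - incusd
    - enable: True
    - require:
      - sls: common.incus.packages
```

The trade-off is granularity. Requiring the whole SLS means any failed state in that file blocks this one, which is stricter than requiring one named state.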
Then we have packages.sls, which is literally a single call to pkg.installed and a very long list. Not that exciting on its own, and yeah, it could be a shell script or an /etc/apk/world file dropped onto the Alpine box.
incus-packages:
  pkg.installed:
    - pkgs:
      # --- base system / userland ---
      - eudev
      - coreutils
      - findutils
      - util-linux
      - util-linux-misc
      - procps-ng
      - shadow
      - file
      - gawk
      - grep
      - acpid
      - openssl
      # --- storage ---
      - zfs
      - zfs-lts
      - zfs-libs
      - zfs-scripts
      - zfs-udev
      - e2fsprogs
      - dosfstools
      - xorriso
      # --- networking ---
      - bridge-utils
      - openresolv
      - iputils
      - bind-tools
      - curl
      - wget
      - socat
      - iftop
      # --- firewall ---
      - nftables
      # --- remote access ---
      - openssh
      - mosh
      - tmux
      # --- virtualization: incus ---
      - incus
      - incus-agent
      - incus-client
      - incus-utils
      - incus-vm
      # --- virtualization: qemu ---
      - qemu
      - qemu-system-x86_64
      - qemu-img
      - qemu-tools
      - ovmf
      - swtpm
      - qemu-hw-display-virtio-gpu
      - qemu-hw-display-virtio-vga
      - qemu-hw-usb-host
      - qemu-hw-usb-redirect
      - qemu-audio-spice
      - qemu-chardev-spice
      - qemu-ui-spice-app
      - qemu-ui-spice-core
      # --- monitoring / logging / management ---
      - syslog-ng
      - logrotate
      - sysstat
      - sysfsutils
      # --- cli tooling / editors ---
      - git
      - mg
      - htop
      - lynx
      # --- hardware diagnostics ---
      - lshw
      - pciutils
      - usbutils
Except neither of those does the thing salt does. One of the primary use cases for salt is periodic self reinforcement, so by virtue of using the proper tooling I can re-apply this state, and every state stacked on top of it, to any incus box over and over again without it making a single change, unless something's actually drifted. That idempotency is honestly the quiet superpower here, you stop wondering whether a box is in the right state and just get to assert that it is.
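That re-apply doesn't have to be manual either. Salt's schedule.present state can ask the minion to re-run a state on a timer. A minimal sketch, assuming a six hour interval and a state ID that are both made up for illustration:

```yaml
# Sketch: have the minion re-apply the incus states every six hours.
# The ID, interval, and splay are illustrative values, not from this lab.
incus-reapply:
  schedule.present:
    - function: state.apply
    - job_args:
      - common.incus
    - hours: 6
    - splay: 600   # spread the start time by up to 600 seconds
```

If nothing has drifted, each scheduled run should come back with zero changes, which is the whole appeal.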
Now services.sls is where it gets a little interesting. It's once again pretty simple, a call to service.running with a list of services that should be both running and enabled, but pay attention to that little require at the tail end.
incus-services:
  service.running:
    - names:
      - sshd
      - incusd
      - acpid
      - crond
      - syslog-ng
      - zfs-import  # import pools at boot
      - zfs-mount   # mount zfs datasets
      - zfs-zed     # zfs event daemon (scrub/health notifications)
    - enable: True
    - require:
      - pkg: incus-packages
That require is the thing. It gates this entire step on the success of the incus package install, so if packages didn't land for some reason, a mirror was down, an update broke, whatever, then the service step never even fires. No services pointed at software that isn't there, no half broken box sitting in some ambiguous middle state. It either works or it does not work, it cannot and will not attest to both. Hold onto that line, because I'm about to get bitten by it on purpose.
And finally we have config.sls, which does two things, and like the services step both of them are gated behind the success of everything before them. The first renders my templated preseed.yaml and drops it onto the target box, and the second feeds that file to incus admin init to bring up a working incus configuration.
incus-preseed:
  file.managed:
    - name: /etc/incus/preseed.yaml
    - source: salt://common/incus/templates/preseed.yaml.jinja
    - template: jinja
    - makedirs: True
    - require:
      - pkg: incus-packages

incus-init:
  cmd.run:
    - name: incus admin init --preseed < /etc/incus/preseed.yaml
    - unless: incus storage list --format csv | grep -q .
    - require:
      - service: incus-services
      - file: incus-preseed
Of course it wouldn't be very idempotent if it re-initialized incus every single time I ran it, so that second bit only fires when there are no storage pools defined yet, which is what that unless clause is checking. I'll admit, that last bit isn't truly idempotent, it's more of a quick hack that just happens to be easily enforceable at the scale my lab actually exists at.
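One way to tighten that guard a little is to ask about a specific pool by name instead of grepping for any output at all. A sketch, assuming the preseed creates a pool called default (a hypothetical name here, swap in whatever the preseed actually defines):

```yaml
# Sketch: only run init when one specific storage pool is missing.
incus-init:
  cmd.run:
    - name: incus admin init --preseed < /etc/incus/preseed.yaml
    # "default" is a hypothetical pool name; it must match the preseed.
    - unless: incus storage show default
    - require:
      - service: incus-services
      - file: incus-preseed
```

incus storage show exits non-zero when the pool doesn't exist, so init only fires when that pool is missing. It's still a guard on a single resource rather than a check of the whole preseed, so it doesn't make the step truly idempotent either.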
So does it actually work?
Well, I ran it, and it failed, in just about the most reassuring way it possibly could have.
salo:~# salt praa state.apply common/incus
praa:
----------
          ID: incus-packages
    Function: pkg.installed
      Result: True
     Comment: All specified packages are already installed
...
----------
          ID: incus-services
    Function: service.running
        Name: incus
      Result: False
     Comment: The named service incus is not available
...
----------
          ID: incus-init
    Function: cmd.run
        Name: incus admin init --preseed < /etc/incus/preseed.yaml
      Result: False
     Comment: One or more requisite failed: common.incus.services.incus-services
     Changes:
Summary for praa
------------
Succeeded: 9 (changed=1)
Failed: 2
------------
Total states run: 11
Total run time: 1.396 s
See, I'd told salt the service was named incus, but on Alpine the daemon is actually incusd. One little letter off. The service step came back Result: False, The named service incus is not available, and then incus-init checked its requisite, saw the service step had failed, and just refused to run, One or more requisite failed. Failed: 2, both of them pointing right at my typo, and nothing got built on top of it. That's the "it cannot attest to both" bit paying off, a shell script would've just run past the bad service name and tried to init incus anyway, and I'd have spent twenty minutes figuring out why my hypervisor was borked.
So I fixed the name to incusd and re-applied.
salo:~# salt praa state.apply common/incus
praa:
----------
          ID: incus-packages
    Function: pkg.installed
      Result: True
     Comment: All specified packages are already installed
----------
          ID: incus-services
    Function: service.running
        Name: incusd
      Result: True
     Comment: The service incusd is already running
----------
# ...sshd, acpid, crond, syslog-ng, and the three zfs services, all already running...
----------
          ID: incus-preseed
    Function: file.managed
        Name: /etc/incus/preseed.yaml
      Result: True
     Comment: File /etc/incus/preseed.yaml is in the correct state
----------
          ID: incus-init
    Function: cmd.run
        Name: incus admin init --preseed < /etc/incus/preseed.yaml
      Result: True
     Comment: unless condition is true
     Changes:
Summary for praa
-------------
Succeeded: 11
Failed: 0
-------------
Total states run: 11
Total run time: 1.715 s
Clean run this time, Succeeded: 11 and Failed: 0, and the bit I really love is incus-init down at the bottom, Result: True but Comment: unless condition is true. It didn't actually run! The storage pool already existed from the first good pass, so the unless clause caught it and salt just left it alone. That's the idempotency I keep going on about, sitting right there in the output, and if I run it again tomorrow I'll get the same eleven successes and zero changes.
So that's really it, three state files and a little jinja templated yaml, and a bare node turns into a hypervisor. The genuinely annoying part is that it was done before I was, because while I was still fussing over how to write this post praa was already up and running.
salo:~# salt praa cmd.run 'incus ls'
praa:
+--------------+---------+--------------------+------+-----------+-----------+
|     NAME     |  STATE  |        IPV4        | IPV6 |   TYPE    | SNAPSHOTS |
+--------------+---------+--------------------+------+-----------+-----------+
| alpine-test  | RUNNING | 10.85.49.95 (eth0) |      | CONTAINER | 0         |
+--------------+---------+--------------------+------+-----------+-----------+
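The one piece this post leans on without showing is the rendered preseed. For anyone following along, a preseed that incus admin init --preseed accepts has roughly the shape below. Every name and value here (pool name, ZFS dataset, bridge name, addressing) is a placeholder for illustration, not what praa actually runs:

```yaml
# Hypothetical rendered /etc/incus/preseed.yaml; all names are placeholders.
config: {}
networks:
  - name: incusbr0
    type: bridge
    config:
      ipv4.address: auto
      ipv6.address: none
storage_pools:
  - name: default
    driver: zfs
    config:
      source: tank/incus      # placeholder ZFS dataset
profiles:
  - name: default
    devices:
      root:
        path: /
        pool: default
        type: disk
      eth0:
        name: eth0
        network: incusbr0
        type: nic
```

In the real template the interesting values would presumably come from defaults.yaml and pillar by way of map.jinja, which is what makes the same state reusable across boxes.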