Slightly Nerdy


After a few months of getting comfortable with NixOS as my desktop operating system, I decided it was time to try it out for servers. But first I wanted to write about the setup I had before.

Disclaimer: This post is likely full of bad ideas, and you probably shouldn't set up anything you care about like this. I find my most valuable learning comes from learning what not to do, and I know there are some gems lurking in here.

How it started

At home, I was running an 8th-gen i5 NUC doing double duty as my home router and server, called homegw. It was running Ubuntu 22.04 with LXD. Router stuff happened on the base OS, with dnsmasq, AdGuard Home and a bunch of ufw rules. It also ran xl2tpd to bring up my Andrews and Arnold L2TP connection so I could have real IPv6 here, which my cable ISP doesn't provide. [1]

There was an LXD VM for Home Assistant, and LXD containers for a backup server (restic and Arq over SSH), for Plex, and one running various tools for... ahem downloading Linux ISOs.

There was a Wireguard tunnel from this machine out to Mullvad, and a static route with NAT to on this link. Mullvad operates a SOCKS5 proxy on this address, so I configured all my Linux ISO downloading tools to use this proxy.

Another Wireguard tunnel connected to a dedicated server, hetty, running in Hetzner's Falkenstein data centre. It was nabbed from their server auction a little over a year ago. It was an 8-core 9th-gen i9 with 128GB RAM and 2x1TB NVMe drives, for the bargain price of 54€ per month. In retrospect, way more computer than I needed, and more money than I should have been spending.

This too ran Ubuntu 22.04, with LXD. In this case, a single LXD container called kubeservices running microk8s. I had Kubernetes set up with Flux, a GitOps tool that keeps your cluster in sync with a bunch of YAML defined in Git. Hosted here were:

  • This blog (Writefreely with MySQL)
  • The CrunchyData Postgres Operator providing PostgreSQL instances for most of the other services mentioned here
  • My Mastodon instance, with Elasticsearch and Libretranslate
  • Vaultwarden, a full-featured, self-hosted Bitwarden server
  • Synapse, a Matrix chat server
  • Prometheus, for gathering metrics on all the services, including from Home Assistant on the other end of the Wireguard tunnel, so I can get pretty graphs of my home electricity generation/usage, heating, etc.
  • Grafana for visualising those metrics
  • Keycloak for single sign-on to most of the above services

Kubernetes PersistentVolumeClaims provided all the stateful storage, using the microk8s host path provisioner. Everything ultimately ended up in a single directory on the host: an LXD custom volume with daily snapshots, backed up with restic to OneDrive [2] and to homegw.

I'd chosen to use LVM on top of a LUKS device made from an mdadm-initialised RAID1 mirror. ZFS was tempting, but whilst this was easy to set up with Ubuntu Desktop, it was less straightforward with Ubuntu Server. So I took the easy road. Or so I thought.

LXD creates a thin pool for its storage when using LVM, which eventually bit me when the space reserved for its metadata filled up. Despite having plenty of free data space, I couldn't figure out how to reallocate any of it to metadata (I don't think you can: as far as I can tell, the metadata volume can only grow into free extents in the volume group, and the thin pool had taken them all). So I ended up ejecting the second SSD from the RAID1 mirror and adding it to the volume group as a new disk, which gave me room to expand the thin pool and recover.

That was my first big “this was a mistake” moment: letting LXD allocate all the remaining volume group space to a single thin pool was a bad idea. Ultimately I think LVM was not a good choice for this kind of setup, and I wouldn't make it again.

Oh, and I'd made the same choice of LVM with LXD on homegw. Multiple times I filled the disk and had a bit of a nightmare recovering from it. One does not simply.
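
With hindsight, the thin pool metadata problem is something I could have watched for. Here's a minimal sketch in Python, not something I actually ran: it parses the output of `lvs` and flags any pool whose metadata usage is above a threshold. The 80% threshold, the `|` separator and the function names are all my own assumptions for illustration. Non-pool volumes report an empty `metadata_percent`, so they are skipped.

```python
import subprocess

# Assumed warning threshold; choose whatever margin suits your setup.
METADATA_WARN_PERCENT = 80.0

LVS_COMMAND = [
    "lvs", "--noheadings", "--separator", "|",
    "-o", "lv_name,data_percent,metadata_percent",
]


def parse_lvs(output):
    """Return (name, data_percent, metadata_percent) for each thin pool.

    Volumes that are not pools have an empty metadata_percent field
    and are ignored.
    """
    pools = []
    for line in output.splitlines():
        fields = [field.strip() for field in line.split("|")]
        if len(fields) != 3 or not fields[1] or not fields[2]:
            continue
        name, data, meta = fields
        pools.append((name, float(data), float(meta)))
    return pools


def pools_at_risk(pools, threshold=METADATA_WARN_PERCENT):
    """Names of pools whose metadata usage is at or above the threshold."""
    return [name for name, _data, meta in pools if meta >= threshold]


def read_lvs():
    """Run lvs (needs root) and return its raw output."""
    result = subprocess.run(
        LVS_COMMAND, check=True, capture_output=True, text=True
    )
    return result.stdout
```

Run from cron or a systemd timer, `pools_at_risk(parse_lvs(read_lvs()))` would return the names worth worrying about, well before the pool flips to read-only.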

Another “this was a mistake” moment came from filling OneDrive with restic backups. It turned out the Postgres Operator by default creates a single initial backup using pgBackRest, and then captures the write-ahead log forever. Eventually my daily restic snapshots, even with regular pruning, filled the dedicated OneDrive account I'd set up for the job. At that point, restic simply could not recover. Any kind of pruning operation needed some amount of storage space in OneDrive that was impossible to provide. You can pay for extra storage, but only on the primary Microsoft 365 account, so I couldn't buy my way out of it. In the end I trashed the entire restic repository and started again.

A simple change to the PostgresCluster YAML switched PGO to taking daily full backups, retaining up to 3 in the cluster. My disk usage went down from 600GB to 150GB after 3 days.

      # under spec.backups.pgbackrest
      global:
        repo1-retention-full: "3"
        repo1-retention-full-type: count
      repos:
      - name: repo1
        schedules:
          full: "0 4 * * *"
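
Retention fixed the growth, but the full OneDrive account could also have been caught earlier with a simple guard before each backup run. A rough sketch, assuming the remote is reached through rclone: `rclone about <remote> --json` reports `total`, `used` and `free` in bytes for backends that support it. The 15% margin, the function names and the `onedrive:` remote name in the usage note are hypothetical, and how much headroom restic really needs for a prune is something I never measured.

```python
import json
import subprocess


def free_fraction(about_json):
    """Return free/total from `rclone about --json` output.

    Returns None if the backend does not report both values.
    """
    info = json.loads(about_json)
    total = info.get("total")
    free = info.get("free")
    if not total or free is None:
        return None
    return free / total


def safe_to_backup(about_json, min_free=0.15):
    """True only if the remote reports at least min_free of its quota free.

    min_free is an assumed safety margin, not a figure from restic.
    """
    fraction = free_fraction(about_json)
    return fraction is not None and fraction >= min_free


def read_about(remote):
    """Ask rclone for the quota of a remote, e.g. a hypothetical 'onedrive:'."""
    result = subprocess.run(
        ["rclone", "about", remote, "--json"],
        check=True, capture_output=True, text=True,
    )
    return result.stdout
```

A backup script could call `safe_to_backup(read_about("onedrive:"))` and refuse to add another snapshot when it returns False, leaving enough room to prune.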

Another consequence of filling the metadata for the thin pool mentioned earlier was that the disk switched to read-only. Upon recovery I had some corrupted Postgres DBs and needed to restore from backup. The Postgres Operator makes that possible by poking at different parts of the YAML. It's an incredible tool, but not knowing it inside out, I spent a lot of time feeling frustrated that I couldn't just jump on the server and fix things. Instead, everything is orchestrated, and it feels a bit like operating a light switch with a broomstick.

For my use, I don't need anything like the capabilities PGO offers, and I should KISS.

The big takeaway

Understand your tools. Read the docs. Be curious about what can go wrong. Test those scenarios at your leisure, not in production.

Next time

All of that is gone. I'll be back to describe what replaced it.

[1] Notably, I've found IPv4 performance is often better over this link too, with lower ping times to many sites with AA compared to Virgin Media, despite having to transit VM to get to AA first.

[2] You can pick up Microsoft 365 Family for under 50GBP per year if you watch out for offers on the “gift card” version of it. This gives you 6x 1TB OneDrive accounts, which is some of the cheapest cloud storage out there. Encrypting what you put there is a good idea, so tools like restic with rclone are your friends.

#hosting #kubernetes #ubuntu #linux

Updated 2023-11-13 with features, prompted by Neil's helpful response, “Desktop Linux: the software I'm currently using”

With my new laptop coming in a few days, I'm finally thinking through the implications of moving away from macOS whilst still having an iPhone. These are some questions I need to answer:

  • How will I listen to music?
    • Currently streaming with Apple Music, blended by the Music app with my local music collection
  • How will I manage photos?
    • Currently in Photos, synced with iCloud, mostly originating from my phone, with a collection of old stuff originally synced from the Mac
  • How will I do quick annotations on screenshots, markup PDFs, resize/crop/export images without
  • What Passkeys do I need to migrate out of iCloud (or the Mac specifically)?
  • What other credentials are in my Mac/iCloud keychain that aren't in 1Password for some reason?
  • Safari has been my primary browser on macOS, and will likely continue to be on iOS
    • I tend to throw stuff at Reading List to pick up later, which I won't have access to on my new computer, so what to do instead?
  • Should I keep an old Mac around, running?
    • Logged into iCloud, it could provide another authentication factor for iDevices, which assume you've got some other Apple device nearby to do authentication
    • There are bridges for iMessage that could let me continue to read/reply to iMessage and SMS from my phone
    • I have an old MacBook Pro with a non-functioning screen that's too expensive to fix that could do the job
  • I subscribe to Microsoft 365, mainly for cheap cloud storage (effectively 6TB for around £45 per year when bought on semi-frequent offer), but do use Excel for a few tasks. Should I...
    • Migrate away to something else?
    • Use the web version?
    • Try Wine/Crossover/Windows VM?
  • I almost entirely interact with Mastodon through Ivory, on my phone and with the macOS app. I really like that it stays in sync across both with my read position. Am I just going to use the iOS app now, or is there some other solution?

I guess I'll find my answers in the coming weeks.

#macos #linux

That whole writing more thing went well, didn't it?

Let's try a little brain dump of the nerdy things I've assigned myself to do:

  • Replace macOS with NixOS as my primary computing experience.
    • Being a cloud infrastructure engineer, I love me some declarative configuration.
    • Apple annoyed me with the offer of a £700 fix for the most expensive computer I ever purchased that was just out of warranty and had display issues.
    • The £700 was to replace a display that's screwed because the connecting cable has been damaged by their hinge design. There is a company in London that offer a £300 repair. Still quite ouchy.
    • It's an Intel Mac, how long are they even likely to support it anyway?
    • It was already passed on to Sean, replaced for me by a 14” M1 Pro.
    • Framework seems kinda great, so I'm eagerly awaiting my Ryzen-based Framework 13. At which point Sean can have the 14” and become untethered from a desk again.
    • I'll miss the screen of the MacBook, but I think even more I'll miss the speakers.
  • Capture my (non-phone) computing world into a Git repo.
    • Nix allows me to describe my systems and my user environments in code!
    • On that whole phone thing, wouldn't it be nice if there were some genuinely open-source phone ecosystem without Google that could actually run the apps I've come to depend on (banking, etc)?
    • I've sadly accepted my iPhone 12 mini will eventually be replaced with another iPhone.
  • Adopt a more keyboard-centric computing life, with a tiling window manager.
    • Endless configuration tweaking awaits.
  • Have a go at (neo)vim being my primary editor again, or otherwise invest properly in VSCod(e|ium).
    • I might as well go all in, eh?
    • Failing at this and continuing to use GoLand and PyCharm wouldn't be terrible.
  • Migrate my Hetzner dedicated server running this site, and my Masto instance to something else (self-host, Hetzner cloud).
    • It's way over-specced for my current use (but 50€ for 128GB RAM, 2x1TB SSD and 8-core i9-9900K is amazing value).
    • I use microk8s on it as a single node instance, which is pretty inflexible, storage being a large part of my sense of unease with that.
    • I made the mistake of choosing LVM and let LXD create a thin-pool consuming 100% of the remaining storage.
    • It ran out of metadata space, and you can't grow it into the unused data space, so I had to de-RAID1 the SSDs to make things work so that's a bit of a mess now.
    • Said mess returns errors when trying to do operations like taking snapshots for backups, or even deleting old snapshots. A reboot might fix it, or it might leave me with a complex failed boot to resolve.
    • Hetzner Cloud seems to be pretty darn cheap, and I can create a proper Kubernetes environment with separate control plane and real storage volumes and come in at a similar or lower price than the dedicated server, albeit with fewer overall resources and with non-dedicated CPU.
    • I experimented briefly with Talos Linux on it yesterday, and it went pretty smoothly.
    • I can Terraform it (or OpenTF eventually, right?)
    • Or maybe I can just host it all at home on a single box?
  • Migrate my home server from Ubuntu with LXD to NixOS with something.
    • It's an 8th gen NUC in a fanless Asaka case.
    • It's my router and adds IPv6 to my home Internet with AAISP's L2TP service, because Virgin Media can't even, but I'm definitely taking their 1Gb/100Mb service over my next best option of 40Mb/10Mb DSL.
    • It's running an LXD-launched VM for Home Assistant, and LXD containers for Plex, some *aars, etc.
    • It's also built on the same LVM thin pool scheme as the Hetzner box, so is ultimately doomed.
    • I'm clearly going to NixOS it, and microvms looks super interesting.

I can't help but think of Amelie's father and his toolbox.

#nixos #linux #cloud