Just before Christmas, I spent an hour or so setting up uncloud on my homelab, and I am stunned at how easy it was to get working.
The motivation for doing this is that I've known for a long time that Swarmpit is basically abandoned. Disappointing, but true. The latest release of DietPi, my preferred distro for my Raspberry Pi and RockChip SBCs, included an update to Docker and Docker Compose that completely broke Swarmpit. Cue panicked hunting for alternatives, and a fortuitous discovery of Uncloud.
Here's how I set it up.
DNS
- Added a wildcard CNAME DNS record pointing `*.suranyami.com` to my dynamic DNS address: `suranyami.duckdns.org`.
Tailscale
- Installed `tailscale` on each of the machines (Installation Instructions).
- Because I'm using DietPi, the recommended way to install `tailscale` is via `dietpi-software`.
- Connected each machine to my free tailnet (the free tier allows up to 100 nodes) using `sudo tailscale up` and following the on-screen instructions.
This gives me a stable URL for each individual machine that I can SSH into without needing to do NAT redirection on the router. For instance, my machine called `node1` is available to me (and only me) at `ssh dietpi@node1.tailxxxxx.ts.net`.
SSH config
- Updated my `~/.ssh/config` with entries for all the machines that look like this:

```
Host node1
  Hostname node1.tailxxxxx.ts.net
  User dietpi
```
Uncloud
- Installed uncloud on my laptop:

```shell
curl -fsS https://get.uncloud.run/install.sh | sh
```
- Initialized the cluster by picking one of the above machines as a first server:

```shell
uc machine init dietpi@node1.tailxxxxx.ts.net --name node1
```
NAT port redirection
- Because all my machines are behind NAT, I configured my router to map ports 80 and 443 to that machine. It could ultimately be any of the machines configured after this; the important point is that at least one machine running the Caddy reverse proxy that Uncloud installs needs to receive ports 80 and 443 from the outside world.
Add more machines
- Add other machines using:

```shell
uc machine add dietpi@node2.tailxxxxx.ts.net --name node2
```
Deploy services
- Deploy services using `uc deploy -f plex.yml`, where `plex.yml` is a subset of a docker-compose file with minor changes. For instance, to deploy to a specific machine (which I have to do because I need to redirect port 32400 from the router to one specific machine; Plex is annoying like that), I do this:
```yaml
services:
  plex:
    image: linuxserver/plex:arm64v8-latest
    # ...
    x-machines:
      - node2
    x-ports:
      - 32400:32400@host
      - plex.suranyami.com:32400/https
```
And that's about it. No manual reverse-proxy configuration, no manual entry of IP addresses; everything is automatically given a Let's Encrypt SSL certificate and load-balanced to wherever the servers are running.
This is honestly the easiest way to self-host anything I've found.
It's been two weeks or so now, and having got the knack of the `x-ports` port-mapping syntax, I've managed to get all my other services running everywhere.
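For reference, these are the two `x-ports` forms I've been using. This is my reading of the syntax rather than anything official, and the hostname and ports below are illustrative:

```yaml
x-ports:
  # host mode: publish the container port directly on the machine's own
  # network interfaces, bypassing the reverse proxy
  - 8080:80@host
  # HTTPS ingress: the built-in Caddy routes this hostname to the container
  # port, with an automatic TLS certificate
  - myapp.example.com:80/https
```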
Notable edge cases were:
Minecraft
```yaml
x-ports:
  - 25565:25565@host
```
Plex
```yaml
x-ports:
  - 32400:32400@host
  - plex.suranyami.com:32400/https
```
Plex needed two mappings: one on the internal subnet for use by the Apple TV, because of some idiosyncrasy of the way the native Plex app works behind NAT versus over t'interwebz.
Jellyfin
```yaml
x-ports:
  - 1900:1900@host
  - 7359:7359@host
  - jellyfin.suranyami.com:8096/https
```
The only outages I've had so far were purely hardware-related: the robo-vacuum somehow knocked out a power cord that was already loose… derp. That won't happen again. And the fan software wasn't installed on my RockPi 4 NAS box, so it overheated and shut down. Fixed that this morning.
Global deployment
I'm currently using Netdata to monitor my nodes. It's WAY overkill for what I'm running, but hey, whatever. For this we need to do a global deployment:
```yaml
services:
  netdata:
    image: netdata/netdata:latest
    hostname: "{{.Node.Hostname}}"
    # ...
    volumes:
      # ...
      - /etc/hostname:/host/etc/hostname:ro
    deploy:
      mode: global
```
This is essentially the same as a normal docker-swarm compose file, but because it's not actually docker-swarm, the `- /etc/hostname:/host/etc/hostname:ro` line is a hack to get the hostname.
There is also a quirk that (hopefully) might be fixed in future versions of uncloud: the volumes don't get created automatically on each machine. For that I had to execute a bunch of `uc volume create` commands like this:
```shell
uc volume create netdataconfig -m node2
uc volume create netdataconfig -m node3
uc volume create netdataconfig -m node4
uc volume create netdatalib -m node2
uc volume create netdatalib -m node3
uc volume create netdatalib -m node4
uc volume create netdatacache -m node2
uc volume create netdatacache -m node3
uc volume create netdatacache -m node4
```
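To save some typing, the same nine commands can be generated with a quick shell loop. This is just a sketch that prints the commands (it assumes the machines are named node2 through node4, as above); pipe the output through `sh` to actually run them:

```shell
# Print a `uc volume create` command for every machine/volume pair.
for m in node2 node3 node4; do
  for v in netdataconfig netdatalib netdatacache; do
    echo "uc volume create $v -m $m"
  done
done
# To execute for real, append: | sh
```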
Replicated deployment
One very nice feature is replicated deployment with automatic load balancing. There's not a lot of documentation about how it works at the moment, so I'm a bit suss on it, but essentially it looks like this in the compose file:
```yaml
deploy:
  mode: replicated
  replicas: 4
```
This will cause it to pick a random set of machines and deploy a container on each, and load-balance incoming requests.
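Putting that together with an HTTPS route, a replicated stateless web service might look something like this. The service name, image, and domain are illustrative, not from my actual setup:

```yaml
services:
  whoami:
    image: traefik/whoami:latest
    x-ports:
      # requests to this hostname get load-balanced across all replicas
      - whoami.example.com:80/https
    deploy:
      mode: replicated
      replicas: 4
```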
There are caveats to this, of course. The service configuration will need to be on a shared volume, for instance, and some services do NOT behave well in this situation. Plex is the worst example of this… if you store its configuration, caches, and DB on a shared volume, you are gonna have a very bad time indeed because of race conditions, non-atomicity, file corruption, etc.
Which is a shame, because Plex is the service I'd most like to be replicated. I dunno what the solution is. "Use something other than Plex" seems like the most obvious answer, but as far as I know the alternatives have the same issue.
Discuss...