to have a highly-available home Internet setup, with no SPOF (Single Point of Failure)
to learn and have fun.
In summary
I have 3 physical machines plugged into 3 switches, with all switches
connected to each other. I don't have a physical
router/gateway. Instead, a Linux virtual machine handles the IPv4 NAT,
IPv6 announcements, DHCP, DNS, etc, and that Linux VM floats between
the 3 machines as needed, including live migration during maintenance.
My 4 Wi-Fi APs are PoE-powered from two of the switches. I have two ISPs.
I have two UPSes and two PDUs powering separate halves of the gear,
with each half including a different ISP's equipment, giving me about
35-45 minutes of runtime (and thus Internet) during a power outage.
The whole house might be dark, but the battery-powered Wi-Fi will still work.
1 x UniFi Switch 16 XG: 10Gbps Aggregation Switch, primarily for Ceph (but part of the same LAN). I only have one of these, but if it fails the Linux bond fails over to the 1Gbps switches.
UniFi Cloud Key to run the UniFi controller. This isn't necessary to boot the cluster. It just runs the pretty UI and is needed to add new devices. I could run the software on a VM too, I suppose. But I had it from earlier, so I'm still using it.
misc Raspberry Pis for monitoring
Power
The whole setup including all APs and switches draws about 220 watts
idle. Power is pretty cheap in Seattle. Washington State (as of April
2018) has the cheapest electricity in the United States, at
$0.0974/kWh.
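To put those two numbers together, here is a small back-of-the-envelope sketch. It assumes the 220 W idle figure holds around the clock (real draw will vary with load) and uses the April 2018 price quoted above; the function and constant names are just for illustration.

```go
package main

import "fmt"

// Figures taken from the text: ~220 W idle draw, $0.0974/kWh (April 2018).
const (
	idleWatts     = 220.0
	dollarsPerKWh = 0.0974
)

// annualCost returns the yearly electricity cost in dollars for a constant
// load of the given wattage at the given price per kWh.
// Assumption: the load is constant and a year has 365 days.
func annualCost(watts, pricePerKWh float64) float64 {
	const hoursPerYear = 24 * 365
	kWh := watts * hoursPerYear / 1000
	return kWh * pricePerKWh
}

func main() {
	year := annualCost(idleWatts, dollarsPerKWh)
	fmt.Printf("about $%.2f/year, $%.2f/month\n", year, year/12)
}
```

Under those assumptions the whole setup costs somewhere under $200 a year to keep running.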
Proxmox VE is the Debian-based base OS on the servers, and Proxmox is a nice UI for managing QEMU VMs and Ceph. I previously tried VMware for about a year; both are annoying in different ways. Proxmox might be a little rough in places, but I prefer it.
Ceph for storage. I love Ceph so much and discovering it makes this whole adventure worth it. Still much to learn, though.
ISC DHCP for the DHCP server. I auto-generate its config from a Go program that has a map of most of my important devices' MAC addresses.
CoreDNS for the DNS server on the gateway VM, which lets me encrypt all upstream DNS so ISPs can't see or mess with it (even though they can still see IPs and SNI).
tcpproxy, which Dave Anderson and I wrote. I use it on an HA VM to route ingress traffic to various VMs & services.
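As a rough idea of what generating ISC DHCP config from a Go map can look like, here is a minimal sketch. It is not the actual program: the `device` type, the device names, the MAC addresses (taken from the documentation range), and the IPs are all made up for illustration. Only the `host { hardware ethernet ...; fixed-address ...; }` stanza is standard `dhcpd.conf` syntax.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// device is a hypothetical record type; the real program's types are not
// shown in this document.
type device struct {
	MAC string
	IP  string
}

// devices holds made-up example entries keyed by hostname.
var devices = map[string]device{
	"switch-a": {MAC: "00:00:5e:00:53:01", IP: "10.0.6.10"},
	"cloudkey": {MAC: "00:00:5e:00:53:02", IP: "10.0.6.20"},
}

// hostStanzas renders one dhcpd.conf "host" declaration per device.
// Names are sorted so the generated file is stable between runs.
func hostStanzas(devs map[string]device) string {
	names := make([]string, 0, len(devs))
	for name := range devs {
		names = append(names, name)
	}
	sort.Strings(names)

	var b strings.Builder
	for _, name := range names {
		d := devs[name]
		fmt.Fprintf(&b, "host %s {\n  hardware ethernet %s;\n  fixed-address %s;\n}\n",
			name, d.MAC, d.IP)
	}
	return b.String()
}

func main() {
	fmt.Print(hostStanzas(devices))
}
```

Sorting the names matters more than it looks: Go map iteration order is random, so without it the generated config would churn on every run.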
Config
Network config
The LAN is 10.0.0.0/16.
Untrusted VLAN is 10.2.0.0/16, which the LAN can connect to, but untrusted machines can't initiate connections back to the LAN.
Gateway and DHCP are at 10.0.0.1 (and 10.2.0.1 for untrusted).
DHCP range is 10.0.100-199.x so leased addresses are easy to recognize. Likewise for the untrusted VLAN.
Networking gear has static IPs in 10.0.6.x (6 is above the letter N on the keyboard, which is how I usually map letters to numbers).
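The addressing scheme above is regular enough to check mechanically. The sketch below classifies an address using the ranges listed; it assumes the untrusted DHCP range is 10.2.100-199.x (my reading of "likewise"), and the `classify` function and its labels are invented for this example.

```go
package main

import (
	"fmt"
	"net/netip"
)

var (
	lan       = netip.MustParsePrefix("10.0.0.0/16")
	untrusted = netip.MustParsePrefix("10.2.0.0/16")
	netGear   = netip.MustParsePrefix("10.0.6.0/24") // static networking gear
)

// classify returns a human-readable label for an IPv4 address according to
// the ranges described in the text.
func classify(s string) string {
	addr, err := netip.ParseAddr(s)
	if err != nil || !addr.Is4() {
		return "not an IPv4 address"
	}

	var zone string
	switch {
	case lan.Contains(addr):
		zone = "LAN"
	case untrusted.Contains(addr):
		zone = "untrusted"
	default:
		return "outside the home networks"
	}

	b := addr.As4()
	switch {
	case b[2] == 0 && b[3] == 1:
		return zone + " gateway"
	case b[2] >= 100 && b[2] <= 199:
		// Assumption: both networks lease from x.x.100-199.x.
		return zone + " DHCP lease"
	case netGear.Contains(addr):
		return "networking gear (static)"
	}
	return zone + " (other)"
}

func main() {
	for _, ip := range []string{"10.0.0.1", "10.0.6.3", "10.0.142.9", "10.2.150.7", "192.168.1.1"} {
		fmt.Printf("%-12s %s\n", ip, classify(ip))
	}
}
```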
TODO: link to program with dependency graph of all devices, services,
and connections, and to simulate failures to validate there are no
hidden SPOFs.
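Until that program exists, here is a minimal sketch of the idea: model components as an undirected graph, fail each one in turn, and check whether a client can still reach the Internet. The topology in `exampleTopology` is a made-up simplification (two APs, two switches, three hosts that can each run the gateway, two ISPs), not an accurate map of my gear, and it ignores power entirely.

```go
package main

import (
	"fmt"
	"sort"
)

// graph is an undirected adjacency list keyed by component name.
type graph map[string][]string

// link records that a and b can carry traffic to each other.
func (g graph) link(a, b string) {
	g[a] = append(g[a], b)
	g[b] = append(g[b], a)
}

// reachable reports whether dst can be reached from src (breadth-first
// search) while the component named failed is treated as down.
func (g graph) reachable(src, dst, failed string) bool {
	if src == failed || dst == failed {
		return false
	}
	seen := map[string]bool{src: true}
	queue := []string{src}
	for len(queue) > 0 {
		n := queue[0]
		queue = queue[1:]
		if n == dst {
			return true
		}
		for _, next := range g[n] {
			if next != failed && !seen[next] {
				seen[next] = true
				queue = append(queue, next)
			}
		}
	}
	return false
}

// spofs returns, sorted, every component other than src and dst whose
// failure alone disconnects src from dst.
func (g graph) spofs(src, dst string) []string {
	var out []string
	for n := range g {
		if n != src && n != dst && !g.reachable(src, dst, n) {
			out = append(out, n)
		}
	}
	sort.Strings(out)
	return out
}

// exampleTopology builds a hypothetical, simplified topology.
func exampleTopology() graph {
	g := graph{}
	g.link("client", "ap1")
	g.link("client", "ap2")
	g.link("ap1", "sw1")
	g.link("ap2", "sw2")
	g.link("sw1", "sw2")
	for _, h := range []string{"host1", "host2", "host3"} {
		g.link("sw1", h)
		g.link("sw2", h)
		// Whichever host runs the gateway VM can use either ISP.
		g.link(h, "isp1")
		g.link(h, "isp2")
	}
	g.link("isp1", "internet")
	g.link("isp2", "internet")
	return g
}

func main() {
	g := exampleTopology()
	fmt.Printf("single points of failure: %v\n", g.spofs("client", "internet"))
}
```

This only catches single failures on one path. A fuller version would also model power (UPS and PDU per device) and services such as DHCP and DNS, and would try pairs of failures.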
Past failures
I used to use a Soekris net6501 as my home gateway, but its CPU maxes out NAT'ing at about 300 Mbps, sadly, so I started looking at alternatives when I got CenturyLink fiber.
A truck once clipped the fiber running to our house. It's nice having a second WAN link.
I used to use a UniFi Security Gateway Pro but it failed one day and wouldn't power on anymore. Dave had a backup handy for me, but the UniFi controller software wedged itself and wouldn't let me remove the old (dead) one, and thus I couldn't add the new replacement, since you can only have one gateway in a site at a time. I was not amused, and that was the final straw that made me realize I wanted a highly-available setup.
I used to use VMware with a highly-available vCenter setup, but the whole thing felt bloated and slow and enterprisey, and I couldn't stand the Flash UI, which was still required for many operations. That's increasingly going away and being replaced with HTML5, but I also couldn't stand the VMware enterprise-targeted documentation. And I wanted to use something Open Source, too.
Thanks
Much thanks to Dave Anderson for
helping with a lot of this. He has a very similar setup at his home
and we enjoy watching each other both succeed and fail at trying new
things.