Zero-Trust Networking in a Homelab

The Problem with VLANs

Every homelab eventually reaches the point where the network diagram looks like spaghetti. Mine hit that wall around the time I was running six VLANs, a pile of firewall rules I couldn’t fully explain, and a growing suspicion that half of them were wrong.

The core issue wasn’t complexity per se — it was that the complexity wasn’t buying me anything. VLANs are a layer-2 segmentation tool designed for enterprise switch fabrics. In a homelab with two Proxmox nodes and a NAS, they’re mostly cosplay.

What Zero-Trust Actually Means Here

The term gets thrown around a lot, usually by vendors selling identity products. In a homelab context, it’s simpler:

  • No implicit trust based on network position. Being on the “management VLAN” doesn’t grant access to anything.
  • Identity-based access. Every connection is authenticated and authorized per-service.
  • Encrypted by default. Everything over WireGuard or TLS, no exceptions.
// Tailscale ACL snippet: declarative access control
// (the tags used here must also be declared under "tagOwners", omitted for brevity)
{
  "acls": [
    { "action": "accept", "src": ["tag:admin"],      "dst": ["*:*"] },
    { "action": "accept", "src": ["tag:monitoring"], "dst": ["*:9090", "*:9100"] },
    { "action": "accept", "src": ["tag:backup"],     "dst": ["tag:nas:*"] }
  ]
}

The ACL file is the entire network policy. No firewall rules scattered across three devices. No “wait, which VLAN is that on?” conversations with yourself at 11pm.
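
To make the default-deny model concrete, here is a minimal Python sketch of how rules like the ones above could be evaluated. It is a simplified illustration, not Tailscale's actual evaluator: it only understands tags and literal ports, and the names `is_allowed` and `dst_matches` are made up for this example.

```python
# Simplified model of the ACL snippet above; not Tailscale's real evaluator.
POLICY = {
    "acls": [
        {"action": "accept", "src": ["tag:admin"], "dst": ["*:*"]},
        {"action": "accept", "src": ["tag:monitoring"], "dst": ["*:9090", "*:9100"]},
        {"action": "accept", "src": ["tag:backup"], "dst": ["tag:nas:*"]},
    ]
}


def dst_matches(pattern: str, dst_tag: str, port: int) -> bool:
    """Match one "host:port" destination pattern against a tagged target."""
    # The port spec is everything after the last colon; tags contain a colon too.
    host, _, port_spec = pattern.rpartition(":")
    host_ok = host == "*" or host == dst_tag
    port_ok = port_spec == "*" or port_spec == str(port)
    return host_ok and port_ok


def is_allowed(policy: dict, src_tag: str, dst_tag: str, port: int) -> bool:
    """Default deny: a connection passes only if some accept rule matches."""
    for rule in policy["acls"]:
        if rule["action"] != "accept":
            continue
        if src_tag not in rule["src"]:
            continue
        if any(dst_matches(p, dst_tag, port) for p in rule["dst"]):
            return True
    return False


print(is_allowed(POLICY, "tag:monitoring", "tag:nas", 9100))  # True
print(is_allowed(POLICY, "tag:monitoring", "tag:nas", 22))    # False
```

The point of the sketch is the last line of `is_allowed`: nothing is reachable unless a rule says so, regardless of which network segment either machine sits on.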

The Migration

I didn’t rip everything out at once. The approach:

  1. Install Tailscale on every node. Proxmox hosts, VMs, containers, NAS boxes.
  2. Move services to Tailscale addresses. One at a time, starting with monitoring.
  3. Add ACL tags. Group machines by role, not network location.
  4. Remove VLAN dependencies. Once a service was fully on Tailscale, flatten its networking.
  5. Keep one VLAN for IoT quarantine. Some devices can’t run Tailscale and shouldn’t be trusted.
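
During step 3 it helps to see which machines already carry which tag. The sketch below groups nodes by tag from the output of `tailscale status --json`. The field names it reads (`Self`, `Peer`, `HostName`, `Tags`) are assumptions based on that command's JSON output, which is not guaranteed to stay stable, and the function names are hypothetical.

```python
import json
import subprocess
from collections import defaultdict


def group_peers_by_tag(status: dict) -> dict[str, list[str]]:
    """Map each tag to the hostnames carrying it; untagged nodes go under "untagged"."""
    groups: dict[str, list[str]] = defaultdict(list)
    # Assumed layout: "Self" is one node object, "Peer" maps node keys to node objects.
    nodes = [status.get("Self") or {}] + list((status.get("Peer") or {}).values())
    for node in nodes:
        name = node.get("HostName")
        if not name:
            continue
        for tag in node.get("Tags") or ["untagged"]:
            groups[tag].append(name)
    return {tag: sorted(names) for tag, names in groups.items()}


def load_status() -> dict:
    """Read live state from the local Tailscale CLI (needs tailscale on PATH)."""
    result = subprocess.run(
        ["tailscale", "status", "--json"],
        check=True,
        capture_output=True,
        text=True,
    )
    return json.loads(result.stdout)
```

On a machine that is already in the tailnet, `group_peers_by_tag(load_status())` gives a quick list of anything still sitting under `"untagged"` before VLAN dependencies are removed.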

What Broke

DNS. Always DNS. Tailscale’s MagicDNS is excellent for hostname.tailnet resolution, but I had a lot of hardcoded IPs and custom DNS records. It took about a week to find everything that had quietly stopped working.
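
Hunting for hardcoded addresses can be partly automated. The following is a rough sketch, not a complete audit tool: it flags RFC 1918 IPv4 literals in config-like files. The function names and the list of file suffixes are arbitrary choices for this example.

```python
import ipaddress
import re
from pathlib import Path

# Dotted-quad candidates that are not part of a longer dotted number.
IPV4_CANDIDATE = re.compile(r"(?<![\d.])(?:\d{1,3}\.){3}\d{1,3}(?![\d.])")

# Private LAN ranges; Tailscale's 100.64.0.0/10 range is deliberately not listed.
RFC1918 = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def find_private_ips(text: str) -> list[tuple[int, str]]:
    """Return (line number, address) for each RFC 1918 IPv4 literal in text."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for candidate in IPV4_CANDIDATE.findall(line):
            try:
                addr = ipaddress.IPv4Address(candidate)
            except ipaddress.AddressValueError:
                continue  # looks like an IP but is not one, e.g. 999.1.1.1
            if any(addr in net for net in RFC1918):
                hits.append((lineno, candidate))
    return hits


def scan_tree(
    root: str, suffixes: tuple[str, ...] = (".yml", ".yaml", ".conf", ".ini")
) -> dict[str, list[tuple[int, str]]]:
    """Scan matching files under root and report the ones with hits."""
    report = {}
    for path in Path(root).rglob("*"):
        if path.is_file() and path.suffix in suffixes:
            hits = find_private_ips(path.read_text(errors="ignore"))
            if hits:
                report[str(path)] = hits
    return report
```

Running `scan_tree` over a directory of compose files and service configs produces a to-do list of addresses to replace with hostnames. It will not catch IPs stored in databases or web UIs, which is where the quiet failures tend to hide.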

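The other surprise: Docker networking. Containers on a Docker bridge network can’t use the host’s Tailscale interface directly without some plumbing. I ended up using network_mode: host for services that needed Tailscale access, which isn’t ideal but works. An alternative is the sidecar pattern below, where the service shares the network namespace of a dedicated Tailscale container instead of the host’s.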

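# Docker Compose: Tailscale sidecar pattern
services:
  tailscale:
    image: tailscale/tailscale:latest
    hostname: grafana-ts
    environment:
      - TS_AUTHKEY=${TS_AUTHKEY}
      - TS_STATE_DIR=/var/lib/tailscale
    volumes:
      # Persist node state so the container keeps its identity across restarts
      - tailscale-state:/var/lib/tailscale
    cap_add:
      - NET_ADMIN
  grafana:
    image: grafana/grafana:latest
    # Share the sidecar's network namespace instead of a Docker bridge
    network_mode: "service:tailscale"
    depends_on:
      - tailscale

# Named volumes must be declared at the top level
volumes:
  tailscale-state: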

Results

After three weeks of gradual migration:

| Before                         | After                   |
| ------------------------------ | ----------------------- |
| 6 VLANs                        | 1 (IoT quarantine only) |
| ~80 firewall rules             | 12 ACL lines            |
| 3 places to check access       | 1 ACL file in git       |
| “Is this port open?” debugging | tailscale status        |

The biggest win isn’t technical — it’s cognitive. I can look at one file and understand who can talk to what. That was never true with the VLAN setup.

When This Doesn’t Work

A few cases where the old approach is still better:

  • High-bandwidth local transfers. NFS mounts between Proxmox and NAS should stay on the physical network. WireGuard encryption overhead matters at 10 Gbps.
  • PXE boot and IPMI. PXE depends on DHCP and TFTP before any operating system is running, and IPMI lives on a baseboard controller that typically can’t run a Tailscale client. They need a flat network segment.
  • Latency-sensitive workloads. The WireGuard hop adds ~1 ms. Usually irrelevant, but not for real-time audio/video processing.
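
Whether the extra hop matters for a given service is easy to measure rather than guess. Below is a small sketch that times TCP handshakes against a host and port; running it once against the LAN address and once against the Tailscale address of the same service gives a rough comparison. The function name and defaults are made up for this example, and it measures connection latency only, not throughput.

```python
import socket
import statistics
import time


def tcp_connect_ms(host: str, port: int, samples: int = 20, timeout: float = 2.0) -> float:
    """Median time in milliseconds to complete a TCP handshake with host:port."""
    timings = []
    for _ in range(samples):
        start = time.perf_counter()
        # create_connection returns once the handshake has completed.
        with socket.create_connection((host, port), timeout=timeout):
            pass
        timings.append((time.perf_counter() - start) * 1000)
    return statistics.median(timings)
```

For example, comparing `tcp_connect_ms("192.168.1.20", 2049)` with `tcp_connect_ms("nas", 2049)` (both addresses hypothetical) shows the per-connection cost of the tunnel. For the 10 Gbps case, a throughput tool such as iperf3 is the better instrument.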

The Takeaway

Homelabs are learning environments. The VLAN phase taught me how enterprise networking works. The zero-trust phase taught me that most of that complexity exists because enterprises can’t easily change their foundations — and I can.

If your homelab network makes you nervous to touch, that’s the signal. The whole point is to be able to break things and fix them. Simpler foundations make that safer.