The home lab
Everything I build runs on hardware I own. The home lab is the substrate — two AMD Strix Halo machines linked over Thunderbolt 5, plus a Mac Mini for daily dev work. AWS is for things that need to survive a power cut.
192GB
Combined RAM
160GB
VRAM
8
Always-on models
9+
AWS accounts
Nodes
Furnace
Primary compute & gateway
GMKTec EVO-X2
AMD Strix Halo · 128GB RAM · 96GB VRAM · 1.9TB NVMe + 932GB SSD
Services
- ›Forge LLM gateway (port 8642)
- ›Llama 3.3 70B (RPC offload to Crucible)
- ›Qwen3.5 35B-A3B priority router
- ›Qwen3-VL 32B (vision)
- ›nomic-embed (embeddings)
- ›Whisper (transcription)
- ›InsightFace (face detection)
- ›Florence-2 OCR
- ›Chatterbox TTS
- ›PostgreSQL 16 (ARIA + Nexus)
- ›Caddy + CoreDNS
- ›Prometheus + Grafana
Crucible
Satellite compute
GMKTec EVO-X2
AMD Strix Halo · 64GB RAM · 64GB VRAM · Thunderbolt 5 to Furnace (40Gbps, 0.12ms)
Services
- ›nexus-scaler (autoscaler for bulk workers)
- ›RPC worker (Llama 70B tensor layers)
- ›ComfyUI / FLUX.1 image gen (planned)
Anvil
Dev environment
M4 Mac Mini
Apple Silicon · 16GB unified memory
Services
- ›Claude Code workstation
- ›ARIA Relay daemon
- ›Photo + media sync agents
Bellows
Out-of-band management
GL.iNet GL-RM1 KVM
Remote KVM @ 10.0.0.250
Services
- ›Furnace remote console
- ›Power cycle / BIOS access
The stack
Compute
- ·AMD Strix Halo ×2 (Furnace + Crucible)
- ·M4 Mac Mini (Anvil)
- ·Thunderbolt 5 interconnect (40 Gbps)
Inference
- ·llama.cpp (primary runtime)
- ·vLLM (batch + bulk)
- ·Forge — OpenAI-compatible gateway over everything
- ·8 always-on models, ~65GB VRAM baseline
Data
- ·PostgreSQL 16 (ARIA + Nexus, ~250 tables)
- ·SQLite (request logging, edge caches)
- ·S3 cold storage (media archives)
Network
- ·Tailscale tailnet (zero-trust)
- ·CoreDNS for *.niclydon.io
- ·Caddy with wildcard TLS
- ·Cloudflare for public DNS + edge
Cloud
- ·AWS Organization with 9+ accounts
- ·Centralized secrets in AWS Secrets Manager
- ·Vercel for marketing sites (this one)
- ·Cloud LLMs as fallback only
Observability
- ·Prometheus (15s scrape)
- ·Grafana (25 dashboards)
- ·10 alert rules
- ·Centralized log monitor every 60s
Topology
┌─────────────┐ ┌─────────────┐
│ Anvil │ │ Bellows │
│ M4 Mac │ │ KVM/IPMI │
└──────┬──────┘ └──────┬──────┘
│ │
│ Tailscale │ OOB
│ │
┌──────┴───────────────────────┴──────┐
│ FURNACE │
│ Strix Halo · 128GB · 96GB VRAM │
│ Forge · PostgreSQL · Caddy │
│ Prometheus · Grafana · ARIA │
└──────────────────┬──────────────────┘
│ Thunderbolt 5
│ 40 Gbps · 0.12ms
┌──────────────────┴──────────────────┐
│ CRUCIBLE │
│ Strix Halo · 64GB · 64GB VRAM │
│ RPC worker · nexus-scaler │
└─────────────────────────────────────┘