· 2 min read · homelab
GPU PCI Address Instability: When Your Card Moves Between Reboots
Dealing with shifting PCI addresses in Proxmox and how to stop your GPU from disappearing or changing IDs after a reboot.
proxmox gpu-passthrough pcie homelab nvidia
· 5 min read · ai-agents
Cognitive Memory for Agents: Vector Search vs Activation-Based Recall
Comparing vector databases and activation-based memory for AI agents. Trade-offs in latency, scale, and interpretability.
ai-agents vector-databases llm-memory cognitive-architecture pytorch faiss
· 2 min read · infrastructure
AdGuard Home: Network-Wide DNS Filtering with Failover
Setting up AdGuard Home for network-wide DNS filtering with a robust failover strategy to prevent total internet outages.
dns adguard-home infrastructure kubernetes networking failover
· 3 min read · homelab
Unprivileged LXC + Docker: The runc Sysctl Permission Trap
Fixing the silent failure of sysctl settings when running Docker inside unprivileged Proxmox LXC containers.
proxmox lxc docker sysctl runc homelab
· 8 min read · ai-agents
Three-Layer Safety for Autonomous Agents: Stopping the Infinite Loop
Moving beyond prompt engineering to implement token-level schema enforcement, pre-execution gates, and shell-safe execution pipelines for AI agents.
ai-agents llm-ops mcp-servers ollama pydanticai agent-safety
· 2 min read · infrastructure
Stop Merging Broken YAML: Kubernetes Manifest Validation in CI
Don't let invalid manifests break your GitOps pipeline. Learn how to use kubeconform and Kyverno exclusions to catch errors before they hit production.
kubernetes gitops ci-cd infrastructure kubeconform kyverno
· 3 min read · homelab
GPU D3cold Power States: How to Brick Your Card Without Trying
Bricking GPUs with D3cold: Real-world gotchas and fixes for Proxmox users
gpu-passthrough d3cold proxmox nvidia homelab kernel qemu
· 7 min read · kubernetes
cert-manager + Cloudflare DNS-01: Automated TLS for Everything
Automating TLS with cert-manager and Cloudflare DNS-01 in Kubernetes
cert-manager cloudflare kubernetes tls dns01 homelab infrastructure
· 2 min read · kubernetes
SealedSecrets Key Backup: Don't Lose Your Encryption Keys
How to back up and recover SealedSecrets encryption keys in Kubernetes
kubernetes sealed-secrets encryption key-management gitops argocd security
· 3 min read · ai-agents
Ollama on Kubernetes: Recreate Strategy and Single-GPU Deadlock
Deploying Ollama on Kubernetes can lead to GPU deadlocks. Here's how to avoid them.
ollama kubernetes gpu-deadlock recreate-strategy nvidia-runtime pvc-sizing
· 4 min read · iiot
MQTT Broker Selection: HiveMQ vs Mosquitto for Industrial Use
Comparing HiveMQ and Mosquitto for industrial IoT: scalability, security, and reliability
mqtt-broker iiot security scalability industrial-iot
· 4 min read · infrastructure
Wildcard DNS + ndots:5: The TLS Nightmare and How to Fix It
Kubernetes default DNS settings can cause TLS certificate mismatches when using wildcard DNS. Here is how to debug and fix it.
kubernetes dns tls networking infrastructure
· 5 min read · ai-agents
Self-Improving AI Infrastructure: How Your Homelab Wiki Updates Itself
How to automate your homelab wiki with self-improving AI infrastructure
ai-agents self-improving-systems homelab automation infrastructure kubernetes longhorn
· 6 min read · ai-agents
The 6-Layer Memory Architecture I Run for Claude Code
Open-sourcing the memory system behind my Claude Code setup: CLAUDE.md, path-scoped rules, wiki, vector search, cognitive memory. With the mistakes.
ai-agents claude-code memory rag llm-wiki mcp homelab kubernetes
· 6 min read · ai-agents
Building Karpathy's LLM Wiki: A Production Homelab Implementation
Implementing Karpathy's LLM Wiki in a homelab with real-world lessons and gotchas
ai-agents llm-wiki homelab kubernetes proxmox infrastructure
· 1 min read · homelab
Proxmox API Tokens: Bash History Expansion and the ! Character
Bash history expansion breaks Proxmox API tokens — here's how to fix it
proxmox bash api-tokens homelab security scripting
· 3 min read · homelab
AMD iGPU Stealing Your RAM: UMA Frame Buffer on Headless Servers
The AMD iGPU steals RAM on headless servers. Here's how to fix it.
headless-servers amd-igpu ram-leak umf-frame-buffer proxmox homelab kubernetes
· 5 min read · ai-agents
Agent Credential Management: Two-Tier Service Accounts for Secure AI Agent Workflows
Managing agent credentials with two-tier service accounts: a secure approach for AI agent orchestration
ai-agents credential-management security service-accounts multi-agent-systems kubernetes
· 5 min read · infrastructure
Infrastructure as Code, but Automated: OpenTofu and GitHub Actions
Stop manual applies. Learn how to build a production-ready CI/CD pipeline for your infrastructure using OpenTofu and GitHub Actions.
infrastructure opentofu github-actions gitops cicd
· 2 min read · iiot
Equipment Health Scoring: How One Number Made My Operators Stop Checking the Dashboard
Turn raw sensor data into a single, actionable health score operators actually use
iiot predictive-maintenance sensor-data equipment-health dashboard-design
· 3 min read · kubernetes
Pod Disruption Budgets: Why kubectl drain Gets Stuck on Longhorn
Pod Disruption Budgets can block kubectl drain on Longhorn. Here's how to avoid it.
kubernetes longhorn pod-disruption-budgets node-drain storage
· 4 min read · ai-agents
Attention Residuals: How Kimi Is Rethinking Transformer Depth
Kimi's Attention Residuals replace fixed residual connections with learned layer aggregation. What it means for LLM depth.
ai transformers llm-architecture attention deep-learning kimi
· 1 min read · kubernetes
Helm fullnameOverride: Naming Sanity in ArgoCD
Avoid naming chaos in ArgoCD by using Helm fullnameOverride effectively
helm argocd kubernetes naming charts
· 3 min read · ai-agents
NVIDIA Container Toolkit: Why the Default Runtime Matters
Fixing default runtime misconfigurations in NVIDIA Container Toolkit for GPU workloads
nvidia-runtime containerd kubernetes ai-agents gpu-container
· 2 min read · homelab
DOCP/XMP: Why Your Proxmox Node Runs at Half RAM Speed
DOCP/XMP enabled but Proxmox still runs at half RAM speed? Check your kernel and BIOS.
proxmox homelab xmp docp ram performance
· 3 min read · homelab
AMD Ryzen C-State Freezes: How `processor.max_cstate=1` Saved My Proxmox Node
Ryzen freezes in Proxmox? Learn how to disable deep C-states and stop random system lockups.
proxmox ryzen homelab c-state kubernetes
· 6 min read · kubernetes
Kubernetes Storage on Bare Metal: Longhorn in Practice
How I configured Longhorn storage for a Kubernetes cluster on bare metal — what worked, what didn't, and what I'd do differently next time.
kubernetes-storage longhorn bare-metal homelab infrastructure
· 7 min read · homelab
GPU Passthrough on Proxmox: A Field Guide to the Gotchas That Bit Me
The documentation won't warn you about D3cold bricking, PCIe bus renumbering, or why the NVIDIA device plugin silently fails. This is that guide.
proxmox gpu-passthrough homelab nvidia kubernetes pci-passthrough
· 8 min read · ai-agents
Building MCP Servers with FastMCP: Stop Writing Boilerplate, Start Writing Tools
FastMCP makes building Model Context Protocol servers feel like FastAPI. Here's how to go from zero to a working MCP server in under an hour.
mcp-servers ai-agents python llm-tooling developer-tools
· 7 min read · kubernetes
GitOps for Homelabs: How ArgoCD App-of-Apps Scales Your Cluster
How the ArgoCD app-of-apps pattern brings real GitOps discipline to homelab Kubernetes — repo structure, examples, and what I'd do differently.
gitops argocd kubernetes homelab continuous-delivery
· 9 min read · ai-agents
Multi-Agent AI Systems: Architecture Patterns That Actually Work
A practical guide to designing multi-agent AI systems — orchestrator patterns, trust boundaries, and the tradeoffs I learned running agents in production.
ai-agents multi-agent llm architecture orchestration mcp-servers
· 9 min read · homelab
Building a Production Homelab: Multi-Node Proxmox Cluster with Kubernetes
How I built a multi-node Proxmox cluster running Kubernetes with GPU passthrough, GitOps, and dozens of services — and what broke along the way.
proxmox kubernetes homelab gitops gpu-passthrough longhorn argocd