Multi-Model Failover In Your AI Gateway
Think about two scenarios that are pretty common. 1) You hit a rate limit or run out of tokens, so you have to "downgrade" to a smaller, less powerful Model. 2)
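Scenario one can be sketched in a few lines: try Models in priority order and fall back to the next (smaller) one when a call fails. The `RateLimitError`, `call_model`, and Model names below are illustrative placeholders, not a real gateway API — a real implementation would wrap actual provider calls.

```python
# Minimal multi-model failover sketch. call_model stands in for a real
# provider call; here the primary model always raises a rate-limit error
# so the fallback path is exercised.

class RateLimitError(Exception):
    """Raised when a provider rejects a request for quota reasons."""

def call_model(model: str, prompt: str) -> str:
    # Placeholder provider call: pretend the large model is rate-limited.
    if model == "large-model":
        raise RateLimitError(f"{model}: rate limit exceeded")
    return f"{model} answered: {prompt}"

def generate_with_failover(models: list[str], prompt: str) -> str:
    """Try each model in priority order (most capable first)."""
    last_error: Exception | None = None
    for model in models:
        try:
            return call_model(model, prompt)
        except RateLimitError as err:
            last_error = err  # "downgrade" to the next, smaller model
    raise RuntimeError("all models exhausted") from last_error

print(generate_with_failover(["large-model", "small-model"], "hello"))
# -> small-model answered: hello
```

The same loop generalizes to timeouts or provider outages by widening the caught exception types.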
Managing an Agent's Uptime (Reliability Engineering for Agents)
"treat 'em like cattle, not pets".
This was, and continues to be, how many look at Kubernetes Pods and microservice-based architecture. It makes a lot of sense for objects like
Configuring Tool Traces In Your MCP Gateway
An Agent makes a call to an LLM. The LLM decides which MCP server tool should be used for a task. The Agent then makes a call to said tool. This can happen
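That Agent → LLM → tool loop can be sketched roughly as below. The `pick_tool` function and the `TOOLS` registry are stand-ins of my own: in a real setup the LLM performs tool selection and the tools live behind an MCP server.

```python
# Sketch of the loop described above: the "LLM" picks a tool for a task,
# the Agent invokes it, and the result comes back. Both the registry and
# the selection logic are illustrative placeholders.

TOOLS = {
    "get_time": lambda args: "12:00",
    "add": lambda args: str(args["a"] + args["b"]),
}

def pick_tool(task: str) -> tuple[str, dict]:
    # Stand-in for the LLM's tool-selection step.
    if "time" in task:
        return "get_time", {}
    return "add", {"a": 2, "b": 3}

def run_agent(task: str) -> str:
    tool_name, args = pick_tool(task)   # 1. LLM decides which tool to use
    result = TOOLS[tool_name](args)     # 2. Agent calls that tool
    return result                       # 3. Result is returned to the caller

print(run_agent("what time is it?"))  # -> 12:00
```

Tracing, in this picture, means recording each step of that loop (task, chosen tool, arguments, result) rather than just the final answer.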
Building Your Production-Grade SRE Agent
AI is only as good as the information you provide it. Aside from the hallucinations and wild outcomes we sometimes see from LLMs, the general gist of an Agent not performing as
Making Your Agent Model-Aware With Inference Extension, vLLM, & Routing
Your Agent has a "mind of its own" (well, it was programmed to act in a particular way). For example, Claude Code is known to downgrade your Model for particular tasks to