AI Gateway for Multi-Provider LLMs
Connect any AI-powered IDE or CLI tool through OmniRoute — free API gateway for unlimited coding.
📡 All agents connect via localhost:20128/v1 or cloud.omniroute.online/v1 — one config, unlimited models and quota
The biggest release ever — 31 new providers, MCP Server, A2A Protocol, and much more.
Auto-provision API keys programmatically with per-provider and per-account quotas. SHA-256 hashed, idempotent issuance, budget limits, and optional GitHub issue reporting.
Map model name patterns (glob) to specific combos. claude-sonnet* → code-combo, gpt-4o* → openai-combo. Dashboard UI with inline management.
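The glob mapping above can be sketched in a few lines. This is a minimal, hypothetical illustration (the combo names mirror the examples; the real routing table lives in the dashboard):

```python
from fnmatch import fnmatch

# Ordered pattern -> combo table; first matching pattern wins.
MODEL_COMBOS = [
    ("claude-sonnet*", "code-combo"),
    ("gpt-4o*", "openai-combo"),
]

def resolve_combo(model: str, default: str = "auto-combo") -> str:
    """Return the combo mapped to the first glob pattern the model matches."""
    for pattern, combo in MODEL_COMBOS:
        if fnmatch(model, pattern):
            return combo
    return default
```

Because patterns are checked in order, a more specific glob placed first takes precedence over a broader one.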
Self-healing routing with 6-factor scoring, 4 mode packs, bandit exploration, progressive cooldown, and probe-based provider re-admission. The AI manages your AI.
Full media generation at /dashboard/media: Image Generation, Video, Music, Audio Transcription (2GB uploads), and Text-to-Speech with multiple voice providers.
5 search integrations: Perplexity, Serper, Brave Search, Exa, and Tavily. Ground AI responses with real-time web data and search analytics dashboard.
Automatically refreshes model lists for connected providers every 24 hours. New models appear without manual updates. Configurable interval.
130+ provider SVG logos via @lobehub/icons. Applied across Dashboard, Providers, and Agents pages with SVG → PNG → generic fallback chain.
Per-API-key request limits with sliding-window enforcement. Set max_requests_per_day and max_requests_per_minute per key. HTTP 429 on exceed.
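A sliding window differs from a fixed window in that old requests age out continuously rather than all at once. A minimal sketch of the per-key enforcement (class name and internals are illustrative, not OmniRoute's actual code):

```python
import time
from collections import defaultdict, deque

class SlidingWindowLimiter:
    """Per-key sliding-window counter; allow() returns False once the key
    has used up its window (the gateway would answer HTTP 429)."""

    def __init__(self, max_requests, window_seconds):
        self.max_requests = max_requests
        self.window = window_seconds
        self.hits = defaultdict(deque)  # key -> timestamps of recent requests

    def allow(self, key, now=None):
        now = time.monotonic() if now is None else now
        q = self.hits[key]
        while q and now - q[0] >= self.window:  # evict entries outside the window
            q.popleft()
        if len(q) >= self.max_requests:
            return False
        q.append(now)
        return True
```

The same structure works for both the per-minute and per-day limits; a key simply gets two limiters with different windows.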
Install once, connect your providers, and code non-stop with automatic 4-tier fallback routing.
Run one command to install OmniRoute globally on your system.
Open Dashboard and add your API keys or OAuth connections. Free providers available!
Configure Claude Code, Cursor, Cline, or any OpenAI-compatible tool.
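Because the gateway speaks the OpenAI wire format, pointing a tool at it usually means changing only the base URL and key. A hedged sketch of what such a request looks like (the key is a placeholder; the payload is built but not sent):

```python
import json
import urllib.request

BASE_URL = "http://localhost:20128/v1"   # or https://cloud.omniroute.online/v1
API_KEY = "sk-your-registered-key"       # placeholder: use a key from the dashboard

def chat_request(model, prompt):
    """Build (but do not send) a standard /chat/completions request
    aimed at the gateway instead of a provider directly."""
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode()
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=body,
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
    )
```

Most IDE tools expose the same two knobs as environment variables or settings fields (base URL and API key), so no per-tool plugin is needed.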
Run OmniRoute as a container with persistent data volume.
See how OmniRoute compares to alternatives.
| Feature | OmniRoute | LiteLLM |
|---|---|---|
| Providers Supported | 67+ | 100+ |
| Free Tier Routing | ✓ | ✗ |
| Dashboard UI | ✓ | ✗ |
| Semantic Cache | ✓ | ✗ |
| Circuit Breaker | ✓ | ✓ |
| 9 Routing Strategies | ✓ | ✗ |
| LLM Evaluations | ✓ | ✗ |
| Translator Playground | ✓ | ✗ |
| CLI Tools Manager | ✓ | ✗ |
| Custom Combos | ✓ | ✗ |
| MCP Server (16 tools) | ✓ | ✗ |
| A2A Protocol (Agent-to-Agent) | ✓ | ✗ |
| Desktop App | ✓ | ✗ |
| Usage Analytics | ✓ | ✓ |
| Cost Management | ✓ | ✓ |
| Docker Deploy | ✓ | ✓ |
| Media Playground (Image/Video/Audio/TTS) | ✓ | ✗ |
| Registered Keys API | ✓ | ✗ |
| Auto-Combo Engine (Self-Healing) | ✓ | ✗ |
| Web Search Providers (5) | ✓ | ✗ |
| Per-Model Combo Routing | ✓ | ✗ |
| 130+ Provider Icons (SVG) | ✓ | ✗ |
| Self-hosted & Free | ✓ | ✓ |
Connect via OAuth, API Key, or use completely free providers.
iFlow AI
Qwen Code
Kiro AI
Gemini CLI
Claude Code
OpenAI
Anthropic
Google AI
Antigravity
OpenClaw
Groq
DeepSeek
xAI (Grok)
Mistral
Together AI
Fireworks
Perplexity
Cerebras
Cohere
OpenRouter
GLM (ZhipuAI)
MiniMax
Moonshot
Nebius
NVIDIA
Sambanova
Novita AI
Chutes AI
Kluster AI
InfiniAI
Targon
AI21 Labs
Lambda
Lepton AI
Deepgram
Alibaba DashScope
LongCat AI
Pollinations
AI/ML API
Kimi Coding
Alibaba Coding
Ollama Cloud
Everything you need to route, monitor, and optimize your AI usage.
Subscription → API Key → Cheap → Free. Automatic switching when quota runs out, zero downtime.
When a model is unavailable, automatically falls back to sibling models in the same family before returning an error.
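The two fallback behaviors above compose into one loop: try the requested model across the tiers in order, then retry the whole chain with sibling models before surfacing an error. A hypothetical sketch (tier names come from the text; everything else is illustrative):

```python
# Tier order from the docs: Subscription -> API Key -> Cheap -> Free.
TIERS = ["subscription", "api_key", "cheap", "free"]

def route(model, providers, siblings, call):
    """providers: tier -> list of provider names.
    siblings: model -> same-family fallback models.
    call(provider, model): returns a response or raises on failure."""
    for candidate in [model, *siblings.get(model, [])]:
        for tier in TIERS:
            for provider in providers.get(tier, []):
                try:
                    return call(provider, candidate)
                except Exception:
                    continue  # quota exhausted / unavailable: keep falling back
    raise RuntimeError(f"all tiers and sibling models exhausted for {model}")
```

The key property is that a quota failure never reaches the client: it just shifts the request one step down the chain.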
Round-robin, weighted, random, strict-random, fill-first, P2C, cost-optimized, priority, and auto-combo with 6-factor scoring and self-healing. Per-combo or global.
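P2C ("power of two choices") is worth a quick illustration, since it gets most of the benefit of least-loaded routing without tracking a global ordering: sample two providers at random and send the request to the less loaded one. A minimal sketch, not OmniRoute's implementation:

```python
import random

def p2c_pick(providers, load):
    """Power-of-two-choices: sample two providers, take the less loaded one."""
    a, b = random.sample(providers, 2)
    return a if load[a] <= load[b] else b
```

Because a heavily loaded provider loses every pairing it appears in, load naturally spreads without any coordination.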
Auto-open and close per-provider with configurable cooldowns. Self-healing after failures.
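The open/cooldown/re-admit cycle can be sketched as a small state machine; thresholds and field names here are illustrative, not the gateway's actual defaults:

```python
import time

class CircuitBreaker:
    """Per-provider breaker: opens after `threshold` consecutive failures,
    re-admits traffic after `cooldown` seconds (the self-healing probe)."""

    def __init__(self, threshold=3, cooldown=30.0):
        self.threshold = threshold
        self.cooldown = cooldown
        self.failures = 0
        self.opened_at = None

    def available(self, now=None):
        now = time.monotonic() if now is None else now
        if self.opened_at is None:
            return True
        if now - self.opened_at >= self.cooldown:
            self.opened_at = None   # half-open: let a probe request through
            self.failures = 0
            return True
        return False

    def record(self, success, now=None):
        now = time.monotonic() if now is None else now
        if success:
            self.failures = 0
        else:
            self.failures += 1
            if self.failures >= self.threshold:
                self.opened_at = now  # open the circuit
```

While a provider's circuit is open, the router simply skips it and the fallback chain absorbs the traffic.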
Mutex + automatic rate-limiting for API key providers. Prevents quota exhaustion spikes.
5-second dedup window for duplicate requests. Saves tokens and prevents double-sends.
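Deduplication of this kind typically hashes the request body and remembers digests for the window's duration. A minimal sketch under that assumption:

```python
import hashlib
import time

class DedupWindow:
    """Drop byte-identical requests seen within the last `window` seconds."""

    def __init__(self, window=5.0):
        self.window = window
        self.seen = {}  # request digest -> timestamp of first sighting

    def is_duplicate(self, payload, now=None):
        now = time.monotonic() if now is None else now
        digest = hashlib.sha256(payload).hexdigest()
        # purge digests older than the window
        self.seen = {d: t for d, t in self.seen.items() if now - t < self.window}
        if digest in self.seen:
            return True
        self.seen[digest] = now
        return False
```

Hashing keeps memory bounded by the number of distinct requests in flight rather than their size.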
Two-tier cache (exact + semantic similarity) reduces cost and latency for repeated queries.
Seamless OpenAI ↔ Claude ↔ Gemini format translation. Use any model with any client.
Automatically parse and handle `<think>` tags from reasoning models like DeepSeek R1.
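The core of such handling is splitting the reasoning span from the visible answer. A minimal regex-based sketch (the real parser likely also handles streaming chunks, which this does not):

```python
import re

THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)

def split_reasoning(text):
    """Separate <think>...</think> reasoning from the visible answer."""
    reasoning = "\n".join(m.strip() for m in THINK_RE.findall(text))
    answer = THINK_RE.sub("", text).strip()
    return reasoning, answer
```

Clients that don't understand reasoning output then receive only the answer, while the reasoning can be logged or surfaced separately.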
Built-in protection against prompt injection attacks on your AI endpoints.
Automatically selects the best model based on content type — coding, analysis, vision, summarization. 7 task types.
Live token consumption, reset countdown, and cost estimation per provider.
Full dashboard with tokens, costs, trends over time. Filter by provider, model, or period.
Track spending with editable per-model pricing. Set budget alerts and limits.
Dashboard with healthcheck per provider, token validation, and auto-refresh status.
Golden set testing with 4 match strategies: exact, contains, regex, custom JS function.
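The four match strategies reduce to a small dispatcher. A hedged sketch (the dashboard's custom strategy takes a JS function; a Python callable stands in for it here):

```python
import re

def matches(expected, actual, strategy="exact"):
    """Evaluate a model output against a golden answer using one of the
    four strategies named above; `custom` takes a predicate, not a string."""
    if strategy == "exact":
        return actual == expected
    if strategy == "contains":
        return expected in actual
    if strategy == "regex":
        return re.search(expected, actual) is not None
    if strategy == "custom":
        return bool(expected(actual))  # expected is a callable here
    raise ValueError(f"unknown strategy: {strategy}")
```

In practice `contains` and `regex` cover most golden-set cases; `custom` is the escape hatch for structured outputs like JSON.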
Built-in Chat Tester and Test Bench. Test any model in real-time from the dashboard.
Configure Claude Code, Codex, OpenClaw, Kilo, Droid, and Cline directly from the dashboard.
Create unlimited model combinations with 6 balancing strategies. Fine-tune routing per combo.
Add multiple accounts per provider. Round-robin load balancing and automatic failover.
Full media generation: Image (NanoBanana, SD WebUI, ComfyUI), Video, Music, Audio Transcription (2GB, Deepgram, AssemblyAI), and Text-to-Speech (ElevenLabs, Cartesia, PlayHT).
Sync config across devices via Cloudflare Workers. 300+ global edge locations.
Create scoped API keys with model restrictions, time-based access schedules, and enable/disable toggles.
Organize provider connections by environment (dev/prod). Accordion view with smart auto-switch.
Model Context Protocol server with 16 agent-control tools. 3 transport modes: stdio, SSE, Streamable HTTP.
Agent-to-Agent orchestration with JSON-RPC 2.0, task streaming, SSE heartbeat, and smart-routing skill.
Native Electron app for Windows, macOS, and Linux. System tray, auto-update, offline support, single-instance lock.
3-tier pricing resolution synced from LiteLLM. User overrides → synced → defaults. Opt-in via settings.
Monitor everything in real-time. Manage providers, combos, analytics, and more.
Run locally, in a container, on a VM, or at the edge.
Install globally for local development
Container with persistent data volume
Deploy on Akamai, AWS, DigitalOcean
Edge deployment with D1 database