diegosouzapw/omniroute

By diegosouzapw

•Updated 37 minutes ago

OmniRoute — Unified AI proxy. Route any LLM through one endpoint.

Image

50K+

Overview Tags

diegosouzapw/omniroute repository overview

⁠🚀 OmniRoute — The Free AI Gateway

⁠Never stop coding. Smart routing to FREE & low-cost AI models with automatic fallback.

Your universal API proxy — one endpoint, 160+ providers, zero downtime. Now with MCP Server (29 tools), A2A Protocol, Memory/Skills Systems & Electron Desktop App.

Chat Completions • Embeddings • Image Generation • Video • Music • Audio • Reranking • Web Search • MCP Server • A2A Protocol • 100% TypeScript

NPM Downloads

NPM Downloads Docker Pulls GitHub Downloads (all assets, all releases)

🌐 Website⁠ • 🚀 Quick Start⁠ • 💡 Features⁠ • 📖 Docs⁠ • 💰 Pricing⁠ • 💬 WhatsApp⁠

⁠🖼️ Main Dashboard

⁠📸 Dashboard Preview

Click to see dashboard screenshots

Page	Screenshot
Providers
Combos
Analytics
Health
Translator
Settings
CLI Tools
Usage Logs
Endpoints

⁠🤖 Free AI Provider for your favorite coding agents

Connect any AI-powered IDE or CLI tool through OmniRoute — free API gateway for unlimited coding.

OpenClaw ⁠ _{⭐ 205K}	NanoBot ⁠ _{⭐ 20.9K}	PicoClaw ⁠ _{⭐ 14.6K}	ZeroClaw ⁠ _{⭐ 9.9K}	IronClaw ⁠ _{⭐ 2.1K}
OpenCode ⁠ _{⭐ 106K}	Codex CLI ⁠ _{⭐ 60.8K}	Claude Code ⁠ _{⭐ 67.3K}	Gemini CLI ⁠ _{⭐ 94.7K}	Kilo Code ⁠ _{⭐ 15.5K}

_{📡 All agents connect via http://localhost:20128/v1⁠ or http://cloud.omniroute.online/v1⁠ — one config, unlimited models and quota}

⁠🤔 Why OmniRoute?

Stop wasting money and hitting limits:

Subscription quota expires unused every month
Rate limits stop you mid-coding
Expensive APIs ($20-50/month per provider)
Manual switching between providers

OmniRoute solves this:

✅ Maximize subscriptions - Track quota, use every bit before reset
✅ Auto fallback - Subscription → API Key → Cheap → Free, zero downtime
✅ Multi-account - Round-robin between accounts per provider
✅ Universal - Works with Claude Code, Codex, Gemini CLI, Cursor, Cline, OpenClaw, any CLI tool

⁠📧 Support

💬 Join our community! WhatsApp Group⁠ — Get help, share tips, and stay updated.

Website: omniroute.online⁠
GitHub: github.com/diegosouzapw/OmniRoute⁠
Issues: github.com/diegosouzapw/OmniRoute/issues⁠
WhatsApp: Community Group⁠
Contributing: See CONTRIBUTING.md⁠, open a PR, or pick a good first issue
Original Project: 9router by decolua⁠

⁠🐛 Reporting a Bug?

When opening an issue, please run the system-info command and attach the generated file:

npm run system-info

This generates a system-info.txt with your Node.js version, OmniRoute version, OS details, installed CLI tools (qoder, gemini, claude, codex, antigravity, droid, etc.), Docker/PM2 status, and system packages — everything we need to reproduce your issue quickly. Attach the file directly to your GitHub issue.

⁠🔄 How It Works

┌─────────────┐
│  Your CLI   │  (Claude Code, Codex, Gemini CLI, OpenClaw, Cursor, Cline...)
│   Tool      │
└──────┬──────┘
       │ http://localhost:20128/v1
       ↓
┌─────────────────────────────────────────┐
│           OmniRoute (Smart Router)        │
│  • Format translation (OpenAI ↔ Claude) │
│  • Quota tracking + Embeddings + Images │
│  • Auto token refresh                   │
└──────┬──────────────────────────────────┘
       │
       ├─→ [Tier 1: SUBSCRIPTION] Claude Code, Codex, Gemini CLI
       │   ↓ quota exhausted
       ├─→ [Tier 2: API KEY] DeepSeek, Groq, xAI, Mistral, NVIDIA NIM, etc.
       │   ↓ budget limit
       ├─→ [Tier 3: CHEAP] GLM ($0.6/1M), MiniMax ($0.2/1M)
       │   ↓ budget limit
       └─→ [Tier 4: FREE] Qoder, Qwen, Kiro (unlimited)

Result: Never stop coding, minimal cost

⁠🎯 What OmniRoute Solves — 30 Real Pain Points & Use Cases

Every developer using AI tools faces these problems daily. OmniRoute was built to solve them all — from cost overruns to regional blocks, from broken OAuth flows to protocol operations and enterprise observability.

💸 1. "I pay for an expensive subscription but still get interrupted by limits"

Developers pay $20–200/month for Claude Pro, Codex Pro, or GitHub Copilot. Even paying, quota has a ceiling — 5h of usage, weekly limits, or per-minute rate limits. Mid-coding session, the provider stops responding and the developer loses flow and productivity.

How OmniRoute solves it:

Smart 4-Tier Fallback — If subscription quota runs out, automatically redirects to API Key → Cheap → Free with zero manual intervention
Provider Limits Tracking — Cached quota snapshots refresh on a server-side schedule (default PROVIDER_LIMITS_SYNC_INTERVAL_MINUTES=70) with manual refresh available in the UI
Multi-Account Support — Multiple accounts per provider with auto round-robin — when one runs out, switches to the next
Custom Combos — Customizable fallback chains with 13 balancing strategies (priority, weighted, fill-first, round-robin, P2C, random, least-used, cost-optimized, strict-random, auto, lkgp, context-optimized, context-relay)
Structured Combo Builder — Build combos with guided steps or expert single-page editing, including explicit provider + model + account selection, repeated providers, fixed-account targets, and direct model entry
Quota-Aware P2C — Power-of-two account selection now factors quota headroom, backoff, recent errors, and consecutive use
Codex Business Quotas — Business/Team workspace quota monitoring directly in the dashboard

🔌 2. "I need to use multiple providers but each has a different API"

OpenAI uses one format, Claude (Anthropic) uses another, Gemini yet another. If a dev wants to test models from different providers or fallback between them, they need to reconfigure SDKs, change endpoints, deal with incompatible formats. Custom providers (FriendLI, NIM) have non-standard model endpoints.

How OmniRoute solves it:

Unified Endpoint — A single http://localhost:20128/v1 serves as proxy for all 160+ providers
Format Translation — Automatic and transparent: OpenAI ↔ Claude ↔ Gemini ↔ Responses API
Response Sanitization — Strips non-standard fields (x_groq, usage_breakdown, service_tier) that break OpenAI SDK v1.83+
Role Normalization — Converts developer → system for non-OpenAI providers; system → user for GLM/ERNIE
Think Tag Extraction — Extracts <think> blocks from models like DeepSeek R1 into standardized reasoning_content
Structured Output for Gemini — json_schema → responseMimeType/responseSchema automatic conversion
stream defaults to false — Aligns with OpenAI spec, avoiding unexpected SSE in Python/Rust/Go SDKs

🌐 3. "My AI provider blocks my region/country"

Providers like OpenAI/Codex block access from certain geographic regions. Users get errors like unsupported_country_region_territory during OAuth and API connections. This is especially frustrating for developers from developing countries.

How OmniRoute solves it:

3-Level Proxy Config — Configurable proxy at 3 levels: global (all traffic), per-provider (one provider only), and per-connection/key
Color-Coded Proxy Badges — Visual indicators: 🟢 global proxy, 🟡 provider proxy, 🔵 connection proxy, always showing the IP
OAuth Token Exchange Through Proxy — OAuth flow also goes through the proxy, solving unsupported_country_region_territory
Connection Tests via Proxy — Connection tests use the configured proxy (no more direct bypass)
SOCKS5 Support — Full SOCKS5 proxy support for outbound routing
TLS Fingerprint Spoofing — Browser-like TLS fingerprint via wreq-js to bypass bot detection
🔏 CLI Fingerprint Matching — Reorders headers and body fields to match native CLI binary signatures, drastically reducing account flagging risk. The proxy IP is preserved — you get both stealth and IP masking simultaneously

🆓 4. "I want to use AI for coding but I have no money"

Not everyone can pay $20–200/month for AI subscriptions. Students, devs from emerging countries, hobbyists, and freelancers need access to quality models at zero cost.

How OmniRoute solves it:

Free Tier Providers Built-in — Native support for 100% free providers: Qoder (5 unlimited models via OAuth: kimi-k2-thinking, qwen3-coder-plus, deepseek-r1, minimax-m2, kimi-k2), Qwen (4 unlimited models: qwen3-coder-plus, qwen3-coder-flash, qwen3-coder-next, vision-model), Kiro (Claude + AWS Builder ID for free), Gemini CLI (180K tokens/month free)
Ollama Cloud — Cloud-hosted Ollama models at api.ollama.com with free "Light usage" tier; use ollamacloud/<model> prefix
Free-Only Combos — Chain gc/gemini-3-flash → if/kimi-k2-thinking → qw/qwen3-coder-plus = $0/month with zero downtime
NVIDIA NIM Free Access — ~40 RPM dev-forever free access to 70+ models at build.nvidia.com (transitioning from credits to pure rate limits)
Cost Optimized Strategy — Routing strategy that automatically chooses the cheapest available provider

🔒 5. "I need to protect my AI gateway from unauthorized access"

When exposing an AI gateway to the network (LAN, VPS, Docker), anyone with the address can consume the developer's tokens/quota. Without protection, APIs are vulnerable to misuse, prompt injection, and abuse.

How OmniRoute solves it:

API Key Management — Generation, rotation, and scoping per provider with a dedicated /dashboard/api-manager page
Model-Level Permissions — Restrict API keys to specific models (openai/*, wildcard patterns), with Allow All/Restrict toggle
API Endpoint Protection — Require a key for /v1/models and block specific providers from the listing
Auth Guard + CSRF Protection — All dashboard routes protected with withAuth middleware + CSRF tokens
Rate Limiter — Per-IP rate limiting with configurable windows
IP Filtering — Allowlist/blocklist for access control
Prompt Injection Guard — Sanitization against malicious prompt patterns
AES-256-GCM Encryption — Credentials encrypted at rest

🛑 6. "My provider went down and I lost my coding flow"

AI providers can become unstable, return 5xx errors, or hit temporary rate limits. If a dev depends on a single provider, they're interrupted. Without circuit breakers, repeated retries can crash the application.

How OmniRoute solves it:

Request Queue & Pacing — Per-connection request buckets smooth bursts before they hit upstream rate caps
Connection Cooldown — A single connection cools down after retryable failures with optional upstream Retry-After hints and exponential backoff
Provider Circuit Breaker — The provider only trips after fallback is exhausted and the provider request still fails with provider-wide transient errors; connection-scoped 429 rate limits stay in Connection Cooldown
Wait For Cooldown — The server can wait for the earliest connection cooldown to expire and retry the same client request automatically
Anti-Thundering Herd — Mutex + semaphore protection against concurrent retry storms
Combo Fallback Chains — If the primary provider fails, automatically falls through the chain with no intervention
Health Dashboard — Uptime monitoring, provider circuit breaker states, cooldowns, cache stats, p50/p95/p99 latency

🔧 7. "Configuring each AI tool is tedious and repetitive"

Developers use Cursor, Claude Code, Codex CLI, OpenClaw, Gemini CLI, Kilo Code... Each tool needs a different config (API endpoint, key, model). Reconfiguring when switching providers or models is a waste of time.

How OmniRoute solves it:

CLI Tools Dashboard — Dedicated page with one-click setup for Claude Code, Codex CLI, OpenClaw, Kilo Code, Antigravity, Cline
GitHub Copilot Config Generator — Generates chatLanguageModels.json for VS Code with bulk model selection
Onboarding Wizard — Guided 4-step setup for first-time users
One endpoint, all models — Configure http://localhost:20128/v1 once, access 160+ providers

🔑 8. "Managing OAuth tokens from multiple providers is hell"

Claude Code, Codex, Gemini CLI, Copilot — all use OAuth 2.0 with expiring tokens. Developers need to re-authenticate constantly, deal with client_secret is missing, redirect_uri_mismatch, and failures on remote servers. OAuth on LAN/VPS is particularly problematic.

How OmniRoute solves it:

Auto Token Refresh — OAuth tokens refresh in background before expiration
OAuth 2.0 (PKCE) Built-in — Automatic flow for Claude Code, Codex, Gemini CLI, Copilot, Kiro, Qwen, Qoder
Multi-Account OAuth — Multiple accounts per provider via JWT/ID token extraction
OAuth LAN/Remote Fix — Private IP detection for redirect_uri + manual URL mode for remote servers
OAuth Behind Nginx — Uses window.location.origin for reverse proxy compatibility
Remote OAuth Guide — Step-by-step guide for Google Cloud credentials on VPS/Docker

📊 9. "I don't know how much I'm spending or where"

Developers use multiple paid providers but have no unified view of spending. Each provider has its own billing dashboard, but there's no consolidated view. Unexpected costs can pile up.

How OmniRoute solves it:

Cost Analytics Dashboard — Per-token cost tracking and budget management per provider
Budget Limits per Tier — Spending ceiling per tier that triggers automatic fallback
Per-Model Pricing Configuration — Configurable prices per model
Usage Statistics Per API Key — Request count and last-used timestamp per key
Analytics Dashboard — Stat cards, model usage chart, provider table with success rates and latency

🐛 10. "I can't diagnose errors and problems in AI calls"

When a call fails, the dev doesn't know if it was a rate limit, expired token, wrong format, or provider error. Fragmented logs across different terminals. Without observability, debugging is trial-and-error.

How OmniRoute solves it:

Unified Logs Dashboard — 4 tabs: Request Logs, Proxy Logs, Audit Logs, Console
Console Log Viewer — Real-time terminal-style viewer with color-coded levels, auto-scroll, search, filter
SQLite Summary Logs — Request and proxy log indexes stay queryable across restarts without loading large payload blobs into SQLite
Translator Playground — 4 debugging modes: Playground (format translation), Chat Tester (round-trip), Test Bench (batch), Live Monitor (real-time)
Request Telemetry — p50/p95/p99 latency + X-Request-Id tracing
File-Based Detail Artifacts — App logs rotate by size, retention days, and archive count; detailed request/respon

Tag summary

Recent tags

Content type

Image

Digest

sha256:be909c8ee…

Size

165.4 MB

Last updated

37 minutes ago

docker pull diegosouzapw/omniroute