diegosouzapw/omniroute

By diegosouzapw

Updated 37 minutes ago

OmniRoute: Unified AI proxy. Route any LLM through one endpoint.


diegosouzapw/omniroute repository overview


🚀 OmniRoute - The Free AI Gateway

Never stop coding. Smart routing to FREE & low-cost AI models with automatic fallback.

Your universal API proxy: one endpoint, 160+ providers, zero downtime. Now with MCP Server (29 tools), A2A Protocol, Memory/Skills Systems & Electron Desktop App.

Chat Completions • Embeddings • Image Generation • Video • Music • Audio • Reranking • Web Search • MCP Server • A2A Protocol • 100% TypeScript


🌐 Available in: 🇺🇸 English | 🇧🇷 Português (Brasil) | 🇪🇸 Español | 🇫🇷 Français | 🇮🇹 Italiano | 🇷🇺 Русский | 🇨🇳 中文 (简体) | 🇩🇪 Deutsch | 🇮🇳 हिन्दी | 🇹🇭 ไทย | 🇺🇦 Українська | 🇸🇦 العربية | 🇯🇵 日本語 | 🇻🇳 Tiếng Việt | 🇧🇬 Български | 🇩🇰 Dansk | 🇫🇮 Suomi | 🇮🇱 עברית | 🇭🇺 Magyar | 🇮🇩 Bahasa Indonesia | 🇰🇷 한국어 | 🇲🇾 Bahasa Melayu | 🇳🇱 Nederlands | 🇳🇴 Norsk | 🇵🇹 Português (Portugal) | 🇷🇴 Română | 🇵🇱 Polski | 🇸🇰 Slovenčina | 🇸🇪 Svenska | 🇵🇭 Filipino | 🇨🇿 Čeština


🖼️ Main Dashboard

[Image: OmniRoute Dashboard]

📸 Dashboard Preview

Dashboard screenshots are available for these pages: Providers, Combos, Analytics, Health, Translator, Settings, CLI Tools, Usage Logs, Endpoints.

🤖 Free AI Provider for your favorite coding agents

Connect any AI-powered IDE or CLI tool through OmniRoute, a free API gateway for unlimited coding.

  • OpenClaw (⭐ 205K)
  • NanoBot (⭐ 20.9K)
  • PicoClaw (⭐ 14.6K)
  • ZeroClaw (⭐ 9.9K)
  • IronClaw (⭐ 2.1K)
  • OpenCode (⭐ 106K)
  • Codex CLI (⭐ 60.8K)
  • Claude Code (⭐ 67.3K)
  • Gemini CLI (⭐ 94.7K)
  • Kilo Code (⭐ 15.5K)

📡 All agents connect via http://localhost:20128/v1 or http://cloud.omniroute.online/v1 - one config, unlimited models and quota
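
As a rough illustration of what "one config" can look like from client code, the sketch below prepares (but does not send) a chat request against the local endpoint. It assumes the gateway accepts the OpenAI-style `/chat/completions` path under `/v1` and a Bearer key; `buildChatRequest`, `PreparedRequest`, and the placeholder key are hypothetical names, not part of OmniRoute.

```typescript
// Hypothetical helper: builds an OpenAI-style chat request without sending it.
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface PreparedRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

function buildChatRequest(
  baseUrl: string,
  apiKey: string,
  model: string,
  messages: ChatMessage[],
): PreparedRequest {
  // Assumption: the OpenAI-compatible /chat/completions path lives under /v1.
  const url = baseUrl.replace(/\/+$/, "") + "/chat/completions";
  return {
    url,
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: "Bearer " + apiKey,
    },
    // stream is set explicitly so the client never depends on a server default.
    body: JSON.stringify({ model, messages, stream: false }),
  };
}

const prepared = buildChatRequest(
  "http://localhost:20128/v1",
  "YOUR_OMNIROUTE_KEY", // placeholder, not a real key
  "qw/qwen3-coder-plus", // model id borrowed from the free-only combo example in this README
  [{ role: "user", content: "Say hello" }],
);
```

The prepared object can then be passed to any HTTP client, for example `fetch(prepared.url, { method: prepared.method, headers: prepared.headers, body: prepared.body })`.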


🤔 Why OmniRoute?

Stop wasting money and hitting limits:

  • Subscription quota expires unused every month
  • Rate limits stop you mid-coding
  • Expensive APIs ($20-50/month per provider)
  • Manual switching between providers

OmniRoute solves this:

  • ✅ Maximize subscriptions - Track quota, use every bit before reset
  • ✅ Auto fallback - Subscription → API Key → Cheap → Free, zero downtime
  • ✅ Multi-account - Round-robin between accounts per provider
  • ✅ Universal - Works with Claude Code, Codex, Gemini CLI, Cursor, Cline, OpenClaw, any CLI tool

📧 Support

💬 Join our community! WhatsApp Group - Get help, share tips, and stay updated.

🐛 Reporting a Bug?

When opening an issue, please run the system-info command and attach the generated file:

npm run system-info

This generates a system-info.txt with your Node.js version, OmniRoute version, OS details, installed CLI tools (qoder, gemini, claude, codex, antigravity, droid, etc.), Docker/PM2 status, and system packages: everything we need to reproduce your issue quickly. Attach the file directly to your GitHub issue.


🔄 How It Works

┌─────────────┐
│  Your CLI   │  (Claude Code, Codex, Gemini CLI, OpenClaw, Cursor, Cline...)
│   Tool      │
└──────┬──────┘
       │ http://localhost:20128/v1
       ↓
┌───────────────────────────────────────────┐
│           OmniRoute (Smart Router)        │
│  • Format translation (OpenAI ↔ Claude)   │
│  • Quota tracking + Embeddings + Images   │
│  • Auto token refresh                     │
└──────┬────────────────────────────────────┘
       │
       ├─→ [Tier 1: SUBSCRIPTION] Claude Code, Codex, Gemini CLI
       │   ↓ quota exhausted
       ├─→ [Tier 2: API KEY] DeepSeek, Groq, xAI, Mistral, NVIDIA NIM, etc.
       │   ↓ budget limit
       ├─→ [Tier 3: CHEAP] GLM ($0.6/1M), MiniMax ($0.2/1M)
       │   ↓ budget limit
       └─→ [Tier 4: FREE] Qoder, Qwen, Kiro (unlimited)

Result: Never stop coding, minimal cost
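
The tier order in the diagram can be read as a simple first-available scan. The sketch below is only a model of that idea under stated assumptions: `TierState`, `pickTier`, and the single `available` flag (standing in for "quota exhausted" or "budget limit" reached) are hypothetical and do not reflect OmniRoute's internal types.

```typescript
type Tier = "subscription" | "api-key" | "cheap" | "free";

interface TierState {
  tier: Tier;
  available: boolean; // false once quota or budget for the tier is exhausted
}

// Fixed order from the diagram: Subscription -> API Key -> Cheap -> Free.
const TIER_ORDER: Tier[] = ["subscription", "api-key", "cheap", "free"];

// Returns the first tier in diagram order that is still available, or null.
function pickTier(states: TierState[]): Tier | null {
  for (const tier of TIER_ORDER) {
    const state = states.find((s) => s.tier === tier);
    if (state !== undefined && state.available) return tier;
  }
  return null; // every tier is exhausted
}

const chosenTier = pickTier([
  { tier: "subscription", available: false }, // quota exhausted
  { tier: "api-key", available: true },
  { tier: "cheap", available: true },
  { tier: "free", available: true },
]);
```

Here `chosenTier` is `"api-key"`, because the subscription tier is marked exhausted and the scan stops at the next available tier.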

🎯 What OmniRoute Solves - 30 Real Pain Points & Use Cases

Every developer using AI tools faces these problems daily. OmniRoute was built to solve them all, from cost overruns to regional blocks, from broken OAuth flows to protocol operations and enterprise observability.

💸 1. "I pay for an expensive subscription but still get interrupted by limits"

Developers pay $20–200/month for Claude Pro, Codex Pro, or GitHub Copilot. Even for paying users, quota has a ceiling: 5h of usage, weekly limits, or per-minute rate limits. Mid-session, the provider stops responding and the developer loses flow and productivity.

How OmniRoute solves it:

  • Smart 4-Tier Fallback - If subscription quota runs out, automatically redirects to API Key → Cheap → Free with zero manual intervention
  • Provider Limits Tracking - Cached quota snapshots refresh on a server-side schedule (default PROVIDER_LIMITS_SYNC_INTERVAL_MINUTES=70) with manual refresh available in the UI
  • Multi-Account Support - Multiple accounts per provider with auto round-robin; when one runs out, it switches to the next
  • Custom Combos - Customizable fallback chains with 13 balancing strategies (priority, weighted, fill-first, round-robin, P2C, random, least-used, cost-optimized, strict-random, auto, lkgp, context-optimized, context-relay)
  • Structured Combo Builder - Build combos with guided steps or expert single-page editing, including explicit provider + model + account selection, repeated providers, fixed-account targets, and direct model entry
  • Quota-Aware P2C - Power-of-two-choices account selection factors in quota headroom, backoff, recent errors, and consecutive use
  • Codex Business Quotas - Business/Team workspace quota monitoring directly in the dashboard
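
To make the quota-aware P2C idea concrete, here is a minimal sketch of power-of-two-choices selection: sample two usable accounts at random and keep the better-scoring one. The `Account` shape, the `score` weighting (0.1 per recent error), and treating backoff as a boolean `coolingDown` flag are all assumptions for illustration; the README does not specify how OmniRoute weighs these signals.

```typescript
interface Account {
  id: string;
  quotaHeadroom: number; // fraction of quota left, 0..1
  recentErrors: number;
  coolingDown: boolean; // stands in for "backoff" in this sketch
}

// Hypothetical scoring: more headroom is better, each recent error costs 0.1.
function score(a: Account): number {
  return a.quotaHeadroom - 0.1 * a.recentErrors;
}

// rand must return a value in [0, 1); it is injectable so the choice is testable.
function pickP2C(accounts: Account[], rand: () => number = Math.random): Account | null {
  const usable = accounts.filter((a) => !a.coolingDown);
  if (usable.length === 0) return null;
  if (usable.length === 1) return usable[0] ?? null;
  // Sample two distinct indices.
  const i = Math.floor(rand() * usable.length);
  let j = Math.floor(rand() * (usable.length - 1));
  if (j >= i) j += 1;
  const first = usable[i];
  const second = usable[j];
  if (first === undefined || second === undefined) return null;
  // Keep the better of the two sampled accounts.
  return score(first) >= score(second) ? first : second;
}
```

Comparing only two random candidates keeps selection cheap while still steering traffic away from accounts that are nearly out of quota or recently failing.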

🔌 2. "I need to use multiple providers but each has a different API"

OpenAI uses one format, Claude (Anthropic) uses another, Gemini yet another. If a dev wants to test models from different providers or fall back between them, they need to reconfigure SDKs, change endpoints, and deal with incompatible formats. Custom providers (FriendLI, NIM) have non-standard model endpoints.

How OmniRoute solves it:

  • Unified Endpoint - A single http://localhost:20128/v1 serves as proxy for all 160+ providers
  • Format Translation - Automatic and transparent: OpenAI ↔ Claude ↔ Gemini ↔ Responses API
  • Response Sanitization - Strips non-standard fields (x_groq, usage_breakdown, service_tier) that break OpenAI SDK v1.83+
  • Role Normalization - Converts developer → system for non-OpenAI providers; system → user for GLM/ERNIE
  • Think Tag Extraction - Extracts <think> blocks from models like DeepSeek R1 into standardized reasoning_content
  • Structured Output for Gemini - json_schema → responseMimeType/responseSchema automatic conversion
  • stream defaults to false - Aligns with OpenAI spec, avoiding unexpected SSE in Python/Rust/Go SDKs
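
Two of the transformations above, think-tag extraction and role normalization, are easy to sketch. The code below is an illustration only: `extractThink`, `normalizeRole`, and the lowercase provider ids (`openai`, `glm`, `ernie`) are hypothetical names, and chaining developer → system → user for GLM/ERNIE is one possible reading of the rules, not a confirmed behavior.

```typescript
interface Extracted {
  content: string;
  reasoning_content: string | null;
}

// Moves every <think>...</think> block out of the text into reasoning_content.
function extractThink(text: string): Extracted {
  const blocks: string[] = [];
  const content = text
    .replace(/<think>([\s\S]*?)<\/think>/g, (_match: string, inner: string) => {
      blocks.push(inner.trim());
      return "";
    })
    .trim();
  return { content, reasoning_content: blocks.length > 0 ? blocks.join("\n") : null };
}

// Applies the two role rules from the list; provider ids here are assumptions.
function normalizeRole(role: string, provider: string): string {
  let result = role;
  if (provider !== "openai" && result === "developer") result = "system";
  if ((provider === "glm" || provider === "ernie") && result === "system") result = "user";
  return result;
}

const extracted = extractThink("<think>plan the answer</think>Final answer");
```

After this runs, `extracted.content` holds only the visible answer and `extracted.reasoning_content` holds the text that was inside the `<think>` block.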

🌍 3. "My AI provider blocks my region/country"

Providers like OpenAI/Codex block access from certain geographic regions. Users get errors like unsupported_country_region_territory during OAuth and API connections. This is especially frustrating for developers from developing countries.

How OmniRoute solves it:

  • 3-Level Proxy Config - Configurable proxy at 3 levels: global (all traffic), per-provider (one provider only), and per-connection/key
  • Color-Coded Proxy Badges - Visual indicators: 🟢 global proxy, 🟡 provider proxy, 🔵 connection proxy, always showing the IP
  • OAuth Token Exchange Through Proxy - OAuth flow also goes through the proxy, solving unsupported_country_region_territory
  • Connection Tests via Proxy - Connection tests use the configured proxy (no more direct bypass)
  • SOCKS5 Support - Full SOCKS5 proxy support for outbound routing
  • TLS Fingerprint Spoofing - Browser-like TLS fingerprint via wreq-js to bypass bot detection
  • CLI Fingerprint Matching - Reorders headers and body fields to match native CLI binary signatures, drastically reducing account flagging risk. The proxy IP is preserved, so you get both stealth and IP masking simultaneously
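
One natural way to combine the three proxy levels is "most specific wins". The sketch below assumes that precedence (connection, then provider, then global); the README lists the levels but does not state the order, and `ProxyConfig`, `resolveProxy`, and the example URLs are hypothetical.

```typescript
interface ProxyConfig {
  global?: string;
  perProvider: Record<string, string>;
  perConnection: Record<string, string>;
}

type ProxyLevel = "connection" | "provider" | "global" | "none";

interface ResolvedProxy {
  level: ProxyLevel;
  url: string | null;
}

// Assumption: the most specific configured proxy takes precedence.
function resolveProxy(cfg: ProxyConfig, provider: string, connectionId: string): ResolvedProxy {
  const conn = cfg.perConnection[connectionId];
  if (conn !== undefined) return { level: "connection", url: conn };
  const prov = cfg.perProvider[provider];
  if (prov !== undefined) return { level: "provider", url: prov };
  if (cfg.global !== undefined) return { level: "global", url: cfg.global };
  return { level: "none", url: null };
}

const proxyConfig: ProxyConfig = {
  global: "socks5://127.0.0.1:1080", // placeholder addresses
  perProvider: { codex: "http://10.0.0.2:8080" },
  perConnection: { "conn-1": "socks5://10.0.0.3:1080" },
};
const resolved = resolveProxy(proxyConfig, "codex", "conn-2");
```

In this example `resolved.level` is `"provider"`: the connection `conn-2` has no proxy of its own, so the provider-level entry applies before the global one.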

🆓 4. "I want to use AI for coding but I have no money"

Not everyone can pay $20–200/month for AI subscriptions. Students, devs from emerging countries, hobbyists, and freelancers need access to quality models at zero cost.

How OmniRoute solves it:

  • Free Tier Providers Built-in - Native support for 100% free providers: Qoder (5 unlimited models via OAuth: kimi-k2-thinking, qwen3-coder-plus, deepseek-r1, minimax-m2, kimi-k2), Qwen (4 unlimited models: qwen3-coder-plus, qwen3-coder-flash, qwen3-coder-next, vision-model), Kiro (Claude + AWS Builder ID for free), Gemini CLI (180K tokens/month free)
  • Ollama Cloud - Cloud-hosted Ollama models at api.ollama.com with free "Light usage" tier; use ollamacloud/<model> prefix
  • Free-Only Combos - Chain gc/gemini-3-flash → if/kimi-k2-thinking → qw/qwen3-coder-plus = $0/month with zero downtime
  • NVIDIA NIM Free Access - ~40 RPM dev-forever free access to 70+ models at build.nvidia.com (transitioning from credits to pure rate limits)
  • Cost Optimized Strategy - Routing strategy that automatically chooses the cheapest available provider
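
The cost-optimized strategy amounts to "pick the cheapest candidate that is currently usable". A minimal sketch, assuming a flat per-million-token price per candidate; `Candidate`, `cheapestAvailable`, and the `glm`/`minimax` ids are hypothetical, while the prices reuse the Tier 3 figures from the diagram above.

```typescript
interface Candidate {
  model: string;
  costPerMillionTokens: number; // USD per 1M tokens; 0 for free providers
  available: boolean;
}

// Returns the cheapest available candidate, or null if none is available.
function cheapestAvailable(candidates: Candidate[]): Candidate | null {
  let best: Candidate | null = null;
  for (const c of candidates) {
    if (!c.available) continue;
    if (best === null || c.costPerMillionTokens < best.costPerMillionTokens) best = c;
  }
  return best;
}

const cheapest = cheapestAvailable([
  { model: "glm", costPerMillionTokens: 0.6, available: true },
  { model: "minimax", costPerMillionTokens: 0.2, available: true },
  { model: "qw/qwen3-coder-plus", costPerMillionTokens: 0, available: false }, // e.g. cooling down
]);
```

With the free model marked unavailable, the sketch falls back to `minimax`, the cheapest remaining option.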

🔒 5. "I need to protect my AI gateway from unauthorized access"

When exposing an AI gateway to the network (LAN, VPS, Docker), anyone with the address can consume the developer's tokens/quota. Without protection, APIs are vulnerable to misuse, prompt injection, and abuse.

How OmniRoute solves it:

  • API Key Management - Generation, rotation, and scoping per provider with a dedicated /dashboard/api-manager page
  • Model-Level Permissions - Restrict API keys to specific models (openai/*, wildcard patterns), with Allow All/Restrict toggle
  • API Endpoint Protection - Require a key for /v1/models and block specific providers from the listing
  • Auth Guard + CSRF Protection - All dashboard routes protected with withAuth middleware + CSRF tokens
  • Rate Limiter - Per-IP rate limiting with configurable windows
  • IP Filtering - Allowlist/blocklist for access control
  • Prompt Injection Guard - Sanitization against malicious prompt patterns
  • AES-256-GCM Encryption - Credentials encrypted at rest
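
Model-level permissions with patterns such as `openai/*` can be checked with a small wildcard matcher. This is a sketch under assumptions: `isModelAllowed` and `escapeRegExp` are hypothetical helpers, `*` is taken to mean "any run of characters", an empty pattern list is treated as "deny", and the model ids are made-up examples.

```typescript
// Escapes regex metacharacters so a pattern is matched literally.
function escapeRegExp(s: string): string {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// True when the model id matches at least one allowed pattern.
function isModelAllowed(model: string, patterns: string[]): boolean {
  return patterns.some((p) => {
    // Escape everything, then turn the escaped "*" back into "match anything".
    const re = new RegExp("^" + escapeRegExp(p).replace(/\\\*/g, ".*") + "$");
    return re.test(model);
  });
}

const keyPatterns = ["openai/*", "qw/qwen3-coder-plus"];
const allowedOpenAi = isModelAllowed("openai/gpt-example", keyPatterns);
const allowedOther = isModelAllowed("anthropic/claude-example", keyPatterns);
```

Escaping before expanding `*` keeps characters like `.` or `+` in model ids from being interpreted as regex syntax.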

🛑 6. "My provider went down and I lost my coding flow"

AI providers can become unstable, return 5xx errors, or hit temporary rate limits. If a dev depends on a single provider, they're interrupted. Without circuit breakers, repeated retries can crash the application.

How OmniRoute solves it:

  • Request Queue & Pacing - Per-connection request buckets smooth bursts before they hit upstream rate caps
  • Connection Cooldown - A single connection cools down after retryable failures with optional upstream Retry-After hints and exponential backoff
  • Provider Circuit Breaker - The breaker trips only after fallback is exhausted and the provider request still fails with provider-wide transient errors; connection-scoped 429 rate limits stay in Connection Cooldown
  • Wait For Cooldown - The server can wait for the earliest connection cooldown to expire and retry the same client request automatically
  • Anti-Thundering Herd - Mutex + semaphore protection against concurrent retry storms
  • Combo Fallback Chains - If the primary provider fails, automatically falls through the chain with no intervention
  • Health Dashboard - Uptime monitoring, provider circuit breaker states, cooldowns, cache stats, p50/p95/p99 latency
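
The connection cooldown described above combines exponential backoff with an optional Retry-After hint. A minimal sketch of that calculation follows; `cooldownMs`, the option names, and the rule that the longer of the two waits wins are assumptions, not OmniRoute's documented behavior.

```typescript
interface CooldownOptions {
  baseMs: number; // wait after the first failure
  maxMs: number; // upper bound for the backoff part
}

function cooldownMs(
  consecutiveFailures: number,
  retryAfterSeconds: number | null,
  opts: CooldownOptions,
): number {
  // Exponential backoff: base * 2^(failures - 1), capped at maxMs.
  const exponent = Math.max(0, consecutiveFailures - 1);
  const backoff = Math.min(opts.maxMs, opts.baseMs * Math.pow(2, exponent));
  // Assumption: an upstream Retry-After hint wins when it asks for a longer wait.
  const hinted = retryAfterSeconds !== null ? retryAfterSeconds * 1000 : 0;
  return Math.max(backoff, hinted);
}

const cooldownOptions: CooldownOptions = { baseMs: 1000, maxMs: 60000 };
const thirdFailure = cooldownMs(3, null, cooldownOptions);
const withHint = cooldownMs(1, 30, cooldownOptions);
```

With these options the third consecutive failure waits 4000 ms, and a first failure carrying `Retry-After: 30` waits 30000 ms because the hint exceeds the 1000 ms backoff.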

🔧 7. "Configuring each AI tool is tedious and repetitive"

Developers use Cursor, Claude Code, Codex CLI, OpenClaw, Gemini CLI, Kilo Code... Each tool needs a different config (API endpoint, key, model). Reconfiguring when switching providers or models is a waste of time.

How OmniRoute solves it:

  • CLI Tools Dashboard - Dedicated page with one-click setup for Claude Code, Codex CLI, OpenClaw, Kilo Code, Antigravity, Cline
  • GitHub Copilot Config Generator - Generates chatLanguageModels.json for VS Code with bulk model selection
  • Onboarding Wizard - Guided 4-step setup for first-time users
  • One endpoint, all models - Configure http://localhost:20128/v1 once, access 160+ providers

🔑 8. "Managing OAuth tokens from multiple providers is hell"

Claude Code, Codex, Gemini CLI, and Copilot all use OAuth 2.0 with expiring tokens. Developers need to re-authenticate constantly and deal with errors such as client_secret is missing and redirect_uri_mismatch, as well as failures on remote servers. OAuth on LAN/VPS is particularly problematic.

How OmniRoute solves it:

  • Auto Token Refresh - OAuth tokens refresh in background before expiration
  • OAuth 2.0 (PKCE) Built-in - Automatic flow for Claude Code, Codex, Gemini CLI, Copilot, Kiro, Qwen, Qoder
  • Multi-Account OAuth - Multiple accounts per provider via JWT/ID token extraction
  • OAuth LAN/Remote Fix - Private IP detection for redirect_uri + manual URL mode for remote servers
  • OAuth Behind Nginx - Uses window.location.origin for reverse proxy compatibility
  • Remote OAuth Guide - Step-by-step guide for Google Cloud credentials on VPS/Docker
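
Refreshing "before expiration" usually means refreshing once a token enters a safety window ahead of its expiry time. The sketch below shows that check and the matching timer delay; `OAuthToken`, `needsRefresh`, `refreshDelayMs`, and the 5-minute default window are hypothetical choices for illustration.

```typescript
interface OAuthToken {
  accessToken: string;
  expiresAt: number; // epoch milliseconds
}

const DEFAULT_SKEW_MS = 5 * 60 * 1000; // assumed safety window of 5 minutes

// True when the token is expired or will expire within the safety window.
function needsRefresh(token: OAuthToken, now: number, skewMs: number = DEFAULT_SKEW_MS): boolean {
  return token.expiresAt - now <= skewMs;
}

// Milliseconds to wait before a background refresh should run (never negative).
function refreshDelayMs(token: OAuthToken, now: number, skewMs: number = DEFAULT_SKEW_MS): number {
  return Math.max(0, token.expiresAt - now - skewMs);
}

const sampleNow = 1_000_000;
const sampleToken: OAuthToken = { accessToken: "placeholder", expiresAt: sampleNow + 10 * 60 * 1000 };
const refreshNow = needsRefresh(sampleToken, sampleNow);
const refreshIn = refreshDelayMs(sampleToken, sampleNow);
```

For a token that expires in 10 minutes, `refreshNow` is `false` and `refreshIn` is 300000 ms, so a background timer would fire 5 minutes before expiry.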

📊 9. "I don't know how much I'm spending or where"

Developers use multiple paid providers but have no unified view of spending. Each provider has its own billing dashboard, but there's no consolidated view. Unexpected costs can pile up.

How OmniRoute solves it:

  • Cost Analytics Dashboard - Per-token cost tracking and budget management per provider
  • Budget Limits per Tier - Spending ceiling per tier that triggers automatic fallback
  • Per-Model Pricing Configuration - Configurable prices per model
  • Usage Statistics Per API Key - Request count and last-used timestamp per key
  • Analytics Dashboard - Stat cards, model usage chart, provider table with success rates and latency
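
Per-token cost tracking and a per-tier spending ceiling reduce to simple arithmetic. The sketch below assumes separate input and output prices per million tokens and a ceiling that triggers fallback when the next request would exceed it; `Pricing`, `requestCost`, and `exceedsBudget` are hypothetical names, and the $0.6/1M price reuses the GLM figure from the diagram above.

```typescript
interface Pricing {
  inputPerMillion: number; // USD per 1M input tokens
  outputPerMillion: number; // USD per 1M output tokens
}

// Cost of one request in USD.
function requestCost(p: Pricing, inputTokens: number, outputTokens: number): number {
  return (inputTokens * p.inputPerMillion + outputTokens * p.outputPerMillion) / 1e6;
}

// True when adding this request would push spending past the tier ceiling.
function exceedsBudget(spentSoFar: number, cost: number, ceiling: number): boolean {
  return spentSoFar + cost > ceiling;
}

const glmPricing: Pricing = { inputPerMillion: 0.6, outputPerMillion: 0.6 };
const sampleCost = requestCost(glmPricing, 1_000_000, 500_000);
const overBudget = exceedsBudget(4.5, sampleCost, 5);
```

Here 1.5M tokens at $0.6/1M cost about $0.90; with $4.50 already spent against a $5.00 ceiling, `overBudget` is `true`, which is the point where a router would fall through to the next tier.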

🐛 10. "I can't diagnose errors and problems in AI calls"

When a call fails, the dev doesn't know if it was a rate limit, expired token, wrong format, or provider error. Logs are fragmented across different terminals. Without observability, debugging is trial-and-error.

How OmniRoute solves it:

  • Unified Logs Dashboard - 4 tabs: Request Logs, Proxy Logs, Audit Logs, Console
  • Console Log Viewer - Real-time terminal-style viewer with color-coded levels, auto-scroll, search, filter
  • SQLite Summary Logs - Request and proxy log indexes stay queryable across restarts without loading large payload blobs into SQLite
  • Translator Playground - 4 debugging modes: Playground (format translation), Chat Tester (round-trip), Test Bench (batch), Live Monitor (real-time)
  • Request Telemetry - p50/p95/p99 latency + X-Request-Id tracing
  • File-Based Detail Artifacts - App logs rotate by size, retention days, and archive count; detailed request/respon

Tag summary

  • Content type: Image
  • Digest: sha256:be909c8ee…
  • Size: 165.4 MB
  • Last updated: 37 minutes ago

docker pull diegosouzapw/omniroute