
🔀 Claw Model Router

Intelligent cost-aware routing engine for LLM API calls

Route requests to the optimal language model based on complexity, cost, and latency requirements. Built for production systems that use multiple LLM providers.

Features

  • Cost Optimization — Automatically routes simple queries to cheaper models (Haiku) and complex ones to powerful models (Opus)
  • Latency-Aware — Factors in response time requirements for real-time vs batch workloads
  • Multi-Provider — Supports Anthropic, OpenAI, NVIDIA NIM, Ollama, and OpenRouter
  • Fallback Chains — Automatic failover when a provider is down or rate-limited
  • Token Estimation — Pre-estimates token usage to pick the right context window
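The tiering idea behind the first feature can be sketched with simple heuristics. This is an illustrative assumption, not the router's actual scoring logic; the function name, keywords, and thresholds are hypothetical.

```python
# Hypothetical sketch of complexity-based tier selection. Keywords and the
# length threshold are illustrative assumptions, not the library's real rules.

def pick_tier(prompt: str) -> str:
    """Map a prompt to a model tier using rough complexity heuristics."""
    lower = prompt.lower()
    if len(prompt) > 2000 or any(
        k in lower for k in ("analyze", "summarize", "extract", "compare")
    ):
        return "opus"    # deep analysis -> most capable model
    if any(k in lower for k in ("code", "implement", "refactor")):
        return "sonnet"  # code generation -> mid tier
    return "haiku"       # simple Q&A -> fast and cheap

print(pick_tier("What is 2+2?"))  # -> haiku
```

A real scorer would likely also weigh estimated token count and the caller's latency budget, but the shape of the decision is the same: classify the request, then pick the cheapest tier that can handle it.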

Quick Start

```python
from model_router import ModelRouter

router = ModelRouter()

# Simple query → routes to Haiku (fast, cheap)
response = router.route("What is 2+2?")

# Complex query → routes to Opus (powerful)
response = router.route("Analyze this 50-page legal document and extract all liability clauses...")
```
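The fallback-chain feature can be illustrated in the same spirit: try providers in order and move on when one fails. The provider callables, the exception handling, and the function name below are assumptions for the sketch, not the router's API.

```python
# Hedged sketch of a fallback chain: try each provider in order, falling
# through on errors (e.g. outages or rate limits). Names are illustrative.

def route_with_fallback(prompt, providers):
    """providers: ordered list of (name, callable) pairs."""
    errors = {}
    for name, call in providers:
        try:
            return call(prompt)
        except Exception as exc:  # provider down or rate-limited
            errors[name] = exc
    raise RuntimeError(f"all providers failed: {list(errors)}")

def flaky_provider(prompt):
    raise TimeoutError("provider down")

# The primary provider fails, so the fallback answers.
answer = route_with_fallback("hi", [
    ("primary", flaky_provider),
    ("fallback", lambda p: "ok"),
])
print(answer)  # -> ok
```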

Routing Strategy

| Query Type      | Model         | Cost (per M tokens) | Latency |
|-----------------|---------------|---------------------|---------|
| Simple Q&A      | Claude Haiku  | $0.25/M             | ~200ms  |
| Code generation | Claude Sonnet | $3/M                | ~1s     |
| Deep analysis   | Claude Opus   | $15/M               | ~3s     |
| Fallback        | GPT-4o        | $5/M                | ~1s     |
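A back-of-the-envelope cost check against the table above (assuming, for simplicity, that the listed rates apply per million tokens processed; actual bills depend on the input/output split):

```python
# Per-million-token prices from the routing table above.
PRICE_PER_M = {"haiku": 0.25, "sonnet": 3.0, "opus": 15.0, "gpt-4o": 5.0}

def estimated_cost(model: str, tokens: int) -> float:
    """Rough dollar cost for a request of the given token count."""
    return tokens / 1_000_000 * PRICE_PER_M[model]

print(estimated_cost("haiku", 10_000))  # -> 0.0025 (a quarter of a cent)
print(estimated_cost("opus", 10_000))   # -> 0.15
```

The 60x price gap between Haiku and Opus is the whole motivation for routing: sending simple queries to the cheap tier keeps the average cost close to the Haiku rate.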

Tech Stack

  • Python 3.11+
  • Anthropic SDK, OpenAI SDK
  • Token estimation via tiktoken
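Where tiktoken is unavailable, a crude stand-in for the pre-routing token estimate is the common ~4-characters-per-token rule of thumb for English text. This heuristic and the function name are assumptions for illustration; the project itself uses tiktoken for exact counts.

```python
# Rough stand-in for tiktoken-based estimation: English text averages
# about 4 characters per token. Good enough for context-window routing,
# not for billing. The function name is illustrative.

def rough_token_estimate(text: str) -> int:
    return max(1, len(text) // 4)

# A ~4000-character prompt is roughly 1000 tokens.
print(rough_token_estimate("word " * 800))  # -> 1000
```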

License

MIT
