Try it in 10 seconds
docker run --rm -p 8080:8080 \
-e LOGGING_ENABLED=true \
-e LOGGING_LOG_BODIES=true \
-e LOG_FORMAT=text \
-e LOGGING_LOG_HEADERS=true \
-e OPENAI_API_KEY="your-openai-key" \
enterpilot/gomodel
open http://localhost:8080/admin/dashboard
When provider switching, debugging, and usage tracking start leaking into application code, GoModel moves that logic into one gateway layer.
Switching from OpenAI to Anthropic, Groq, or another backend becomes a code project instead of a config change. GoModel decouples provider choice from the app so you can switch models and vendors behind one stable API.
One model should use caching, another needs audit logging, and one customer path requires stricter guardrails. GoModel ships scoped workflows that let you control cache, audit, usage tracking, guardrails, and fallback by provider, model, or user path.
Teams resend the same non-streaming requests and pay full price every time. GoModel adds exact-match response caching after request planning, so duplicate calls return faster and cheaper.
You know what OpenAI or Anthropic charged, but not which team, tenant, or feature caused it. GoModel tracks usage by request and user path so cost can finally map back to real product traffic.
A prompt changed, a provider fell back, or cache got bypassed, but nobody can explain what happened. GoModel keeps audit logs and runtime metadata so failures are traceable instead of anecdotal.
If operating the gateway needs more engineering than operating the app, you picked the wrong gateway. GoModel stays simple to run with one binary, built-in admin UI, and storage options that grow with the workload.
GoModel authenticates requests, resolves aliases, applies scoped workflows, and routes traffic across AI model providers through one OpenAI-compatible gateway.
POST http://localhost:8080/v1/chat/completions
Same OpenAI-compatible entry point. Change the base URL, keep your SDK.
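The "change the base URL, keep your SDK" idea can be sketched in a few lines of Python. The helper below is illustrative (not part of GoModel or any SDK): it builds the same OpenAI-compatible request either way, and only the target URL differs.

```python
import json

def chat_request(base_url, model, user_message):
    """Build an OpenAI-compatible chat completion request (illustrative
    helper, not a GoModel API). Only base_url changes between a provider
    and the gateway."""
    return {
        "url": f"{base_url}/v1/chat/completions",
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({
            "model": model,
            "messages": [{"role": "user", "content": user_message}],
        }),
    }

direct = chat_request("https://api.openai.com", "gpt-5-chat-latest", "Hello!")
via_gateway = chat_request("http://localhost:8080", "gpt-5-chat-latest", "Hello!")

# Identical request shape; only the URL moved to the gateway.
print(via_gateway["url"])                     # http://localhost:8080/v1/chat/completions
print(direct["body"] == via_gateway["body"])  # True
```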
Explore the open-source AI gateway features that help teams decouple providers from their apps, manage routing centrally, and track usage with less operational overhead.
Control cache, audit logging, usage tracking, guardrails, and fallback per provider, model, or user path instead of forcing one policy onto every request.
Publish stable names like smart-chat or cheap-rag and remap the real provider and model behind the scenes without touching application code.
Repeated requests can return from cache after alias and workflow resolution, cutting duplicate model calls, latency, and cost.
Attach X-GoModel-User-Path and break down traffic by team, app, tenant, or feature in the admin API, dashboard, and reporting views.
Inspect request history, provider routing, cache hits, aliases, workflows, and usage analytics from the built-in admin UI and REST endpoints.
Start with one binary and simple local storage, then move to PostgreSQL, MongoDB, and Redis as your GoModel deployment grows in traffic, retention, and operational needs.
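The alias feature above can be pictured as a small lookup that the gateway, not the app, owns. This is a hypothetical sketch of the decoupling idea, not GoModel's internal implementation; the alias names come from the text, and the backing models are made up for illustration.

```python
# Hypothetical alias table. Apps send the stable name on the left; the
# gateway remaps it to whatever provider model currently backs it, so a
# vendor switch is an edit here, not in application code.
aliases = {
    "smart-chat": "gpt-5-chat-latest",    # could later point at another vendor
    "cheap-rag": "llama-3.1-8b-instant",  # made-up backing model for illustration
}

def resolve(requested_model):
    """Return the backing model for an alias, or the name unchanged."""
    return aliases.get(requested_model, requested_model)

print(resolve("smart-chat"))         # gpt-5-chat-latest
print(resolve("gpt-5-chat-latest"))  # non-alias names pass through unchanged
```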
Launch GoModel, open the dashboard, and send your first OpenAI-compatible request in three steps.
Use the README Docker command for a quick start, or bring up the full local stack with Docker Compose.
docker run --rm -p 8080:8080 \
-e LOGGING_ENABLED=true \
-e LOGGING_LOG_BODIES=true \
-e LOG_FORMAT=text \
-e LOGGING_LOG_HEADERS=true \
-e OPENAI_API_KEY="your-openai-key" \
enterpilot/gomodel
cp .env.template .env
# Add your API keys to .env
docker compose up -d
Inspect models, aliases, workflows, usage, cache, and audit logs in the admin UI once the gateway is running.
http://localhost:8080/admin/dashboard
Keep the OpenAI-compatible request shape and point it at GoModel.
curl http://localhost:8080/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "gpt-5-chat-latest",
"messages": [{
"role": "user",
"content": "Hello!"
}]
}'
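The same call from Python's standard library, assuming a gateway running locally on the default port. Building the Request object is offline; the final urlopen call is what would actually send it.

```python
import json
import urllib.request

payload = {
    "model": "gpt-5-chat-latest",
    "messages": [{"role": "user", "content": "Hello!"}],
}

# Same shape as the curl example; only the transport differs.
req = urllib.request.Request(
    "http://localhost:8080/v1/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)

print(req.full_url)  # http://localhost:8080/v1/chat/completions
# with urllib.request.urlopen(req) as resp:    # sends the request when a
#     print(json.load(resp))                   # gateway is running locally
```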
Compare GoModel and LiteLLM across performance, deployment simplicity, and the operational features teams need when they want to switch AI model providers more easily.
In benchmarks GoModel delivers 47% higher throughput (52.75 vs 35.81 req/s), 46% lower p95 latency at concurrency 8 (130ms vs 244ms), and uses 7x less memory (45 MB vs 321 MB). Read the full benchmark →
| Feature | GoModel | LiteLLM |
|---|---|---|
| Language | Go | Python |
| Deployment | Single binary | pip install + runtime |
| Concurrency | Goroutines (10k+) | asyncio event loop |
| JSON Performance | Sonic (2-3x faster) | Standard library |
| Config | Env vars + optional YAML | YAML config file |
| Cluster Mode | PostgreSQL + MongoDB + Redis | PostgreSQL + Redis |
| Metrics | Prometheus + admin dashboard | Prometheus via plugin |
| License | MIT | MIT core + enterprise license |
Connect OpenAI, Anthropic, Gemini, Groq, OpenRouter, xAI, Azure OpenAI, Oracle, and Ollama through one OpenAI-compatible AI gateway.
GoModel auto-discovers configured providers and can also route to additional OpenAI-compatible backends through their base URLs.
See what is already shipped in the GoModel AI gateway, what is currently in progress, and what is planned next.
Common questions about deployment, provider switching, aliases, workflows, caching, and usage tracking in GoModel.
GoModel is an open-source AI gateway written in Go. It gives your apps one OpenAI-compatible endpoint, then adds aliases, scoped workflows, exact-match caching, audit logs, and per-user usage tracking behind that single entry point.
Yes. In most cases you only change the base URL. That keeps application code stable while provider routing, aliases, workflows, audit, and usage visibility move into the gateway.
Scoped workflows can enable or disable cache, audit logging, usage tracking, guardrails, and fallback for specific providers, models, or user paths. That lets you keep one gateway while still running different runtime policies for different workloads.
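One way to picture a scoped workflow is a set of per-scope overrides merged onto a default policy. The field names and matching rules below are invented for illustration; GoModel's actual configuration schema may differ.

```python
# Hypothetical per-scope policy toggles; the keys are invented to
# illustrate the idea, not GoModel's real config schema.
default_policy = {"cache": True, "audit": True, "usage": True,
                  "guardrails": False, "fallback": True}

scoped_policies = {
    ("model", "gpt-5-chat-latest"): {"cache": False},         # always-fresh answers
    ("user_path", "billing/invoices"): {"guardrails": True},  # stricter customer path
}

def effective_policy(scope):
    """Merge a scope's overrides onto the defaults (illustrative)."""
    return {**default_policy, **scoped_policies.get(scope, {})}

print(effective_policy(("user_path", "billing/invoices"))["guardrails"])  # True
print(effective_policy(("model", "gpt-5-chat-latest"))["cache"])          # False
```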
Yes. You can define aliases so apps use stable model names, and GoModel can serve repeated non-streaming requests from the exact-match cache after request planning. That means alias changes and workflow decisions are still reflected in cache behavior.
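Exact-match caching after request planning can be sketched as hashing the canonical, alias-resolved request. The key derivation below is illustrative, not GoModel's actual scheme; the point is that alias resolution happens before the key is computed, so remapping an alias naturally changes the key.

```python
import hashlib
import json

def cache_key(resolved_model, messages):
    """Illustrative exact-match cache key: hash of the canonical request
    after alias resolution (not GoModel's actual key scheme)."""
    canonical = json.dumps(
        {"model": resolved_model, "messages": messages}, sort_keys=True
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

hello = [{"role": "user", "content": "Hello!"}]
k1 = cache_key("gpt-5-chat-latest", hello)
k2 = cache_key("gpt-5-chat-latest", hello)
k3 = cache_key("claude-sonnet-4-5", hello)  # alias remapped to another backend

print(k1 == k2)  # True: identical requests share a key and can hit the cache
print(k1 == k3)  # False: a remapped alias produces a new key
```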
Yes. Use X-GoModel-User-Path to tag requests by team, app, tenant, or feature. GoModel carries that through usage and audit data so you can filter, attribute cost, and inspect behavior at the right operational boundary.
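Tagging is one extra header on the same request, and attribution is then a matter of slicing usage by path. The header name comes from the text; the path convention and the usage records below are made up to show how cost can roll up by prefix.

```python
# Illustrative helper: X-GoModel-User-Path is the header from the text;
# the team/feature path convention is an example, not a requirement.
def gateway_headers(user_path):
    return {
        "Content-Type": "application/json",
        "X-GoModel-User-Path": user_path,
    }

# Hypothetical usage records, as reporting views might expose them;
# the numbers are made up to show attribution by path prefix.
usage = [
    {"user_path": "payments/checkout-bot", "tokens": 1200},
    {"user_path": "payments/refund-bot", "tokens": 300},
    {"user_path": "search/rag", "tokens": 900},
]

def tokens_by_team(records):
    totals = {}
    for r in records:
        team = r["user_path"].split("/")[0]
        totals[team] = totals.get(team, 0) + r["tokens"]
    return totals

print(gateway_headers("payments/checkout-bot")["X-GoModel-User-Path"])
print(tokens_by_team(usage))  # {'payments': 1500, 'search': 900}
```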
The fastest path is Docker, but GoModel also runs as a single binary or in Docker Compose and Kubernetes. For operations, the built-in admin dashboard and APIs expose models, usage, cache, audit logs, aliases, and workflows.
Need help deciding whether GoModel fits your project, or planning a switch between AI model providers? Join Discord, book a 20-minute call, or call directly.