ai-menshen (门神) is a lightweight, local-first AI Gateway. It proxies any OpenAI-compatible API, providing Auth Injection (BYOK), Model Overriding, Failover, Usage Auditing, and Response Caching—all while keeping your API keys and logs strictly under your control.
Single Go binary. Zero external dependencies besides SQLite.
```mermaid
graph TD
    subgraph EXTERNAL [External World]
        direction LR
        Client([Clients / OpenClaw 🦞])
        Upstream(["Upstream API<br>(OpenAI/DeepSeek...)"])
    end
    subgraph LOCAL [Local Environment]
        direction TB
        G["ai-menshen<br>(Standalone Go binary)"]
        DB[(SQLite)]
        CFG[config.toml]
        G --- DB
        G --- CFG
    end
    Client ==>|OpenAI API| G
    G ==>|Auth Injection| Upstream

    style EXTERNAL fill:none,stroke:#ccc,stroke-dasharray: 5 5
    style LOCAL fill:#f0f7ff,stroke:#0066cc,stroke-width:2px
    style G fill:#0066cc,color:#fff
    style DB fill:#fff,stroke:#ddd
    style CFG fill:#fff,stroke:#ddd
```
ai-menshen ships with a lightweight, built-in dashboard at http://localhost:8080/.
Zero external CDN calls—all JS/CSS is embedded, making it ideal for offline or air-gapped environments.
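The README doesn't show how the assets are bundled, but Go's standard `embed` package is the idiomatic way to compile JS/CSS into a single binary. A minimal sketch of the idea, using a hypothetical `dashboard/dist` asset directory (not the project's actual layout):

```go
package main

import (
	"embed"
	"io/fs"
	"log"
	"net/http"
)

// Compile everything under dashboard/dist into the binary at build time.
//
//go:embed dashboard/dist
var assets embed.FS

func main() {
	// Strip the directory prefix so files are served from "/".
	dist, err := fs.Sub(assets, "dashboard/dist")
	if err != nil {
		log.Fatal(err)
	}
	http.Handle("/", http.FileServer(http.FS(dist)))
	log.Fatal(http.ListenAndServe(":8080", nil))
}
```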
| Overview & Trends | Audit Logs |
|---|---|
| ![]() | ![]() |
Download and install the pre-built binary from GitHub Releases.
```bash
curl -fsSL https://raw.githubusercontent.com/jiacai2050/ai-menshen/main/install.sh | sh
```

The script supports a custom version and installation directory:
```bash
# Pass arguments using sh -s --
curl -fsSL https://raw.githubusercontent.com/jiacai2050/ai-menshen/main/install.sh | sh -s -- --version v1.0.0 --prefix /usr/local/bin
```

| Option | Description | Default |
|---|---|---|
| `--version`, `-v` | Release version to install | `latest` |
| `--prefix`, `-p` | Directory to install binary | `~/.local/bin` |
| `--china` | Use mirror for downloads (for users in China) | `false` |
```bash
go install github.com/jiacai2050/ai-menshen@latest
```

Or build from source:

```bash
git clone https://github.com/jiacai2050/ai-menshen.git
cd ai-menshen
make build
# The binary 'ai-menshen' will be available in the current directory
```
1. Install binary (choose one):
   - One-liner (Linux & macOS)
   - `go install github.com/jiacai2050/ai-menshen@latest`
   - From Source
2. Setup config:

   ```bash
   mkdir -p ~/.config/ai-menshen
   # Generate the default config
   ai-menshen -gen-config > ~/.config/ai-menshen/config.toml
   # Edit with your upstream API key
   vi ~/.config/ai-menshen/config.toml
   ```

3. Run:

   ```bash
   ai-menshen -config ~/.config/ai-menshen/config.toml
   ```

4. Connect: Point your OpenAI client to `http://localhost:8080`.

REST API:
```bash
# `your-auth-token` should match [auth].token if auth.enable = true (can be anything if false)
curl http://localhost:8080/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer your-auth-token" \
  -d '{
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
```
Python SDK:
```python
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8080",
    api_key="your-auth-token"  # Match [auth].token if auth.enable = true (can be anything if false)
)

# Automatic usage auditing (even for streaming!)
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}],
    stream=True
)
for chunk in response:
    print(chunk.choices[0].delta.content or "", end="")
```
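Any OpenAI-compatible client can talk to the gateway the same way. As a further illustration (not from the upstream docs), here is the equivalent non-streaming request using only Go's standard library:

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"net/http"
)

func main() {
	body := []byte(`{"model": "gpt-4o", "messages": [{"role": "user", "content": "Hello!"}]}`)
	req, err := http.NewRequest("POST", "http://localhost:8080/chat/completions", bytes.NewReader(body))
	if err != nil {
		panic(err)
	}
	req.Header.Set("Content-Type", "application/json")
	// Match [auth].token if auth.enable = true (can be anything if false)
	req.Header.Set("Authorization", "Bearer your-auth-token")

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	out, _ := io.ReadAll(resp.Body)
	fmt.Println(string(out))
}
```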
A launchd plist is provided at `configs/net.liujiacai.ai-menshen.plist`.
```bash
# Install the service
cp configs/net.liujiacai.ai-menshen.plist ~/Library/LaunchAgents/

# Load and start
launchctl load ~/Library/LaunchAgents/net.liujiacai.ai-menshen.plist

# Stop and unload
launchctl unload ~/Library/LaunchAgents/net.liujiacai.ai-menshen.plist

# Check status
launchctl list | grep ai-menshen

# View logs
tail -f /tmp/ai-menshen-stderr.log
```

The service starts automatically on login and restarts on crash. It expects the binary at `~/.local/bin/ai-menshen` and the config at `~/.config/ai-menshen/config.toml`.
Customize `config.toml` (template: `configs/example.toml`). The `api_key`, `password`, `token`, `headers`, and `storage.sqlite.path` values support environment variables (e.g., `${KEY}`).
| Section | Field | Description | Default |
|---|---|---|---|
| Global | `listen` | Local bind address | `:8080` |
| Auth | `enable` | Enable authentication for gateway & dashboard | `false` |
| | `user` | Username for Dashboard (Basic Auth) | - |
| | `password` | Password for Dashboard (Basic Auth) | - |
| | `token` | Token for API requests (Bearer Auth) | - |
| Providers | `base_url` | Upstream endpoint (required) | - |
| | `api_key` | Upstream key | - |
| | `headers` | Custom headers (e.g., `{ "cf-aig-authorization" = "Bearer..." }`) | `{}` |
| | `model` | Force-override the request model | - |
| | `weight` | Weighted request share (0 disables the provider; startup fails if all are 0) | `0` |
| Upstream | `timeout` | Upstream request timeout (seconds) | `300` (5 min) |
| Storage | `retention_days` | Automatically purge logs older than X days | `90` |
| Storage.SQLite | `path` | SQLite database location | `./data/ai-menshen.db` |
| Failover | `enable` | Auto-retry with the next provider on failure | `true` |
| Cache | `enable` | Cache 200 responses | `true` |
| | `max_body_bytes` | Skip caching responses larger than this size (0 = no limit) | `5242880` (5 MiB) |
| | `max_age` | Cache TTL in seconds (0 = never expire) | `0` |
| Logging | `log_request_body` | Persist full request body in DB | `true` |
| | `log_response_body` | Persist full response body in DB (required for cache) | `true` |
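The `${KEY}` values follow shell-style expansion against the process environment. The project's exact implementation isn't shown here, but a minimal sketch of the behavior, assuming semantics like Go's `os.ExpandEnv`:

```go
package main

import (
	"fmt"
	"os"
)

func main() {
	// With OPENAI_API_KEY set in the environment, the ${OPENAI_API_KEY}
	// reference in the config value is substituted before use.
	raw := `api_key = "${OPENAI_API_KEY}"`
	fmt.Println(os.ExpandEnv(raw))
}
```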
Use `providers[].weight` to control how often each upstream is selected:

- `weight = 0` disables that provider.
- Higher weights receive a larger share of requests.
- At least one provider must have `weight > 0`, or startup fails.
Example:

```toml
[[providers]]
base_url = "https://api.openai.com/v1"
api_key = "${OPENAI_API_KEY}"
weight = 8

[[providers]]
base_url = "https://api.deepseek.com"
api_key = "${DEEPSEEK_API_KEY}"
weight = 2
```

In this example, requests will trend toward an 80% / 20% split over time. The selection is probabilistic per request, so short runs may vary.
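The README doesn't document the selection algorithm beyond "weighted random", so the following is a sketch of the standard technique, with a hypothetical `Provider` type (the `APIKey` field is used by the failover sketch further below):

```go
package main

import (
	"fmt"
	"math/rand"
)

// Provider is a hypothetical stand-in for one [[providers]] block.
type Provider struct {
	BaseURL string
	APIKey  string
	Weight  int
}

// pickWeighted draws one provider with probability proportional to its weight.
func pickWeighted(providers []Provider) Provider {
	total := 0
	for _, p := range providers {
		total += p.Weight // weight = 0 contributes nothing, so it is never chosen
	}
	n := rand.Intn(total) // panics if total == 0, mirroring "startup fails if all are 0"
	for _, p := range providers {
		n -= p.Weight
		if n < 0 {
			return p
		}
	}
	return providers[len(providers)-1] // unreachable when total > 0
}

func main() {
	providers := []Provider{
		{BaseURL: "https://api.openai.com/v1", Weight: 8},
		{BaseURL: "https://api.deepseek.com", Weight: 2},
	}
	counts := map[string]int{}
	for i := 0; i < 10000; i++ {
		counts[pickWeighted(providers).BaseURL]++
	}
	fmt.Println(counts) // trends toward ~8000 / ~2000, but varies run to run
}
```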
When `failover.enable = true` (the default), a failed request is automatically retried against the remaining providers:

- Triggers: network errors, HTTP 5xx, or 429 (Too Many Requests).
- Order: the first provider is chosen by weighted random; on failure, the remaining providers are tried in config order.
- Streaming: failover only happens before any data has been sent to the client. Once SSE chunks are flowing, the stream is not retried.
- Passthrough: non-auditable paths (e.g., `/models`) use a single provider and do not fail over.
- Graceful degradation: if every provider fails, the last upstream response (including its original status code and body) is passed through to the client rather than being replaced with a generic 502.
No extra configuration is needed: just define multiple `[[providers]]` blocks and failover works automatically.
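To make these rules concrete, here is a rough sketch of such a retry loop (not the project's actual code). It lives alongside the weighted-routing sketch above, reusing its hypothetical `Provider` type and `pickWeighted`, and it buffers the request body so each attempt can replay it, glossing over SSE streaming:

```go
package main // a second file in the same package as the previous sketch

import (
	"bytes"
	"io"
	"net/http"
	"strings"
)

// forwardWithFailover sends the request to a weighted-random first pick,
// then to the remaining providers in config order, and if every provider
// fails, returns the last upstream response unchanged.
func forwardWithFailover(providers []Provider, req *http.Request) (*http.Response, error) {
	body, err := io.ReadAll(req.Body) // buffer once so each attempt can replay it
	if err != nil {
		return nil, err
	}

	first := pickWeighted(providers)
	attempts := []Provider{first}
	for _, p := range providers {
		if p != first {
			attempts = append(attempts, p) // fallbacks keep config order
		}
	}

	var lastResp *http.Response
	var lastErr error
	for _, p := range attempts {
		out, _ := http.NewRequest(req.Method, strings.TrimRight(p.BaseURL, "/")+req.URL.Path, bytes.NewReader(body))
		out.Header = req.Header.Clone()
		out.Header.Set("Authorization", "Bearer "+p.APIKey) // Auth Injection (BYOK)

		resp, doErr := http.DefaultClient.Do(out)
		// Failover triggers: network errors, HTTP 5xx, or 429.
		if doErr == nil && resp.StatusCode < 500 && resp.StatusCode != 429 {
			return resp, nil
		}
		lastResp, lastErr = resp, doErr
	}
	// Graceful degradation: surface the last upstream response (original
	// status code and body) instead of a generic 502.
	return lastResp, lastErr
}
```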