ai-menshen (门神) is a lightweight, local-first AI Gateway. It proxies any OpenAI-compatible API, providing Auth Injection (BYOK), Model Overriding, Failover, Usage Auditing, and Response Caching—all while keeping your API keys and logs strictly under your control.
Single Go binary. Zero external dependencies besides SQLite.
```mermaid
graph TD
    subgraph EXTERNAL [External World]
        direction LR
        Client([Clients / OpenClaw 🦞])
        Upstream(["Upstream API<br>(OpenAI/DeepSeek...)"])
    end
    subgraph LOCAL [Local Environment]
        direction TB
        G["ai-menshen<br>(Standalone Go binary)"]
        DB[(SQLite)]
        CFG[config.toml]
        G --- DB
        G --- CFG
    end
    Client ==>|OpenAI API| G
    G ==>|Auth Injection| Upstream

    style EXTERNAL fill:none,stroke:#ccc,stroke-dasharray: 5 5
    style LOCAL fill:#f0f7ff,stroke:#0066cc,stroke-width:2px
    style G fill:#0066cc,color:#fff
    style DB fill:#fff,stroke:#ddd
    style CFG fill:#fff,stroke:#ddd
```
ai-menshen ships with a lightweight, built-in dashboard at http://localhost:8080/.
Zero external CDN calls—all JS/CSS is embedded, making it ideal for offline or air-gapped environments.
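The README doesn't show how the assets are bundled, but Go's standard `embed` package is the idiomatic way to compile JS/CSS into a single binary. A minimal sketch of the idea, using a hypothetical `dashboard/dist` asset directory (not the project's actual layout):

```go
package main

import (
	"embed"
	"io/fs"
	"log"
	"net/http"
)

// Compile everything under dashboard/dist into the binary at build time.
//
//go:embed dashboard/dist
var assets embed.FS

func main() {
	// Strip the directory prefix so files are served from "/".
	dist, err := fs.Sub(assets, "dashboard/dist")
	if err != nil {
		log.Fatal(err)
	}
	http.Handle("/", http.FileServer(http.FS(dist)))
	log.Fatal(http.ListenAndServe(":8080", nil))
}
```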
| Overview & Trends | Audit Logs |
|---|---|
| ![]() | ![]() |
Download and install the pre-built binary from GitHub Releases.
```bash
curl -fsSL https://raw.githubusercontent.com/jiacai2050/ai-menshen/main/install.sh | sh
```

The script supports a custom version and installation directory:
```bash
# Pass arguments using sh -s --
curl -fsSL https://raw.githubusercontent.com/jiacai2050/ai-menshen/main/install.sh | sh -s -- --version v1.0.0 --prefix /usr/local/bin
```

| Option | Description | Default |
|---|---|---|
| `--version`, `-v` | Release version to install | `latest` |
| `--prefix`, `-p` | Directory to install binary | `~/.local/bin` |
| `--china` | Use mirror for downloads (for users in China) | `false` |
```bash
go install github.com/jiacai2050/ai-menshen@latest
```

Or build from source:

```bash
git clone https://github.com/jiacai2050/ai-menshen.git
cd ai-menshen
make build
# The binary 'ai-menshen' will be available in the current directory
```
1. Install binary (choose one):
   - One-liner (Linux & macOS)
   - `go install github.com/jiacai2050/ai-menshen@latest`
   - From Source
2. Setup config:

   ```bash
   mkdir -p ~/.config/ai-menshen
   # Generate the default config
   ai-menshen -gen-config > ~/.config/ai-menshen/config.toml
   # Edit with your upstream API key
   vi ~/.config/ai-menshen/config.toml
   ```

3. Run:

   ```bash
   ai-menshen -config ~/.config/ai-menshen/config.toml
   ```

4. Connect: Point your OpenAI client to `http://localhost:8080`.

REST API:
```bash
# `your-auth-token` should match [auth].token if auth.enable = true (can be anything if false)
curl http://localhost:8080/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer your-auth-token" \
  -d '{
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
```
Python SDK:
```python
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8080",
    api_key="your-auth-token"  # Match [auth].token if auth.enable = true (can be anything if false)
)

# Automatic usage auditing (even for streaming!)
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}],
    stream=True
)
for chunk in response:
    print(chunk.choices[0].delta.content or "", end="")
```
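Any OpenAI-compatible client can talk to the gateway the same way. As a further illustration (not from the upstream docs), here is the equivalent non-streaming request using only Go's standard library:

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"net/http"
)

func main() {
	body := []byte(`{"model": "gpt-4o", "messages": [{"role": "user", "content": "Hello!"}]}`)
	req, err := http.NewRequest("POST", "http://localhost:8080/chat/completions", bytes.NewReader(body))
	if err != nil {
		panic(err)
	}
	req.Header.Set("Content-Type", "application/json")
	// Match [auth].token if auth.enable = true (can be anything if false)
	req.Header.Set("Authorization", "Bearer your-auth-token")

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	out, _ := io.ReadAll(resp.Body)
	fmt.Println(string(out))
}
```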
A launchd plist is provided at `configs/net.liujiacai.ai-menshen.plist`.
```bash
# Install the service
cp configs/net.liujiacai.ai-menshen.plist ~/Library/LaunchAgents/

# Load and start
launchctl load ~/Library/LaunchAgents/net.liujiacai.ai-menshen.plist

# Stop and unload
launchctl unload ~/Library/LaunchAgents/net.liujiacai.ai-menshen.plist

# Check status
launchctl list | grep ai-menshen

# View logs
tail -f /tmp/ai-menshen-stderr.log
```

The service starts automatically on login and restarts on crash. It expects the binary at `~/.local/bin/ai-menshen` and the config at `~/.config/ai-menshen/config.toml`.
Customize `config.toml` (template: `configs/example.toml`). The `api_key`, `password`, `token`, `headers`, and `storage.sqlite.path` values support environment variables (e.g., `${KEY}`).
| Section | Field | Description | Default |
|---|---|---|---|
| Global | `listen` | Local bind address | `:8080` |
| Auth | `enable` | Enable authentication for gateway & dashboard | `false` |
| | `user` | Username for Dashboard (Basic Auth) | - |
| | `password` | Password for Dashboard (Basic Auth) | - |
| | `token` | Token for API requests (Bearer Auth) | - |
| Providers | `base_url` | Upstream endpoint (required) | - |
| | `api_key` | Upstream key | - |
| | `headers` | Custom headers (e.g., `{ "cf-aig-authorization" = "Bearer..." }`) | `{}` |
| | `model` | Force-override the request model | - |
| | `weight` | Weighted request share (0 disables the provider; startup fails if all are 0) | `0` |
| Upstream | `timeout` | Upstream request timeout (seconds) | `300` (5 min) |
| Storage | `retention_days` | Automatically purge logs older than X days | `90` |
| Storage.SQLite | `path` | SQLite database location | `./data/ai-menshen.db` |
| Failover | `enable` | Auto-retry with the next provider on failure | `true` |
| Cache | `enable` | Cache 200 responses | `true` |
| | `max_body_bytes` | Skip caching responses larger than this size (0 = no limit) | `5242880` (5 MiB) |
| | `max_age` | Cache TTL in seconds (0 = never expire) | `0` |
| Logging | `log_request_body` | Persist full request body in DB | `true` |
| | `log_response_body` | Persist full response body in DB (required for cache) | `true` |
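The `${KEY}` values follow shell-style expansion against the process environment. The project's exact implementation isn't shown here, but a minimal sketch of the behavior, assuming semantics like Go's `os.ExpandEnv`:

```go
package main

import (
	"fmt"
	"os"
)

func main() {
	// With OPENAI_API_KEY set in the environment, the ${OPENAI_API_KEY}
	// reference in the config value is substituted before use.
	raw := `api_key = "${OPENAI_API_KEY}"`
	fmt.Println(os.ExpandEnv(raw))
}
```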
Use `providers[].weight` to control how often each upstream is selected:

- `weight = 0` disables that provider.
- Higher weights receive a larger share of requests.
- At least one provider must have `weight > 0`, or startup fails.
Example:

```toml
[[providers]]
base_url = "https://api.openai.com/v1"
api_key = "${OPENAI_API_KEY}"
weight = 8

[[providers]]
base_url = "https://api.deepseek.com"
api_key = "${DEEPSEEK_API_KEY}"
weight = 2
```

In this example, requests will trend toward an 80% / 20% split over time. The selection is probabilistic per request, so short runs may vary.
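The README doesn't document the selection algorithm beyond "weighted random", so the following is a sketch of the standard technique, with a hypothetical `Provider` type (the `APIKey` field is used by the failover sketch further below):

```go
package main

import (
	"fmt"
	"math/rand"
)

// Provider is a hypothetical stand-in for one [[providers]] block.
type Provider struct {
	BaseURL string
	APIKey  string
	Weight  int
}

// pickWeighted draws one provider with probability proportional to its weight.
func pickWeighted(providers []Provider) Provider {
	total := 0
	for _, p := range providers {
		total += p.Weight // weight = 0 contributes nothing, so it is never chosen
	}
	n := rand.Intn(total) // panics if total == 0, mirroring "startup fails if all are 0"
	for _, p := range providers {
		n -= p.Weight
		if n < 0 {
			return p
		}
	}
	return providers[len(providers)-1] // unreachable when total > 0
}

func main() {
	providers := []Provider{
		{BaseURL: "https://api.openai.com/v1", Weight: 8},
		{BaseURL: "https://api.deepseek.com", Weight: 2},
	}
	counts := map[string]int{}
	for i := 0; i < 10000; i++ {
		counts[pickWeighted(providers).BaseURL]++
	}
	fmt.Println(counts) // trends toward ~8000 / ~2000, but varies run to run
}
```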
When `failover.enable = true` (the default), a failed request is automatically retried against the remaining providers:

- Triggers: network errors, HTTP 5xx, or 429 (Too Many Requests).
- Order: the first provider is chosen by weighted random; on failure, the remaining providers are tried in config order.
- Streaming: failover only happens before any data has been sent to the client. Once SSE chunks are flowing, the stream is not retried.
- Passthrough: non-auditable paths (e.g., `/models`) use a single provider and do not fail over.
- Graceful degradation: if every provider fails, the last upstream response (including its original status code and body) is passed through to the client rather than being replaced with a generic 502.
No extra configuration is needed: just define multiple `[[providers]]` blocks and failover works automatically.
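To make these rules concrete, here is a rough sketch of such a retry loop (not the project's actual code). It lives alongside the weighted-routing sketch above, reusing its hypothetical `Provider` type and `pickWeighted`, and it buffers the request body so each attempt can replay it, glossing over SSE streaming:

```go
package main // a second file in the same package as the previous sketch

import (
	"bytes"
	"io"
	"net/http"
	"strings"
)

// forwardWithFailover sends the request to a weighted-random first pick,
// then to the remaining providers in config order, and if every provider
// fails, returns the last upstream response unchanged.
func forwardWithFailover(providers []Provider, req *http.Request) (*http.Response, error) {
	body, err := io.ReadAll(req.Body) // buffer once so each attempt can replay it
	if err != nil {
		return nil, err
	}

	first := pickWeighted(providers)
	attempts := []Provider{first}
	for _, p := range providers {
		if p != first {
			attempts = append(attempts, p) // fallbacks keep config order
		}
	}

	var lastResp *http.Response
	var lastErr error
	for _, p := range attempts {
		out, _ := http.NewRequest(req.Method, strings.TrimRight(p.BaseURL, "/")+req.URL.Path, bytes.NewReader(body))
		out.Header = req.Header.Clone()
		out.Header.Set("Authorization", "Bearer "+p.APIKey) // Auth Injection (BYOK)

		resp, doErr := http.DefaultClient.Do(out)
		// Failover triggers: network errors, HTTP 5xx, or 429.
		if doErr == nil && resp.StatusCode < 500 && resp.StatusCode != 429 {
			return resp, nil
		}
		lastResp, lastErr = resp, doErr
	}
	// Graceful degradation: surface the last upstream response (original
	// status code and body) instead of a generic 502.
	return lastResp, lastErr
}
```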