Commit 3a81ada

Refactor: architecture refactoring - ADR 2 implemented (#133)
* Introduce ingress frame foundation and preserve opaque fields
* Use ingress-backed decoding in JSON handlers
* Preserve opaque fields in guarded batch rewrites
* Cap ingress body capture and reuse it in audit logging
* Preserve nested opaque fields in normal JSON flows
* Add transport-first provider passthrough routes
* Add configurable passthrough v1 prefix normalization
* Add configurable passthrough route toggle
* Normalize passthrough v1 aliases for all providers
* Make semantic envelopes authoritative for JSON handlers
* Move batch requests onto ingress-backed semantics
* Harden passthrough and file ingress rollout
* Add sparse file semantics to semantic envelopes
* Preserve guarded chat envelopes during rewrites
* Move more batch semantics out of handlers
* Deduplicate sparse route semantic builders
* Share batch item semantics across providers and guardrails
* Move semantic caching into core and preserve adapter extras
* Use semantic decoding in model validation
* Move selector normalization into semantic core
* Collapse semantic caches behind typed accessors
* Trim semantic wrapper and batch decode boilerplate
* Clarify ADR-0002 scope and opaque preservation
* Move more semantic decoding and file enrichment into core
* Collapse downstream batch operation switches
* Unify canonical semantic codecs in core
* Refactor duplicated routing and semantic helpers
* Fix oversized validation and passthrough alias handling
* Share raw JSON helpers and tighten passthrough auth
* Deduplicate passthrough and request lifecycle helpers
* Refresh replay goldens and fix tools build tag
* Fix review findings for ingress and passthrough handling
* Fix verified review follow-ups for passthrough and ingress
* Finalize ingress encapsulation and refresh README
* Fix verified follow-up review comments
* Fix request-id propagation and guardrails model-count fallback
1 parent fc4ad3c commit 3a81ada

86 files changed

Lines changed: 12305 additions & 1851 deletions


.env.template

Lines changed: 6 additions & 0 deletions
@@ -5,6 +5,12 @@
 # Accepts values like "10M", "1G", "500K" (default: 10M)
 # BODY_SIZE_LIMIT=10M
 
+# Enable/disable provider-native passthrough routes under /p/{provider}/{endpoint} (default: true)
+# ENABLE_PROVIDER_PASSTHROUGH=true
+
+# Allow optional /p/{provider}/v1/... passthrough aliases while keeping /p/{provider}/... canonical (default: true)
+# NORMALIZE_PASSTHROUGH_V1_PREFIX=true
+
 # HTTP Client Configuration (for upstream API requests)
 # Values in seconds (or Go duration format like "10m", "1h30m")
 # Overall request timeout (default: 600 = 10 minutes, matches OpenAI/Anthropic SDKs)
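
The two settings added above describe a canonical `/p/{provider}/...` route and an optional `/p/{provider}/v1/...` alias. A minimal Go sketch of how such an alias could be folded into the canonical form follows. `splitPassthroughPath` and its `normalizeV1` flag are hypothetical names chosen for illustration, and the behaviour shown is an assumption based only on the comments in `.env.template`, not the gateway's actual implementation.

```go
package main

import (
	"fmt"
	"strings"
)

// splitPassthroughPath is a hypothetical helper, not code from this commit.
// It splits "/p/{provider}/{endpoint}" into provider and endpoint. When
// normalizeV1 is true, a leading "v1" segment in the endpoint is dropped so
// that "/p/{provider}/v1/..." resolves to the canonical "/p/{provider}/...".
func splitPassthroughPath(path string, normalizeV1 bool) (provider, endpoint string, ok bool) {
	rest, found := strings.CutPrefix(path, "/p/")
	if !found {
		return "", "", false
	}
	provider, endpoint, _ = strings.Cut(rest, "/")
	if provider == "" {
		return "", "", false
	}
	if normalizeV1 {
		if endpoint == "v1" {
			endpoint = ""
		} else if trimmed, hasV1 := strings.CutPrefix(endpoint, "v1/"); hasV1 {
			endpoint = trimmed
		}
	}
	return provider, endpoint, true
}

func main() {
	paths := []string{"/p/openai/v1/responses", "/p/openai/responses", "/v1/models"}
	for _, p := range paths {
		provider, endpoint, ok := splitPassthroughPath(p, true)
		fmt.Printf("%q -> provider=%q endpoint=%q ok=%t\n", p, provider, endpoint, ok)
	}
}
```

With normalization disabled, the sketch leaves the `v1/` segment in the endpoint, so the alias and the canonical path are treated as distinct.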

.golangci.yml

Lines changed: 1 addition & 1 deletion
@@ -4,7 +4,7 @@ linters:
     - dupl
   settings:
     dupl:
-      threshold: 150
+      threshold: 100
   exclusions:
     rules:
       - linters:

README.md

Lines changed: 46 additions & 78 deletions
@@ -46,65 +46,18 @@ curl http://localhost:8080/v1/chat/completions \
 
 ### Supported Providers
 
-Example model identifiers are illustrative and subject to change; consult provider catalogs for current models.
-
-<table>
-<tr>
-<th colspan="3">Provider</th>
-<th colspan="8">Features</th>
-</tr>
-<tr>
-<th>Name</th>
-<th>Credential</th>
-<th>Example&nbsp;Model</th>
-<th>Chat</th>
-<th>Passthru</th>
-<th>Voice</th>
-<th>Image</th>
-<th>Video</th>
-<th>/responses</th>
-<th>Embed</th>
-<th>Cache</th>
-</tr>
-<tr>
-<td>OpenAI</td>
-<td><code>OPENAI_API_KEY</code></td>
-<td><code>gpt&#8209;4o&#8209;mini</code></td>
-<td>✅</td><td>🚧</td><td>🚧</td><td>🚧</td><td>🚧</td><td>🚧</td><td>✅</td><td>🚧</td>
-</tr>
-<tr>
-<td>Anthropic</td>
-<td><code>ANTHROPIC_API_KEY</code></td>
-<td><code>claude&#8209;sonnet&#8209;4&#8209;20250514</code></td>
-<td>✅</td><td>🚧</td><td>🚧</td><td>🚧</td><td>🚧</td><td>🚧</td><td>❌</td><td>🚧</td>
-</tr>
-<tr>
-<td>Google&nbsp;Gemini</td>
-<td><code>GEMINI_API_KEY</code></td>
-<td><code>gemini&#8209;2.5&#8209;flash</code></td>
-<td>✅</td><td>🚧</td><td>🚧</td><td>🚧</td><td>🚧</td><td>🚧</td><td>✅</td><td>🚧</td>
-</tr>
-<tr>
-<td>Groq</td>
-<td><code>GROQ_API_KEY</code></td>
-<td><code>llama&#8209;3.3&#8209;70b&#8209;versatile</code></td>
-<td>✅</td><td>🚧</td><td>🚧</td><td>🚧</td><td>🚧</td><td>🚧</td><td>✅</td><td>🚧</td>
-</tr>
-<tr>
-<td>xAI&nbsp;(Grok)</td>
-<td><code>XAI_API_KEY</code></td>
-<td><code>grok&#8209;2</code></td>
-<td>✅</td><td>🚧</td><td>🚧</td><td>🚧</td><td>🚧</td><td>🚧</td><td>✅</td><td>🚧</td>
-</tr>
-<tr>
-<td>Ollama</td>
-<td><code>OLLAMA_BASE_URL</code></td>
-<td><code>llama3.2</code></td>
-<td>✅</td><td>🚧</td><td>🚧</td><td>🚧</td><td>🚧</td><td>🚧</td><td>✅</td><td>🚧</td>
-</tr>
-</table>
-
-✅ Supported 🚧 Coming soon ❌ Unsupported
+Example model identifiers are illustrative and subject to change; consult provider catalogs for current models. Feature columns reflect gateway API support, not every individual model capability exposed by an upstream provider.
+
+| Provider | Credential | Example Model | Chat | `/responses` | Embed | Files | Batches | Passthru |
+|----------|------------|---------------|:----:|:------------:|:-----:|:-----:|:-------:|:--------:|
+| OpenAI | `OPENAI_API_KEY` | `gpt-4o-mini` |||||||
+| Anthropic | `ANTHROPIC_API_KEY` | `claude-sonnet-4-20250514` |||||||
+| Google Gemini | `GEMINI_API_KEY` | `gemini-2.5-flash` |||||||
+| Groq | `GROQ_API_KEY` | `llama-3.3-70b-versatile` |||||||
+| xAI (Grok) | `XAI_API_KEY` | `grok-2` |||||||
+| Ollama | `OLLAMA_BASE_URL` | `llama3.2` |||||||
+
+✅ Supported ❌ Unsupported
 
 ---
 
@@ -170,30 +123,39 @@ docker run --rm -p 8080:8080 --env-file .env gomodel
 | `/v1/batches/{id}` | GET | Retrieve one stored batch |
 | `/v1/batches/{id}/cancel` | POST | Cancel a pending batch |
 | `/v1/batches/{id}/results` | GET | Retrieve native batch results when available |
+| `/p/{provider}/...` | GET, POST, PUT, PATCH, DELETE, HEAD, OPTIONS | Provider-native passthrough with opaque upstream responses |
 | `/v1/models` | GET | List available models |
 | `/health` | GET | Health check |
 | `/metrics` | GET | Prometheus metrics (when enabled) |
 | `/admin/api/v1/usage/summary` | GET | Aggregate token usage statistics |
 | `/admin/api/v1/usage/daily` | GET | Per-period token usage breakdown |
+| `/admin/api/v1/usage/models` | GET | Usage breakdown by model |
+| `/admin/api/v1/usage/log` | GET | Paginated usage log entries |
+| `/admin/api/v1/audit/log` | GET | Paginated audit log entries |
+| `/admin/api/v1/audit/conversation` | GET | Conversation thread around one audit log entry |
 | `/admin/api/v1/models` | GET | List models with provider type |
+| `/admin/api/v1/models/categories` | GET | List model categories |
 | `/admin/dashboard` | GET | Admin dashboard UI |
+| `/swagger/index.html` | GET | Swagger UI (when enabled) |
 
 ---
 
 ## Configuration
 
-GOModel is configured through environment variables. See [`.env.template`](.env.template) for all options.
+GOModel is configured through environment variables and an optional `config.yaml`. Environment variables override YAML values. See [`.env.template`](.env.template) and [`config/config.example.yaml`](config/config.example.yaml) for the available options.
 
 Key settings:
 
 | Variable | Default | Description |
 |----------|---------|-------------|
 | `PORT` | `8080` | Server port |
 | `GOMODEL_MASTER_KEY` | (none) | API key for authentication |
+| `ENABLE_PROVIDER_PASSTHROUGH` | `true` | Enable provider-native passthrough routes under `/p/{provider}/...` |
 | `CACHE_TYPE` | `local` | Cache backend (`local` or `redis`) |
 | `STORAGE_TYPE` | `sqlite` | Storage backend (`sqlite`, `postgresql`, `mongodb`) |
 | `METRICS_ENABLED` | `false` | Enable Prometheus metrics |
 | `LOGGING_ENABLED` | `false` | Enable audit logging |
+| `GUARDRAILS_ENABLED` | `false` | Enable the configured guardrails pipeline |
 
**Quick Start — Authentication:** By default `GOMODEL_MASTER_KEY` is unset. Without this key, API endpoints are unprotected and anyone can call them. This is insecure for production. **Strongly recommend** setting a strong secret before exposing the service. Add `GOMODEL_MASTER_KEY` to your `.env` or environment for production deployments.

@@ -205,28 +167,34 @@ See [DEVELOPMENT.md](DEVELOPMENT.md) for testing, linting, and pre-commit setup.
 
 # Roadmap
 
-## Features
+## Shipped
+
+| Area | Status | Notes |
+| ---- | :----: | ----- |
+| OpenAI-compatible API surface || `/v1/chat/completions`, `/v1/responses`, `/v1/embeddings`, `/v1/files*`, `/v1/batches*`, and `/v1/models` are implemented. |
+| Provider passthrough || Provider-native passthrough routes are available under `/p/{provider}/...`. |
+| Observability || Prometheus metrics, audit logging, usage tracking, request IDs, and trace-header capture are implemented. |
+| Administrative endpoints || Admin API and dashboard ship with usage, audit, and model views. |
+| Guardrails || The guardrails pipeline is implemented and can be enabled from config. |
+| System prompt guardrails || `inject`, `override`, and `decorator` modes are supported. |
+
+## In Progress
 
-| Feature | Basic | Full |
-| -------------------------- |:-----:|:----:|
-| Billing Management | 🚧 | 🚧 |
-| Full-observability | 🚧 | 🚧 |
-| Budget management | 🚧 | 🚧 |
-| Many keys support | 🚧 | 🚧 |
-| Administrative endpoints || 🚧 |
-| Guardrails || 🚧 |
-| SSO | 🚧 | 🚧 |
-| System Prompt (GuardRails) || 🚧 |
+| Area | Status | Notes |
+| ---- | :----: | ----- |
+| Billing management | 🚧 | Usage and pricing primitives exist, but billing workflows are not complete. |
+| Budget management | 🚧 | Gateway-level budget enforcement and policy controls are not implemented yet. |
+| Guardrails depth | 🚧 | The system prompt guardrail is available today; broader guardrail types are still to come. |
+| Observability integrations | 🚧 | Native Prometheus support exists; OpenTelemetry and DataDog integrations are still pending. |
 
-## Integrations
+## Planned
 
-| Integration | Basic | Full |
-| ------------- |:-----:|:----:|
-| Prometheus || 🚧 |
-| DataDog | 🚧 | 🚧 |
-| OpenTelemetry | 🚧 | 🚧 |
+| Area | Status | Notes |
+| ---- | :----: | ----- |
+| Many keys support | 🚧 | The gateway still uses one configured credential/base URL per provider. |
+| SSO / OIDC | 🚧 | No SSO implementation is present yet. |
 
- Supported 🚧 Coming soon
+ Shipped 🚧 Planned or in progress
 
 ## Star History
 
