Only fields that are explicitly listed under a provider's `resilience:` block are overridden. Everything else silently inherits from the global section.
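The inheritance rule above can be pictured as a shallow merge. The following is a minimal sketch, not the gateway's implementation: the `ResilienceConfig` interface, its field names, and `resolveResilience` are hypothetical names chosen for illustration.

```typescript
// Hypothetical shape of a resilience section. The field names are
// illustrative and are not taken from the gateway's actual schema.
interface ResilienceConfig {
  maxRetries: number;
  jitterFactor: number;
  failureThreshold: number;
  successThreshold: number;
}

// Merge a provider's `resilience:` block over the global section.
// Only keys that are explicitly present override; the rest inherit.
function resolveResilience(
  globalConfig: ResilienceConfig,
  providerOverride: Partial<ResilienceConfig> = {}
): ResilienceConfig {
  const resolved: ResilienceConfig = { ...globalConfig };
  for (const key of Object.keys(providerOverride) as (keyof ResilienceConfig)[]) {
    const value = providerOverride[key];
    if (value !== undefined) {
      resolved[key] = value;
    }
  }
  return resolved;
}

const globalSection: ResilienceConfig = {
  maxRetries: 3,
  jitterFactor: 0.1,
  failureThreshold: 5,
  successThreshold: 2,
};

// A provider that lists only one field inherits the other three.
const groqResolved = resolveResilience(globalSection, { maxRetries: 1 });
```

Here `groqResolved.maxRetries` is `1`, while the remaining fields keep the global values. The global object itself is left unmodified.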
@@ -161,33 +163,37 @@ GROQ_API_KEY=gsk_...
All resilience settings can be overridden at runtime via env vars. Env vars always beat both code defaults and YAML values.
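The precedence described above (env var, then YAML, then code default) could be sketched as follows. `resolveInt` is a hypothetical helper, and the fall-through on an unparsable value is an assumption; the gateway may instead reject invalid input.

```typescript
// Resolve one integer setting: environment variable first, then the YAML
// value, then the code default. `env` is passed in explicitly so the sketch
// does not depend on any particular runtime.
function resolveInt(
  name: string,
  env: Record<string, string | undefined>,
  yamlValue: number | undefined,
  codeDefault: number
): number {
  const raw = env[name];
  if (raw !== undefined && raw.trim() !== "") {
    const parsed = parseInt(raw, 10);
    if (!isNaN(parsed)) {
      return parsed; // the env var beats both YAML and the default
    }
    // Assumption: an unparsable env value is ignored rather than fatal.
  }
  return yamlValue !== undefined ? yamlValue : codeDefault;
}

// Illustrative values: env says 5, YAML says 4, the code default is 3.
const fromEnv = resolveInt("RETRY_MAX_RETRIES", { RETRY_MAX_RETRIES: "5" }, 4, 3);
const fromYaml = resolveInt("RETRY_MAX_RETRIES", {}, 4, 3);
const fromDefault = resolveInt("RETRY_MAX_RETRIES", {}, undefined, 3);
```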

| Variable | Type | Default | Description |
|---|---|---|---|
| `RETRY_MAX_RETRIES` | int | `3` | Maximum retry attempts per request |
| `RETRY_JITTER_FACTOR` | float | `0.1` | Random jitter as a fraction of the backoff |
| `CIRCUIT_BREAKER_FAILURE_THRESHOLD` | int | `5` | Consecutive failures before opening |
| `CIRCUIT_BREAKER_SUCCESS_THRESHOLD` | int | `2` | Consecutive successes to close again |
| `CIRCUIT_BREAKER_TIMEOUT` | duration | `30s` | How long the circuit stays open |
| `LOG_FORMAT` | string | *(unset)* | Auto-detects based on environment: colorized text on a TTY, JSON otherwise. Set to `text` to force human-readable output (no colors if not a TTY), or `json` to force structured JSON even on a TTY (recommended for production, CloudWatch, Datadog, GCP). |
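To show how the three `CIRCUIT_BREAKER_*` settings interact, here is a minimal sketch of a conventional circuit breaker. It assumes the usual closed / open / half-open state machine, which the table implies but does not spell out. The class and method names are hypothetical, and the gateway's real behaviour may differ in detail.

```typescript
type CircuitState = "closed" | "open" | "half-open";

class CircuitBreaker {
  private state: CircuitState = "closed";
  private failures = 0;
  private successes = 0;
  private openedAtMs = 0;
  private readonly failureThreshold: number; // CIRCUIT_BREAKER_FAILURE_THRESHOLD
  private readonly successThreshold: number; // CIRCUIT_BREAKER_SUCCESS_THRESHOLD
  private readonly openTimeoutMs: number;    // CIRCUIT_BREAKER_TIMEOUT

  constructor(failureThreshold = 5, successThreshold = 2, openTimeoutMs = 30000) {
    this.failureThreshold = failureThreshold;
    this.successThreshold = successThreshold;
    this.openTimeoutMs = openTimeoutMs;
  }

  // Returns true if a request may be attempted at time `nowMs`.
  allow(nowMs: number): boolean {
    if (this.state === "open" && nowMs - this.openedAtMs >= this.openTimeoutMs) {
      this.state = "half-open"; // timeout elapsed: let trial requests through
      this.successes = 0;
    }
    return this.state !== "open";
  }

  recordSuccess(): void {
    this.failures = 0; // a success breaks the run of consecutive failures
    if (this.state === "half-open") {
      this.successes += 1;
      if (this.successes >= this.successThreshold) {
        this.state = "closed";
      }
    }
  }

  recordFailure(nowMs: number): void {
    this.successes = 0;
    if (this.state === "closed") {
      this.failures += 1;
    }
    // Assumption: any failure during a half-open trial reopens the circuit.
    if (this.state === "half-open" || this.failures >= this.failureThreshold) {
      this.state = "open";
      this.openedAtMs = nowMs;
      this.failures = 0;
    }
  }

  current(): CircuitState {
    return this.state;
  }
}
```

With the defaults, five consecutive `recordFailure` calls open the circuit, `allow` returns `false` for the next 30 seconds, and two consecutive successes after that close it again. Timestamps are passed in explicitly so the behaviour can be checked without waiting on a real clock.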
console.log(embedding.data[0].embedding.slice(0, 5)); // first 5 dimensions
```
---
@@ -554,13 +612,10 @@ for await (const chunk of stream) {
## Tips
1. **Model routing**: The gateway automatically routes each request to the correct provider based on the model name, so no configuration is needed. Use any model name from the list above.
2. **API compatibility**: The gateway exposes an OpenAI-compatible API. Existing OpenAI client libraries work unchanged for all providers.
3. **Streaming**: All providers support streaming. The gateway normalises provider-specific formats to OpenAI's SSE format.
4. **System messages**: Anthropic's system message format is handled automatically. Gemini uses Google's OpenAI-compatible endpoint, which also handles system messages natively.
5. **Max tokens**: Anthropic requires `max_tokens` to be set. If not provided, the gateway defaults to 4096. OpenAI and Gemini treat it as optional.
6. **Responses API**: The `/v1/responses` endpoint provides a unified interface across all providers. Requests to providers that do not natively support the Responses API are converted internally.
7. **Embeddings**: The `/v1/embeddings` endpoint is supported by OpenAI, Gemini, Groq, xAI, and Ollama. Anthropic does not offer embeddings natively.
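
Tips 5 and 7 describe per-provider rules that are easy to express in code. The sketch below is illustrative only: the `ChatRequest` shape, the `Provider` identifiers, and both helper functions are hypothetical names, not the gateway's internals.

```typescript
// Minimal, hypothetical request shape; the real gateway types may differ.
interface ChatRequest {
  model: string;
  max_tokens?: number;
}

// Assumed provider identifiers, based on the providers named in the tips.
type Provider = "openai" | "anthropic" | "gemini" | "groq" | "xai" | "ollama";

const ANTHROPIC_DEFAULT_MAX_TOKENS = 4096;

// Tip 5: fill in max_tokens only where the provider requires it.
function withMaxTokensDefault(provider: Provider, req: ChatRequest): ChatRequest {
  if (provider === "anthropic" && req.max_tokens === undefined) {
    return { ...req, max_tokens: ANTHROPIC_DEFAULT_MAX_TOKENS };
  }
  return req; // other providers treat max_tokens as optional
}

// Tip 7: providers listed as supporting /v1/embeddings.
const EMBEDDING_PROVIDERS: Provider[] = ["openai", "gemini", "groq", "xai", "ollama"];

function supportsEmbeddings(provider: Provider): boolean {
  return EMBEDDING_PROVIDERS.indexOf(provider) !== -1;
}
```

A caller could use `supportsEmbeddings` to fail fast with a clear error before forwarding an embeddings request to a provider that cannot serve it.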