docs(providers): rewrite Cerebras, Groq, and SGLang with code-verified setup
Cerebras (docs/providers/cerebras.md): rewrote against
extensions/cerebras/openclaw.plugin.json. Added a complete properties
summary, CodeGroup for onboarding/direct-flag/env, a Reasoning column on
the four-model catalog table (Z.ai GLM 4.7 and GPT OSS 120B are
reasoning-capable; Qwen 3 235B and Llama 3.1 8B are not), and a
CardGroup of related links.
Groq (docs/providers/groq.md): expanded the catalog from 4 hand-picked
entries to all 18 bundled models from extensions/groq/openclaw.plugin.json
with model refs, reasoning flags, input modalities, and context windows.
Removed a stale 'Mixtral 8x7B' row that does not exist in the bundled
catalog. Surfaced the audio media-understanding contract (whisper-large-v3-turbo,
auto priority 20) as a properties table and explained the per-model
reasoning_effort mapping for qwen/qwen3-32b vs the GPT OSS reasoning
models. Added an onboarding CodeGroup so the API-key step does not skip
'openclaw onboard --auth-choice groq-api-key'.
SGLang (docs/providers/sglang.md): added a properties summary table at
the top, including the Qwen/Qwen3-8B model placeholder from
extensions/sglang/defaults.ts, the supportsStreamingUsage runtime flag,
and the modelPricing.external: false setting. Clarified that the
onboarding choice id is bare 'sglang' (custom method) rather than the
'-api-key' suffix used by other providers, matching the manifest.
docs/providers/cerebras.md

[Cerebras](https://www.cerebras.ai) provides high-speed, OpenAI-compatible inference on custom hardware. OpenClaw includes a bundled Cerebras provider plugin with a static four-model catalog.
| Property | Value |
|----------|-------|
| Onboarding flag | `--auth-choice cerebras-api-key` |
| Direct CLI flag | `--cerebras-api-key <key>` |
| API | OpenAI-compatible (`openai-completions`) |
| Base URL | `https://api.cerebras.ai/v1` |
| Default model | `cerebras/zai-glm-4.7` |

## Getting started
<Steps>
<Step title="Get an API key">
Create an API key in the [Cerebras Cloud Console](https://cloud.cerebras.ai).
</Step>
<Step title="Run onboarding">
<CodeGroup>

```bash Onboarding
openclaw onboard --auth-choice cerebras-api-key
```

```bash Direct flag
openclaw onboard --non-interactive \
  --auth-choice cerebras-api-key \
  --cerebras-api-key "$CEREBRAS_API_KEY"
```

```bash Env only
export CEREBRAS_API_KEY=csk-...
```

</CodeGroup>
</Step>
<Step title="Verify models are available">

```bash
openclaw models list --provider cerebras
```

The list should include all four bundled models. If `CEREBRAS_API_KEY` is unresolved, `openclaw models status --json` reports the missing credential under `auth.unusableProfiles`.
</Step>
</Steps>
OpenClaw ships a static Cerebras catalog that mirrors the public OpenAI-compatible endpoint. All four models share a 128k context and 8,192 max-output tokens.
| Model ref | Name | Reasoning | Notes |
|-----------|------|-----------|-------|
| `cerebras/zai-glm-4.7` | Z.ai GLM 4.7 | yes | Preview reasoning model (default) |
| `cerebras/gpt-oss-120b` | GPT OSS 120B | yes | Production reasoning model |
| `cerebras/qwen-3-235b-a22b-instruct-2507` | Qwen 3 235B Instruct | no | Preview non-reasoning model |
| `cerebras/llama3.1-8b` | Llama 3.1 8B | no | Production speed-focused model |
<Warning>
Cerebras marks `zai-glm-4.7` and `qwen-3-235b-a22b-instruct-2507` as preview models, and `llama3.1-8b` plus `qwen-3-235b-a22b-instruct-2507` are documented for deprecation on May 27, 2026. Check Cerebras' supported-models page before relying on them for production workloads.
</Warning>
## Manual config

The bundled plugin usually means you only need the API key. Use explicit `models.providers.cerebras` config when you want to override model metadata or run in `mode: "merge"` against the static catalog:
```json5
{
  env: { CEREBRAS_API_KEY: "csk-..." },
  agents: {
    defaults: {
      model: { primary: "cerebras/zai-glm-4.7" },
      // …
    },
    // …
  },
}
```
<Note>
If the Gateway runs as a daemon (launchd, systemd, Docker), make sure `CEREBRAS_API_KEY` is available to that process — for example in `~/.openclaw/.env` or through `env.shellEnv`. A key sitting only in `~/.profile` will not help a managed service unless the env is imported separately.
</Note>
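As a minimal sketch, a `~/.openclaw/.env` entry for this could look like the following (the key value is a placeholder):

```bash
# ~/.openclaw/.env: read by the Gateway even when launched as a managed service
CEREBRAS_API_KEY=csk-your-key-here
```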
docs/providers/groq.md

---
summary: "Groq setup (auth + model selection + Whisper transcription)"
title: "Groq"
read_when:
  - You want to use Groq with OpenClaw
  - You need the API key env var or CLI auth choice
  - You are configuring Whisper audio transcription on Groq
---
[Groq](https://groq.com) provides ultra-fast inference on open-weight models (Llama, Gemma, Kimi, Qwen, GPT OSS, and more) using custom LPU hardware. OpenClaw includes a bundled Groq plugin that registers both an OpenAI-compatible chat provider and an audio media-understanding provider.
OpenClaw ships a manifest-backed Groq catalog with both reasoning and non-reasoning entries. Run `openclaw models list --provider groq` to see the bundled rows for your installed version, or check [console.groq.com/docs/models](https://console.groq.com/docs/models) for Groq's authoritative list.

| Model ref | Name | Reasoning | Input | Context |
|-----------|------|-----------|-------|---------|
| … | | | | |
| `groq/groq/compound` | Compound | yes | text | 131,072 |
| `groq/groq/compound-mini` | Compound Mini | yes | text | 131,072 |
<Tip>
The catalog evolves with each OpenClaw release. `openclaw models list --provider groq` shows the rows known to your installed version; cross-check with [console.groq.com/docs/models](https://console.groq.com/docs/models) for newly-added or deprecated models.
</Tip>
## Reasoning models
OpenClaw maps its shared `/think` levels to Groq's model-specific `reasoning_effort` values:

- For `qwen/qwen3-32b`, disabled thinking sends `none` and enabled thinking sends `default`.
- For Groq GPT OSS reasoning models (`openai/gpt-oss-*`), OpenClaw sends `low`, `medium`, or `high` based on `/think` level. Disabled thinking omits `reasoning_effort` because those models do not support a disabled value.
- DeepSeek R1 Distill, Qwen QwQ, and Compound use Groq's native reasoning surface; `/think` controls visibility but the model always reasons.

See [Thinking modes](/tools/thinking) for the shared `/think` levels and how OpenClaw translates them per provider.
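The per-model mapping above can be sketched roughly as follows. This is an illustrative TypeScript sketch, not OpenClaw's actual plugin code; the function and type names are hypothetical.

```typescript
// Hypothetical sketch of the /think → reasoning_effort mapping described above.
type ThinkLevel = "off" | "low" | "medium" | "high";

function groqReasoningEffort(
  modelId: string,
  level: ThinkLevel,
): string | undefined {
  if (modelId === "qwen/qwen3-32b") {
    // qwen3-32b has an explicit disabled value; enabled thinking uses "default".
    return level === "off" ? "none" : "default";
  }
  if (modelId.startsWith("openai/gpt-oss-")) {
    // GPT OSS models have no disabled value, so the field is omitted entirely.
    return level === "off" ? undefined : level;
  }
  // Other reasoning models (DeepSeek R1 Distill, QwQ, Compound) always reason;
  // /think only controls visibility, so no effort parameter is sent.
  return undefined;
}
```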
## Audio transcription
Groq's bundled plugin also registers an **audio media-understanding provider** so voice messages can be transcribed through the shared `tools.media.audio` surface.
| Property | Value |
|----------|-------|
| Default base URL | `https://api.groq.com/openai/v1` |
| Default model | `whisper-large-v3-turbo` |
| Auto priority | 20 |
| API endpoint | OpenAI-compatible `/audio/transcriptions` |

To make Groq the default audio backend:
```json5
{
  tools: {
    media: {
      audio: {
        // …
      },
    },
  },
}
```
<AccordionGroup>
<Accordion title="Environment availability for the daemon">
If the Gateway runs as a managed service (launchd, systemd, Docker), `GROQ_API_KEY` must be visible to that process — not just to your interactive shell.

<Warning>
A key sitting only in `~/.profile` will not help a launchd or systemd daemon unless that environment is imported there too. Set the key in `~/.openclaw/.env` or via `env.shellEnv` to make it readable from the gateway process.
</Warning>
</Accordion>
<Accordion title="Custom Groq model ids">
OpenClaw accepts any Groq model id at runtime. Use the exact id shown by Groq and prefix it with `groq/`. The bundled catalog covers the common cases; uncatalogued ids fall through to the default OpenAI-compatible template.
</Accordion>
</AccordionGroup>
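As an illustration, an uncatalogued id could be selected like this, following the same `agents.defaults.model` shape used elsewhere in these docs (the model id below is a placeholder, not a real Groq id):

```json5
{
  agents: {
    defaults: {
      // "groq/some-new-model" is a hypothetical id; use the exact id from Groq's console
      model: { primary: "groq/some-new-model" },
    },
  },
}
```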
docs/providers/sglang.md
title: "SGLang"
---
SGLang serves open-weight models via an OpenAI-compatible HTTP API. OpenClaw connects to SGLang using the `openai-completions` provider family with auto-discovery of available models.

OpenClaw also **auto-discovers** available models from SGLang when you opt in with `SGLANG_API_KEY` and you do not define an explicit `models.providers.sglang` entry — see [Model discovery (implicit provider)](#model-discovery-implicit-provider) below.
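For comparison, an explicit `models.providers.sglang` entry might look like the sketch below. The inner keys, base URL, and port are assumptions for illustration, not values verified from the manifest; the model id echoes the `Qwen/Qwen3-8B` placeholder from the bundled defaults.

```json5
{
  models: {
    providers: {
      sglang: {
        // Assumed shape: baseUrl/api keys are illustrative; adjust to your server
        baseUrl: "http://127.0.0.1:30000/v1",
        api: "openai-completions",
        models: [{ id: "Qwen/Qwen3-8B" }],
      },
    },
  },
}
```

Note that defining an explicit entry like this opts out of the auto-discovery behavior described above.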