feat(loop): /max-tokens slash command to cap output tokens per turn (#2196) #2207
@codex review
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 7d1e89e9bf
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
toolSpecs,
signal,
reasoningEffort: this.reasoningEffort,
maxTokens: this.maxOutputTokens,
Apply /max-tokens to forced summaries
When /max-tokens is set and a turn hits the iteration cap, context guard, or all-suppressed storm path, the loop jumps to forceSummaryAfterIterLimit(), whose ctx.client.chat() call in src/loop/force-summary.ts still omits maxTokens. That summary generation is part of the same user turn, so the exact runaway-tool-loop scenarios this command is meant to control can still make an uncapped model call; please thread this.maxOutputTokens through summaryContext/ForceSummaryContext and pass it to the summary chat request too.
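A minimal sketch of what the reviewer is asking for, assuming the forced-summary path builds its request from a context object. The `Message`, `ChatRequest`, and `buildSummaryRequest` names and the shape of `ForceSummaryContext` are hypothetical; the real types in src/loop/force-summary.ts may look different.

```typescript
// Hypothetical shapes for illustration only; the real types in
// src/loop/force-summary.ts may differ.
interface Message {
  role: "system" | "user" | "assistant";
  content: string;
}

interface ChatRequest {
  messages: Message[];
  maxTokens?: number; // undefined = no cap (server default)
}

interface ForceSummaryContext {
  messages: Message[];
  maxOutputTokens?: number; // copied from the loop's current setting
}

// Builds the forced-summary request so it carries the same cap as an
// ordinary turn instead of silently dropping it.
function buildSummaryRequest(ctx: ForceSummaryContext): ChatRequest {
  return {
    messages: [
      ...ctx.messages,
      { role: "user", content: "Summarize the work so far." },
    ],
    maxTokens: ctx.maxOutputTokens,
  };
}

const capped = buildSummaryRequest({ messages: [], maxOutputTokens: 4096 });
const uncapped = buildSummaryRequest({ messages: [] });
console.log(capped.maxTokens, uncapped.maxTokens);
```

Keeping the cap on the context object, rather than reading it from global config inside the summary function, means a mid-session /max-tokens change would apply to the summary call as well.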
…sengine#2196)

Adds /max-tokens <N|off> slash command and persisted maxOutputTokens config field so users can limit output tokens per turn, preventing runaway reasoning.

- config: maxOutputTokens field, load/saveMaxOutputTokens accessors
- loop: stored on loop instance, passed as max_tokens to chat and stream calls
- loop/streaming.ts: maxTokens forwarded through StreamModelOptions
- slash: new /max-tokens setup-group command with handler + i18n (EN/zh-CN/de)
- App.tsx: seeded from config on loop construction
7d1e89e to 3ecd008
Fixed in @codex review
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 3ecd00840c
hooks: hookList,
hookCwd: currentRootDir,
reasoningEffort: initialReasoningEffort ?? loadReasoningEffort(),
maxOutputTokens: loadMaxOutputTokens(),
Propagate persisted maxOutputTokens to all loop constructors
This only seeds the persisted cap for the TUI loop. I checked the other production CacheFirstLoop constructors that already load persisted per-turn settings (src/cli/commands/desktop.ts buildRuntimeFor loads loadReasoningEffort()/loadMaxIterPerTurn(), and src/cli/commands/run.ts loads loadMaxIterPerTurn()), and neither passes loadMaxOutputTokens(), so a user who saved /max-tokens 4096 can restart into desktop or reasonix run and silently get uncapped model calls despite the config field being persisted. Please thread loadMaxOutputTokens() through those runtime constructors as well so the saved cap is consistently honored.
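One way to avoid this kind of drift, sketched under assumptions: route every entry point through a single helper that maps persisted per-turn settings onto loop options. `PersistedConfig`, `LoopOptions`, `persistedTurnSettings`, and `buildLoopOptions` are hypothetical names, not the repository's actual API, and which settings each real entry point loads may differ from this sketch.

```typescript
// Hypothetical stand-ins for the persisted config and the loop options.
interface PersistedConfig {
  reasoningEffort?: string;
  maxIterPerTurn?: number;
  maxOutputTokens?: number;
}

interface LoopOptions {
  entryPoint: "tui" | "desktop" | "run";
  reasoningEffort?: string;
  maxIterPerTurn?: number;
  maxOutputTokens?: number;
}

// Single place that maps persisted per-turn settings onto loop options,
// so a newly added setting cannot be forgotten by one entry point.
function persistedTurnSettings(
  cfg: PersistedConfig,
): Omit<LoopOptions, "entryPoint"> {
  return {
    reasoningEffort: cfg.reasoningEffort,
    maxIterPerTurn: cfg.maxIterPerTurn,
    maxOutputTokens: cfg.maxOutputTokens,
  };
}

function buildLoopOptions(
  entryPoint: LoopOptions["entryPoint"],
  cfg: PersistedConfig,
): LoopOptions {
  return { entryPoint, ...persistedTurnSettings(cfg) };
}

// A user who saved /max-tokens 4096 gets the cap in every runtime.
const saved: PersistedConfig = { maxOutputTokens: 4096 };
const allOptions = (["tui", "desktop", "run"] as const).map((entry) =>
  buildLoopOptions(entry, saved),
);
console.log(allOptions.map((o) => `${o.entryPoint}:${o.maxOutputTokens}`).join(" "));
```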
esengine left a comment
Reviewed — clean and correctly plumbed. maxOutputTokens flows config → loop.configure → this.maxOutputTokens → maxTokens: this.maxOutputTokens on the client.chat call (both call sites), with undefined = no cap (server default). The /max-tokens handler covers bare (show), a number (set), and off (clear), persists to config, and i18n is in EN/de/zh-CN. Closes a real gap (#2196 — runaway reasoning burning 80K+ tokens). CI green. Merging.
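The three handler cases named in this review (bare, a number, off) could be parsed roughly as follows. This is a sketch, not the handler in src/cli/ui/slash/handlers/model.ts; `MaxTokensAction` and `parseMaxTokensArg` are hypothetical names, and rejecting zero or non-integer input is an assumption about the intended validation.

```typescript
// Hypothetical result type for the /max-tokens argument.
type MaxTokensAction =
  | { kind: "show" } // bare command: display the current cap
  | { kind: "set"; value: number } // positive integer: set the cap
  | { kind: "clear" } // "off": remove the cap (undefined = server default)
  | { kind: "error"; message: string };

function parseMaxTokensArg(raw: string | undefined): MaxTokensAction {
  const arg = (raw ?? "").trim().toLowerCase();
  if (arg === "") return { kind: "show" };
  if (arg === "off") return { kind: "clear" };
  // Assumption: only plain positive integers are accepted.
  if (!/^\d+$/.test(arg)) {
    return { kind: "error", message: `expected a number or "off", got "${arg}"` };
  }
  const value = Number(arg);
  if (!Number.isSafeInteger(value) || value <= 0) {
    return { kind: "error", message: "cap must be a positive integer" };
  }
  return { kind: "set", value };
}

// Maps a parsed action onto the value stored on the loop and in config.
function nextCap(current: number | undefined, action: MaxTokensAction): number | undefined {
  if (action.kind === "set") return action.value;
  if (action.kind === "clear") return undefined;
  return current; // "show" and "error" leave the cap unchanged
}

console.log(parseMaxTokensArg("4096"), parseMaxTokensArg("off"), parseMaxTokensArg(""));
```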
Problem
Closes #2196. There was no way to limit how many tokens the model generates per turn. Users with long reasoning tasks could burn 80K+ output tokens in a single turn with no recourse.
Change
Adds a /max-tokens <N|off> slash command (in the setup group, next to /effort and /budget) and a persisted maxOutputTokens config field.
The value is sent as max_tokens in the API request and persists across restarts via ~/.reasonix/config.json.
Files changed:
- src/config.ts: field + loadMaxOutputTokens / saveMaxOutputTokens
- src/loop.ts: maxOutputTokens on options + mutable field + configure() support
- src/loop/streaming.ts: forwarded through StreamModelOptions
- src/cli/ui/slash/commands.ts: new /max-tokens command definition
- src/cli/ui/slash/handlers/model.ts: handler implementation
- src/cli/ui/App.tsx: seeded from config on loop construction
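To illustrate how the cap might reach the wire, here is a small sketch of a request-body builder that includes max_tokens only when a cap is set, so that an unset cap falls back to the server default. `buildRequestBody` and the model name are hypothetical; the actual client code in this repository is not shown in the PR description.

```typescript
interface WireMessage {
  role: string;
  content: string;
}

// Hypothetical builder: omits max_tokens entirely when no cap is set,
// rather than sending null or 0.
function buildRequestBody(
  model: string,
  messages: WireMessage[],
  maxTokens?: number,
): Record<string, unknown> {
  const body: Record<string, unknown> = { model, messages };
  if (maxTokens !== undefined) {
    body.max_tokens = maxTokens;
  }
  return body;
}

const msgs: WireMessage[] = [{ role: "user", content: "hello" }];
const withCap = buildRequestBody("example-model", msgs, 4096);
const withoutCap = buildRequestBody("example-model", msgs);
console.log(JSON.stringify(withCap));
console.log(JSON.stringify(withoutCap));
```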