refactor(config): drop preset abstraction, expose model + effort directly by esengine · Pull Request #1657 · esengine/DeepSeek-Reasonix

esengine · 2026-05-24T05:25:25Z

Summary

Presets bundled (model, reasoning_effort, thinking) under names like flash / pro / auto. The bundling forced "max" as the cap and silently broke OpenAI-compatible endpoints that only accept the standard literal set — a user reported their self-hosted vLLM rejecting requests with:

1 validation error: reasoning_effort — Input should be 'none', 'low', 'medium' or 'high'

This PR rips out the preset layer and exposes model + effort directly across all three surfaces.

- ReasoningEffort widens to low | medium | high | max
- Default cap is now "high", accepted by every OpenAI-compatible endpoint (vLLM, Azure, DeepSeek). "max" stays available; users opt in knowing it's a DeepSeek extension
- Dashboard + Desktop: settings panel shows a model picker + four effort buttons; both apply live through the existing settings bridge
- CLI: /model and /effort slash commands; ModelPicker shows EFFORT and MODELS sections with full enumeration
- Config: preset field removed, model persisted directly
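To make the shape of the change concrete, here is a minimal sketch of a widened effort type with "high" as the fallback. Only the four literals and the "high" default come from this PR description; `DEFAULT_EFFORT`, `isReasoningEffort`, `Settings`, and `resolveSettings` are hypothetical names chosen for illustration, not the project's actual code.

```typescript
// Illustrative sketch only: names other than the four effort literals are assumptions.
type ReasoningEffort = "low" | "medium" | "high" | "max";

const EFFORTS: readonly ReasoningEffort[] = ["low", "medium", "high", "max"];

// "high" is the default because the standard OpenAI-compatible literal set accepts it.
const DEFAULT_EFFORT: ReasoningEffort = "high";

// Type guard: narrows an untrusted value (e.g. read from a config file) to ReasoningEffort.
function isReasoningEffort(value: unknown): value is ReasoningEffort {
  return typeof value === "string" && EFFORTS.some((e) => e === value);
}

interface Settings {
  model: string;
  effort: ReasoningEffort;
}

// Resolve persisted settings with no preset lookup: model and effort are read
// directly, and anything missing or unrecognised falls back to a safe default.
function resolveSettings(
  raw: { model?: unknown; effort?: unknown },
  fallbackModel: string,
): Settings {
  return {
    model:
      typeof raw.model === "string" && raw.model.length > 0
        ? raw.model
        : fallbackModel,
    effort: isReasoningEffort(raw.effort) ? raw.effort : DEFAULT_EFFORT,
  };
}
```

With this shape, a config that never mentions effort resolves to "high", and "max" is only sent when the user has written it explicitly.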

55 files changed, +890 / -1153.

Test plan

npm run typecheck (root + dashboard) clean
npm run lint clean (2 pre-existing warnings unrelated to this PR)
npm test — 3579 pass, 0 fail
npm run build:dashboard produces the expected dist/ (verified via dashboard-smoke tests)
Manual: /model and /effort in CLI apply live and persist across restart
Manual: dashboard settings page model + effort changes take effect immediately
Manual: desktop settings page model + effort changes take effect immediately
Manual: pointing CLI at a vLLM endpoint with default config succeeds (no more "max" rejection)

Closes the user-reported vLLM 400.

…ctly

Presets bundled (model, reasoning_effort, thinking) under names like flash/pro/auto. The bundling forced "max" as the cap and silently broke OpenAI-compatible endpoints that only accept the standard literal set (vLLM rejects "max" with a 400; saw it in the wild against a self-hosted DeepSeek-V4-Pro).

Direct controls instead:

- ReasoningEffort widens to low | medium | high | max
- Default cap is now "high", the value every OpenAI-compatible endpoint accepts; users opt into "max" knowing it's DeepSeek-only
- Dashboard + desktop settings: model picker + effort buttons, both apply live through the existing settings bridge
- CLI: /model and /effort slash commands; ModelPicker shows EFFORT and MODELS sections with full enumeration
- Config: preset field removed, model field persisted directly

…moved, persisted usage stats, plan dispatch gate

Headline themes:

- Desktop: bundle the CLI-hosted React dashboard, retire Tauri+Preact duplicate (#1418)
- Config: drop preset abstraction; flash/pro are direct model selections (#1657, #1630)
- Stats: persist cumulative usage to session meta + auto-restore on startup (#1667, #1680, #1643, #1628)
- Plans: editMode="plan" enforced at the ToolRegistry dispatch gate (#1681); step advance fix (#1629)
- Context: fold once at turn start, drop pre-flight + byte-ceiling (#1642, #1646); collapsible compacted card (#1649)
- Subagents: per-skill flash/pro override + Settings UI (#1632)
- Desktop polish: sidebar drag-resize (#1688), responsive collapse (#1585), copy/edit overlay + msg-history nav (#1645), Esc closes modal not turn (#1685), QQ tab isolation (#1672), DiffCard for edits (#1662), theme-aware highlighting (#1655), system events toggle (#1654/#1650), macOS TCC inheritance (#1614), dashboard.enabled (#1612)
- Dashboard polish: persistent session URL (#1586, #1589, #1599), theme-aware highlighting (#1664), IME confirm-enter guard (#1689), code-fence lang fix (#1677), vendor chunk split (#1587), markdown table h-scroll (#1562)
- TUI: Alt+S input stash/recall; static history isolated from input rerenders (#1635); legacy mouse drop (#1637, #1648); multi-edit gated in review (#1647)
- Diff: SplitDiff column border holds under CJK (#1686)
- MCP: workspace roots passed to servers (#1625); codeCommand honors mcpServers (#1603)
- Config plumbing: (baseUrl, apiKey) resolved as a tuple (#1658); stale model id self-heal (#1663)

See CHANGELOG for the full list.

@FriendsHL

Three stale-doc fixes:

- ARCHITECTURE.md §4.3: replace removed /pro single-turn arming with the current /model flash|pro + settings.json model selection. Note the removal in 0.50.0 (#1657, #1630).
- ARCHITECTURE.md §4.4: replace the never-existed FAILURE_ESCALATION_THRESHOLD counter with the actual <<<NEEDS_PRO>>> model self-report mechanism. No failure-counter; purely LLM-initiated, no-op on pro tier.
- benchmarks/real-world-cache/README.md: fix 10× pricing error in v4-flash cache-hit ($0.028 → $0.0028) and entirely wrong v4-pro pricing ($0.139/1.667/3.333 → $0.003625/0.435/0.87). Recalculated cost tables; headline 99.82% hit ratio unchanged, savings now correctly show ~97.7% (flash) / ~98.9% (pro).

Thanks @FriendsHL for catching this. The benchmark pricing in particular is the public cache-first defense link; the old numbers would have been embarrassing.

#1657 dropped the preset abstraction and exposed reasoning effort directly, keeping `max` as a DeepSeek-only extension that users opt into knowing standard OpenAI / vLLM / Azure reject it with 400. The TUI still advertised `max` everywhere (`/effort` argsHint, slash-arg picker, ModelPicker effort rows, /effort handler), even when the active endpoint was a third-party host that can't accept it. Users on those endpoints saw `max` in every suggestion and reported it as a preset-era leftover (#1794).

Endpoint-aware filter: when `loop.client.baseUrl` is not api.deepseek.com, drop `max` from the choices the TUI surfaces:

- `/effort` argsHint and argCompleter (autocomplete + arg picker)
- ModelPicker effort rows
- /effort handler's accept list + status / usage message
- new `effortUsageNoMax` i18n key (EN / zh-CN / de) so the error on bad input doesn't itself name `max` as an option

`max` stays available on DeepSeek endpoints; that's the design from #1657, just no longer visible where it would 400.

Fixes #1794.

Co-authored-by: reasonix <reasonix@deepseek.com>
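The endpoint-aware filter described above can be sketched roughly as follows. This is an illustration under assumptions, not the code from #1798: `isDeepSeekEndpoint` and `visibleEfforts` are hypothetical names, and the host check uses a simple regular expression rather than whatever parsing the project actually does.

```typescript
// Illustrative sketch only: function names and the host-extraction regex are assumptions.
type ReasoningEffort = "low" | "medium" | "high" | "max";

const ALL_EFFORTS: readonly ReasoningEffort[] = ["low", "medium", "high", "max"];

// Extract the host from a base URL (scheme://[userinfo@]host[:port]/...) and
// compare it with the official DeepSeek API host. Anything that does not
// parse is treated as a third-party endpoint.
function isDeepSeekEndpoint(baseUrl: string): boolean {
  const match = /^[a-z][a-z0-9+.-]*:\/\/(?:[^\/?#@]*@)?([^\/:?#]+)/i.exec(baseUrl);
  return match !== null && match[1].toLowerCase() === "api.deepseek.com";
}

// Efforts the UI should offer for the active endpoint: "max" is hidden
// wherever the server would be expected to reject it.
function visibleEfforts(baseUrl: string): ReasoningEffort[] {
  return isDeepSeekEndpoint(baseUrl)
    ? [...ALL_EFFORTS]
    : ALL_EFFORTS.filter((e) => e !== "max");
}
```

A single helper of this kind could feed every surface listed above (argument hints, completer, picker rows, and the handler's accept list), so the set of visible choices and the set of accepted values cannot drift apart.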

esengine merged commit 88fc19d into main May 24, 2026
4 checks passed

esengine deleted the worktree-agent-a35113ae branch May 24, 2026 05:37

FriendsHL mentioned this pull request May 25, 2026

docs: sync ARCHITECTURE.md and benchmark pricing to match code #1720

Merged

This was referenced May 25, 2026

The /effort command cannot be used #1284

Closed

fix(tui): hide max effort on non-DeepSeek endpoints #1798

Merged

esengine merged 1 commit into main from worktree-agent-a35113ae
