Skip to content

Add opt-in Pro Plan routing profile#1865

Closed
dumbjack wants to merge 8 commits into
Hmbown:mainfrom
dumbjack:codex/pro-plan-mode
Closed

Add opt-in Pro Plan routing profile#1865
dumbjack wants to merge 8 commits into
Hmbown:mainfrom
dumbjack:codex/pro-plan-mode

Conversation

@dumbjack

@dumbjack dumbjack commented May 21, 2026

Copy link
Copy Markdown

Summary

  • Add Pro Plan as an explicit /mode pro-plan routing profile, not a fourth default mode in the Tab cycle or /mode picker.
  • Keep the existing Plan Confirmation gate: the user chooses Pro Plan and chooses whether accepted execution uses Agent-style approvals or a one-pass YOLO execution.
  • Route phases across existing mode semantics: Plan/Review use the resolved Pro model with read-only Plan policy; Execute uses the resolved Flash model with Agent policy, or temporary YOLO policy only for the accepted execution pass.
  • Fail closed when routing is unavailable or bypassed: raw AppMode::ProPlan maps to read-only Plan behavior, and missing Flash routes fall back to the resolved Pro model.
  • Reduce false positives and loop risk: the planning trigger now only wraps clearly implementation-like requests, and review-requested reimplementation clears temporary YOLO auto-approval before returning to Execute.

Why this helps

This follows the same broad workflow idea as Claude Code's public Opus Plan pattern: use the strongest model where planning and review have the most leverage, then use a faster model for the implementation pass. In CodeWhale this is deliberately opt-in so the main model/menu surface stays neutral and users can tell when they are choosing a workflow versus when CodeWhale is choosing a phase model inside that workflow.

The profile is intended for larger or riskier coding tasks, not as a default for tiny edits.

Safety / Reviewability

  • Default visible modes remain Plan, Agent, and YOLO.
  • /mode pro-plan is the explicit entry point; numeric mode shortcuts remain 1, 2, and 3 for the visible picker rows.
  • Plan/Review are read-only and reuse existing Plan prompt/policy paths.
  • Execute uses existing Agent/YOLO policy paths after the plan confirmation choice.
  • Interrupted or failed turns do not advance Pro Plan phase state.
  • Entering Pro Plan saves auto_model, disables it while routed by phase, and restores it on exit.
  • User-facing status/help/PlanPrompt strings remain localized.

Validation

  • cargo fmt --all -- --check
  • cargo check -p codewhale-tui
  • cargo test -p codewhale-tui pro_plan --quiet (20 passed)
  • cargo test -p codewhale-tui plan_prompt --quiet (6 passed)
  • cargo test -p codewhale-tui localization --quiet (8 passed)
  • cargo test -p codewhale-tui --quiet (full codewhale-tui pass: 4006 passed, 4 ignored, plus integration targets)

Greptile Summary

This PR introduces AppMode::ProPlan, a new opt-in routing profile that plans and reviews with deepseek-v4-pro and executes with deepseek-v4-flash (falling back to the Pro model when Flash is unavailable on the active provider). It wires the existing Plan Confirmation gate into the flow so execution never starts without explicit user approval.

  • State machine: A new ProPlanRouter (pro_plan.rs) drives the four-phase lifecycle (Plan → Execute → Review → Done) via XML markers emitted by the model; phase transitions are gated on TurnOutcomeStatus::Completed only, preventing drift on aborted turns.
  • Tool enforcement: should_force_update_plan_step replaces the old turn.tool_calls.is_empty() check, so the update_plan gate stays active until a successful call is made rather than lifting after any first tool error.
  • Provider routing: ProPlanConfig::for_provider resolves and validates the Flash route at mode-entry time, defaulting to Pro when the active provider does not advertise Flash.
  • Localization: New MessageId variants cover all user-visible Pro Plan strings across six shipped locales; model-facing instructions remain hardcoded English.

Confidence Score: 5/5

Safe to merge. The raw ProPlanMode fails closed to read-only tools and Never approval on every path that bypasses phase resolution, and phase transitions are correctly gated on completed turns only.

The new state machine, provider routing, tool-gate enforcement, and UI wiring are all well-covered by tests. The same-mode guard in set_mode prevents restore-value corruption. The one notable finding — common action words triggering unnecessary planning-instruction injection on question turns — does not produce incorrect behaviour because the injected instruction tells the model to answer normally for questions.

crates/tui/src/tui/ui.rs — the should_add_pro_plan_planning_instruction word-list would benefit from a question-start short-circuit to reduce context noise on long planning sessions.

Important Files Changed

Filename Overview
crates/tui/src/tui/pro_plan.rs New file: ProPlanRouter state machine with four phases (Plan/Execute/Review/Done), provider-aware model config, and XML-marker-based transitions. Well-tested with 10 unit tests covering all phase edges including replan and review rejection.
crates/tui/src/tui/ui.rs Core event loop integration for ProPlan: phase-aware mode resolution, planning instruction injection, and automatic follow-up dispatch. The should_add_pro_plan_planning_instruction helper uses a broad single-word list that can false-positive on question-framed messages containing common words like "update", "add", or "fix".
crates/tui/src/core/engine/dispatch.rs Adds should_force_update_plan_step which correctly closes a hole in the old enforcement: blocked tool calls no longer satisfy the first-step check, so the gate stays active until update_plan actually succeeds.
crates/tui/src/core/engine/turn_loop.rs Replaces the old turn.tool_calls.is_empty() gate with should_force_update_plan_step, adding a mid-turn reminder injection when non-update_plan tools are attempted. Behavior is more strictly enforced but bounded by the existing loop guard.
crates/tui/src/tui/app.rs Adds ProPlan variant to AppMode, pro_plan_router and pro_plan_restore_auto_model fields, and all required match exhaustiveness. The same-mode guard (previous_mode == mode) prevents restore-value corruption on re-entrant set_mode calls.
crates/tui/src/tui/plan_prompt.rs Migrates all hardcoded string literals to MessageId lookups and adds a Locale field. PlanPromptView::new is correctly restricted to #[cfg(test)]; production callers use new_for_locale_with_plan.
crates/tui/src/core/engine/tool_setup.rs ProPlan added to the read_only_mode group alongside Plan, ensuring it gets read-only file tools and slop-ledger-read-only access by default.
crates/tui/src/localization.rs Adds 25 new MessageId variants with translations across English, Vietnamese, Japanese, Simplified Chinese, Brazilian Portuguese, and Spanish (LatAm). ALL_MESSAGE_IDS array kept in sync.
crates/tui/src/tui/ui/tests.rs Adds targeted tests for ProPlan: phase tracking, aborted-turn safety (Interrupted/Failed do not advance phase), planning-instruction gating, and YOLO accept flow.
crates/tui/src/prompts.rs ProPlan reuses PLAN_MODE prompt and ApprovalMode::Never, so raw engine turns stay read-only by default.

Sequence Diagram

sequenceDiagram
    participant U as User
    participant UI as TUI
    participant R as ProPlanRouter
    participant EP as Engine Plan tools
    participant EA as Engine Agent tools
    participant M as Model

    U->>UI: /mode pro-plan
    UI->>R: new(ProPlanConfig::for_provider)
    UI-->>U: "Phase=Plan, Model=deepseek-v4-pro"

    U->>UI: Add feature X
    UI->>UI: append pro_plan_planning instruction
    UI->>EP: "SendMessage(mode=Plan, model=pro)"
    EP->>M: request with update_plan gate
    M->>EP: calls update_plan
    EP-->>UI: TurnCompleted
    UI->>R: apply_turn_completion sets has_generated_plan
    UI-->>U: Plan Confirmation prompt

    U->>UI: Accept plan
    UI->>R: start_execution(auto_approve)
    UI->>EA: "follow-up Proceed mode=Agent model=flash"
    EA->>M: execute with write tools
    M->>EA: changes plus execute_complete marker
    EA-->>UI: TurnCompleted
    UI->>R: transition to Review
    UI->>EP: "auto follow-up Review implementation mode=Plan model=pro"
    EP->>M: review read-only
    M->>EP: review approved or changes_requested
    EP-->>UI: TurnCompleted
    UI->>R: transition to Done or Execute
    UI-->>U: PRO-PLAN Done
Loading

Fix All in Codex Fix All in Claude Code Fix All in Cursor

Reviews (3): Last reviewed commit: "Merge remote-tracking branch 'origin/mai..." | Re-trigger Greptile

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces 'Pro Plan' mode, a model-routing workflow that utilizes deepseek-v4-pro for planning and review phases and deepseek-v4-flash for implementation. It includes a new ProPlanRouter state machine to manage phase transitions and updates the TUI and engine to enforce plan confirmation, handle automatic model switching, and provide phase-specific status updates. Feedback identifies the use of unstable Rust 'let chains' which could cause compilation errors on stable toolchains. Additionally, several new user-facing strings and labels are hardcoded in English and should be moved to the localization system to maintain internationalization standards.

Comment thread crates/tui/src/tui/ui.rs Outdated
Comment on lines +740 to +741
if let Some(router) = app.pro_plan_router.as_mut()
&& router.phase() == crate::tui::pro_plan::ProPlanPhase::Done

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The use of 'let chains' (if let ... && ...) is an unstable Rust feature (see RFC 2497). Unless this repository explicitly requires a nightly toolchain via a rust-toolchain.toml file, this will cause compilation errors on stable Rust. Please refactor these to use nested if let statements or matches!. This pattern also appears on lines 1582, 1633, and 5590.

Comment thread crates/tui/src/tui/ui.rs Outdated
Comment on lines +1640 to +1645
crate::tui::pro_plan::ProPlanFollowUp::ReviewImplementation => {
"Review the implementation against the accepted plan. Do not edit files during review. If it is correct, include `<pro_plan review=\"approved\">`; if changes are needed, include `<pro_plan review=\"changes_requested\">` and list the fixes."
}
crate::tui::pro_plan::ProPlanFollowUp::AddressReviewFeedback => {
"Address the review feedback using the accepted plan, then summarize the changes and include `<pro_plan execute_complete=\"true\">`."
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

These follow-up messages and status labels are hardcoded in English. To maintain consistency with the project's internationalization (i18n) standards, these strings should be moved to MessageId in crates/tui/src/localization.rs and retrieved using the tr() function. This applies to several new user-facing strings in this file, including:

  • Phase labels (lines 1654-1657)
  • System messages (lines 5532, 5566, 5601)
  • Tool blocking status (line 1976)

Comment thread crates/tui/src/tui/plan_prompt.rs Outdated
Comment on lines 12 to 21
("Accept plan", "Start implementation with approvals"),
(
"Accept plan (YOLO)",
"Start implementation in YOLO mode (auto-approve)",
"Start implementation with auto-approval",
),
("Revise plan", "Ask follow-ups or request plan changes"),
(
"Exit Plan mode",
"Exit to Agent",
"Return to Agent mode without implementation",
),

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The plan option labels and descriptions are hardcoded English strings. These should be localized using the established MessageId and tr() pattern to ensure they are translated correctly for users in different locales.

@Hmbown Hmbown added this to the v0.8.44 milestone May 21, 2026
@dumbjack

Copy link
Copy Markdown
Author

Addressed the current review pass in e915c79/e5efe93:

  • Refactored the new ProPlan let-chain sites in tui.rs into nested control flow to avoid toolchain/reviewer ambiguity.
  • Routed the Plan Confirmation modal through the existing localization registry and added translations for shipped locales.
  • Localized the ProPlan phase status labels. The queued follow-up strings remain English intentionally because they are model-steering instructions with exact <pro_plan ...> control tags, not user-facing UI chrome.
  • Merged latest origin/main and resolved the ui.rs approval conflict while preserving ProPlan read-only turn blocking.

Local validation:

  • cargo fmt
  • cargo check -p deepseek-tui
  • cargo test -p deepseek-tui localization --quiet
  • cargo test -p deepseek-tui plan_prompt --quiet
  • cargo test -p deepseek-tui pro_plan --quiet
  • cargo test -p deepseek-tui --quiet

@dumbjack dumbjack changed the title [codex] Harden Pro Plan mode [codex] Add Pro Plan mode with review gate May 22, 2026
@dumbjack dumbjack marked this pull request as ready for review May 22, 2026 08:30
@dumbjack

dumbjack commented May 22, 2026

Copy link
Copy Markdown
Author

Polished this for reviewability in 872691c:

  • Removed the unlinked benchmark/eval notes from the PR diff, reducing the review surface by 400+ lines while keeping the user-facing Pro Plan docs.
  • Reworded Pro Plan docs/comments to describe the feature neutrally as plan/execute/review routing.
  • Made Pro Plan model display show the currently routed model (pro-plan: deepseek-v4-pro / pro-plan: deepseek-v4-flash).
  • Added a regression test proving Pro Plan temporarily disables auto-model routing and restores the prior auto-model state on exit.
  • Re-ran cargo fmt, cargo check -p deepseek-tui, cargo test -p deepseek-tui pro_plan --quiet, and cargo test -p deepseek-tui --quiet.

The PR is now marked ready for review.

@dumbjack dumbjack changed the title [codex] Add Pro Plan mode with review gate Add review-gated Pro Plan mode May 22, 2026
@dumbjack dumbjack changed the title Add review-gated Pro Plan mode Add Pro Plan model routing for plan-first changes May 22, 2026
@Hmbown

Hmbown commented May 27, 2026

Copy link
Copy Markdown
Owner

Independent review:

Read the routing logic (crates/tui/src/tui/pro_plan.rs, tui/ui.rs::effective_mode_for_turn, engine tests). Notes:

Design. 4-state FSM (Plan->Execute->Review->Done) gated on explicit <pro_plan ...> XML control markers, not natural language. Plan/Review run deepseek-v4-pro; Execute runs deepseek-v4-flash. Mode resolution maps Plan/Review->AppMode::Plan (read-only), Execute->AppMode::Agent (or Yolo if auto-approved). Raw AppMode::ProPlan fails closed to Plan tools/sandbox (good).

Cost. Each full cycle is Pro(plan) + Flash(exec) + Pro(review) - measurably more expensive than auto-model on small tasks, cheaper than all-Pro. Important: entering ProPlan force-disables auto_model and restores on exit (test set_mode_pro_plan_temporarily_disables_auto_model_and_restores_on_exit), so it's opt-in not implicit. Single-turn Q&A in Plan phase doesn't trigger plan confirmation.

Mode interaction. ProPlan added to the mode cycle (Plan->Agent->YOLO->ProPlan->Plan). Auto-model save/restore is the only state-coupling risk - if mode-set panics mid-transition, restore won't run.

Tests. 10 unit tests on the FSM, 1 integration test on auto-model save/restore, fail-closed tests for AppMode::ProPlan registry + sandbox. Solid coverage of the decision tree; no test for the cost-routing fail-closed if the v4-flash model is unavailable.

v0.8.48 merge. CONFLICTING: ui.rs (156-line block - new ProPlan helpers vs PR #2256 sidebar work), plan_prompt.rs (2), engine/tests.rs (1), README, docs/MODES.md. All mechanical but ui.rs touches the same dispatch path as v0.8.48.

@Hmbown

Hmbown commented May 31, 2026

Copy link
Copy Markdown
Owner

Thanks @dumbjack — I re-checked this during the harvest pass. I am leaving it unmerged for now because it is a large mode/engine/UI change, currently conflicts with main, and only has the lightweight security check on the current PR surface.

What would make this mergeable:

  • rebase/merge current main and resolve the TUI/engine conflicts
  • run the current package names/checks (cargo fmt --all -- --check, focused pro_plan tests, and a broader codewhale-tui check/test pass)
  • confirm the earlier review items are still handled on stable Rust and through localization rather than hardcoded user-facing strings

The idea is still interesting; it just needs a fresh integration pass before I can responsibly land it.

@Hmbown

Hmbown commented May 31, 2026

Copy link
Copy Markdown
Owner

Thanks @dumbjack. I rechecked this against current main: the Pro Plan direction still fits CodeWhale well, especially the DeepSeek-native plan/review vs execute routing, but the branch is now conflicting across the engine, tool setup, plan prompt, UI, README, and modes docs.

I wouldn’t want to merge or partially harvest this until it is rebased and revalidated on the current codewhale-tui surfaces. The safest next step is a fresh integration pass that keeps the ProPlanRouter tests, re-resolves the current tool-policy/UI conflicts, and reruns cargo fmt --all -- --check, focused pro_plan tests, and a broader codewhale-tui check/test pass.

@dumbjack dumbjack changed the title Add Pro Plan model routing for plan-first changes Add Pro Plan mode for Pro planning/review and Flash execution May 31, 2026
@dumbjack

Copy link
Copy Markdown
Author

Fresh integration pass pushed in db795e7.

What changed for the current review concerns:

  • Merged current main and resolved the engine/tool setup/PlanPrompt/UI/README/MODES conflicts against the current CodeWhale surfaces. The PR now reports mergeable.
  • Kept ProPlanRouter and the explicit marker-driven Plan -> Execute -> Review -> Done flow, while preserving the current PlanPrompt snapshot rendering and sidebar/task UI changes from main.
  • Added provider-aware Flash route fail-closed behavior: if the active provider does not advertise a usable deepseek-v4-flash route, Execute falls back to the resolved Pro model instead of sending an unavailable model id. Covered with a Fireworks fallback test and an OpenRouter Flash route test.
  • Kept raw AppMode::ProPlan fail-closed in tool registry, SlopLedger tools, RLM/FIM exposure, sandbox policy, and approval blocking.
  • Moved the remaining user-facing Pro Plan/PlanPrompt/read-only status strings through localization, including the new Vietnamese locale from current main.
  • Updated PR title/body and docs to describe the feature neutrally as DeepSeek-native plan/review vs execute routing.

Validation on stable Rust 1.95.0:

  • cargo fmt --all -- --check
  • cargo check -p codewhale-tui
  • cargo test -p codewhale-tui pro_plan --quiet
  • cargo test -p codewhale-tui plan_prompt --quiet
  • cargo test -p codewhale-tui localization --quiet
  • cargo test -p codewhale-tui --quiet

Note: the first broad test pass hit a one-off local MCP SSE connection reset in mcp::tests::legacy_sse_closed_stream_reconnects_and_retries_tool_call; the focused rerun passed immediately, and the full codewhale-tui rerun passed cleanly.

Comment thread crates/tui/src/tui/ui.rs Outdated
Comment thread crates/tui/src/commands/core.rs
Comment thread crates/tui/src/tui/pro_plan.rs Outdated
@Hmbown

Hmbown commented May 31, 2026

Copy link
Copy Markdown
Owner

Thanks @dumbjack, this is much closer now. I rechecked the latest head and the Pro Plan design still fits CodeWhale well: Pro for plan/review, Flash for execution, with the existing Plan Confirmation gate and fail-closed read-only paths.\n\nI’m going to hold off on merging until the required CodeWhale checks actually run and pass on this head. GitHub reports the branch as mergeable, but the PR is still blocked because the required ruleset checks are missing here. Before landing I’d also like one final maintainer pass over the broad planning-trigger heuristic and the YOLO re-implementation loop semantics.\n\nSuggested validation: cargo fmt --all -- --check, cargo check -p codewhale-tui, focused pro_plan / plan_prompt / localization tests, and a full cargo test -p codewhale-tui --quiet.

@Hmbown

Hmbown commented Jun 1, 2026

Copy link
Copy Markdown
Owner

Thank you for exploring the Pro Plan idea. The general goal still fits: CodeWhale should help route work thoughtfully across model strengths. I am leaving this unmerged for now because the PR is blocked and the current product direction is to be less opinionated in the main model/menu surface, giving users clearer control over choices.

A refreshed version could be very compelling if it moves the policy-heavy routing into opt-in profiles or subagent defaults, keeps the main menu neutral, and explains when the tool is choosing versus when the user is choosing. That would keep the useful intelligence without making the default experience feel over-prescribed.

# Conflicts:
#	crates/tui/src/commands/config.rs
#	crates/tui/src/core/engine/tests.rs
#	crates/tui/src/localization.rs
#	crates/tui/src/tui/app.rs
#	crates/tui/src/tui/ui.rs
#	docs/MODES.md
@dumbjack dumbjack changed the title Add Pro Plan mode for Pro planning/review and Flash execution Add opt-in Pro Plan routing profile Jun 7, 2026
@Hmbown

Hmbown commented Jun 10, 2026

Copy link
Copy Markdown
Owner

Closing, @dumbjack — thank you for the Pro Plan routing exploration. As discussed earlier in the thread, automatic plan-based routing doesn't match the direction we want for default behavior. If you'd like to rework it as an explicit opt-in routing profile (config-gated, no default change), we'd welcome that as a fresh PR.

@Hmbown Hmbown closed this Jun 10, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants