feat(cli): enhance prompt handling and synthetic task follow-up #2797

Merged: Devesh36 merged 3 commits into Tracer-Cloud:main from Devesh36:improvements on Jun 12, 2026
@Devesh36

Collaborator

This pull request introduces significant improvements to how the interactive CLI prompt behaves, especially in terms of contextual placeholder text, prompt prefill after synthetic test failures, and overall session state management. The changes enhance user experience by making the prompt more informative and responsive to session events, and by consolidating logic related to synthetic test follow-ups. Additionally, several code paths are refactored for clarity, and new tests are added to ensure correct behavior.

Prompt placeholder and refresh improvements:

  • Added resolve_prompt_placeholder to generate dynamic placeholder text based on session state (e.g., trust mode, running tasks, resumed session), replacing the previous static placeholder.
  • Introduced wire_prompt_refresh, allowing the prompt to be programmatically refreshed/redrawn when session state changes.
  • The prompt now uses a callable for the placeholder, ensuring it always reflects the latest session state.
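
To make the callable-placeholder idea concrete, here is a minimal sketch. It is not the PR's resolve_prompt_placeholder: the SessionState fields, the hint wording, and the default text are all assumptions chosen for illustration. The point is only that the prompt holds a callable, so each redraw re-reads the session state.

```python
from dataclasses import dataclass


@dataclass
class SessionState:
    """Hypothetical stand-in for the session fields a placeholder might read."""
    trust_mode: bool = False
    running_tasks: int = 0
    resumed: bool = False


def resolve_placeholder(state: SessionState) -> str:
    """Build placeholder text from the current session state (illustrative wording)."""
    hints = []
    if state.trust_mode:
        hints.append("trust mode on")
    if state.running_tasks:
        noun = "task" if state.running_tasks == 1 else "tasks"
        hints.append(f"{state.running_tasks} {noun} running")
    if state.resumed:
        hints.append("resumed session")
    if not hints:
        return "Ask a question or type /help"  # assumed default text
    return " | ".join(hints)


state = SessionState()

# Passing a callable instead of a string means the text is recomputed
# every time the prompt is rendered, so it tracks later state changes.
placeholder = lambda: resolve_placeholder(state)
```

Mutating `state` after `placeholder` is created changes what the next call returns, which is the behaviour the bullet above describes.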

Synthetic test failure follow-up and session state management:

  • Moved and refactored logic for suggesting a follow-up prompt and binding the last synthetic observation after a failed synthetic test into new ReplSession methods: suggest_synthetic_failure_follow_up and _bind_last_synthetic_observation. This replaces the previous scattered logic and improves maintainability.
  • The session now notifies the prompt to refresh whenever a synthetic test fails, ensuring the user sees the suggested follow-up immediately.

Task registry and session enhancements:

  • Added a running_count method to TaskRegistry to efficiently count currently running tasks, supporting the new dynamic placeholder.
  • The session tracks a prompt_refresh_fn callback, which is invoked to update the prompt when needed.
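
A lock-guarded count along these lines might look like the sketch below. The Task class, its status strings, and the add method are hypothetical; only the name running_count comes from the PR.

```python
import threading
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    status: str = "running"  # assumed status values: "running", "done", "failed"


class TaskRegistry:
    """Illustrative registry; the real class has more responsibilities."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._tasks: list[Task] = []

    def add(self, task: Task) -> None:
        with self._lock:
            self._tasks.append(task)

    def running_count(self) -> int:
        # Hold the lock while iterating so a concurrent add() from a
        # background thread cannot mutate the list mid-count.
        with self._lock:
            return sum(1 for t in self._tasks if t.status == "running")
```

Because the placeholder is re-rendered often, the method only counts under the lock and returns an int, without copying or exposing the task list.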

Code cleanup and refactoring:

  • Removed now-unnecessary utility functions (_scenario_id_from_synthetic_suite_name, _try_bind_synthetic_observation) from synthetic_tasks.py and cleaned up related imports and exports.
  • Improved scenario ID extraction by introducing _scenario_id_from_synthetic_label, which handles both flag and label formats.
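
The PR text does not show the two input formats or the validation regex, so the following is only a sketch under assumptions: a flag form such as `--scenario <id>` (or `--scenario=<id>`), a label form such as `synthetic:<id>`, and an ID shape of three digits followed by lowercase slug segments. All three are invented for illustration.

```python
import re
from typing import Optional

# Assumed ID shape. The real _SYNTHETIC_SCENARIO_ID_RE is not shown in this PR text.
SCENARIO_ID_RE = re.compile(r"[0-9]{3}-[a-z0-9]+(?:-[a-z0-9]+)*")


def scenario_id_from_label(label: str) -> Optional[str]:
    """Extract a scenario ID from either assumed format, or return None."""
    tokens = label.split()
    candidate = None
    for i, tok in enumerate(tokens):
        # Assumed flag format: "... --scenario 001-foo"
        if tok == "--scenario" and i + 1 < len(tokens):
            candidate = tokens[i + 1]
            break
        # Assumed flag format with "=": "... --scenario=001-foo"
        if tok.startswith("--scenario="):
            candidate = tok.split("=", 1)[1]
            break
    if candidate is None:
        # Assumed label format: "synthetic:001-foo"
        candidate = label.rsplit(":", 1)[-1].strip()
    # Validate before the ID is ever used to build a file path.
    return candidate if SCENARIO_ID_RE.fullmatch(candidate) else None
```

Validating with fullmatch before returning means a label such as `synthetic:../../etc/passwd` yields None rather than a path fragment.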

Testing improvements:

  • Added comprehensive tests for the new prompt placeholder logic, covering various session states and combinations.
  • Updated session tests to cover the new scenario ID extraction and synthetic follow-up logic.
[Six screenshots attached, taken 2026-06-11 between 10:39 AM and 11:06 AM]

@github-actions

Contributor

Greptile code review

This repo uses Greptile for automated review. Before merge, aim for Confidence Score: 5/5 with zero unresolved review threads — see CONTRIBUTING.md.

Run a review — add a PR comment with:

@greptile review

Give it ~5-10 minutes (sometimes longer) for results, then fix feedback and re-trigger until you reach Confidence Score: 5/5.

Optional: automate with the greploop skill.

@greptile-apps

greptile-apps Bot commented Jun 11, 2026

Contributor

Greptile Summary

This PR consolidates synthetic-test failure handling into new ReplSession methods (suggest_synthetic_failure_follow_up, _bind_last_synthetic_observation) and introduces dynamic prompt placeholder text via resolve_prompt_placeholder and wire_prompt_refresh.

  • suggest_synthetic_failure_follow_up is now the single call-site for queuing the RCA prefill, adding the history-generation guard to background_tasks.py (which previously lacked it) and extracting _scenario_id_from_synthetic_label with proper format validation.
  • wire_prompt_refresh wires a call_soon_threadsafe callback into the session so background threads can atomically inject prefill text into a live prompt and invalidate its placeholder without touching event-loop state directly.
  • TaskRegistry.running_count() provides a lock-safe task count for the hot-path placeholder renderer.
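
The thread-to-event-loop handoff described above can be sketched with plain asyncio. This is not the PR's wire_prompt_refresh: the Session class, the wire_refresh helper, and the dict standing in for the prompt buffer are assumptions, and prompt_toolkit is deliberately left out so the example stays self-contained. Only `loop.call_soon_threadsafe` is the real mechanism named in the review.

```python
import asyncio
import threading


class Session:
    """Hypothetical minimal session holding the refresh callback."""

    def __init__(self) -> None:
        self.prompt_refresh_fn = None
        self.pending_prompt_default = ""

    def notify_prompt_changed(self) -> None:
        if self.prompt_refresh_fn is not None:
            self.prompt_refresh_fn()


def wire_refresh(session: Session, loop: asyncio.AbstractEventLoop, apply) -> None:
    """Install a callback that schedules `apply` on the loop from any thread."""
    def refresh() -> None:
        loop.call_soon_threadsafe(apply)
    session.prompt_refresh_fn = refresh


async def demo():
    loop = asyncio.get_running_loop()
    session = Session()
    buffer = {"text": ""}  # stand-in for the live prompt buffer
    applied = asyncio.Event()

    def apply() -> None:
        # Runs on the event loop thread: consume the prefill and "redraw".
        buffer["text"] = session.pending_prompt_default
        session.pending_prompt_default = ""
        applied.set()

    wire_refresh(session, loop, apply)

    def watcher() -> None:
        # Background thread: set the prefill, then ask the loop to apply it.
        session.pending_prompt_default = "investigate the failed scenario"
        session.notify_prompt_changed()

    thread = threading.Thread(target=watcher)
    thread.start()
    await asyncio.wait_for(applied.wait(), timeout=2)
    thread.join()
    return buffer["text"], session.pending_prompt_default


result = asyncio.run(demo())
```

The background thread never touches the buffer itself. It only schedules work, so all buffer mutation happens on the loop thread.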

Confidence Score: 5/5

Safe to merge — no production bugs introduced by this PR.

The history-generation guard is now consistently applied in both background_tasks.py and synthetic_tasks.py, eliminating the stale-watcher risk that existed before. _scenario_id_from_synthetic_label validates scenario IDs against the same format regex before constructing file paths. The wire_prompt_refresh callback is event-loop-safe via call_soon_threadsafe, and the take_pending_prompt_default / current_buffer.text pairing handles all prompt-lifecycle races correctly.
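
One way to picture the history-generation guard is a counter captured when the watcher starts and compared when it fires. The sketch below is an assumption about the shape of that check; the attribute name history_generation and the make_watcher helper are hypothetical.

```python
class Session:
    """Hypothetical session with a generation counter bumped on reset."""

    def __init__(self) -> None:
        self.history_generation = 0
        self.pending_prompt_default = ""

    def clear(self) -> None:
        # Resetting the session invalidates every watcher started before it.
        self.history_generation += 1
        self.pending_prompt_default = ""


def make_watcher(session: Session, prefill: str):
    """Return a failure callback bound to the generation seen at start."""
    gen_when_started = session.history_generation

    def on_failure() -> bool:
        if session.history_generation != gen_when_started:
            return False  # stale watcher: the session was reset, skip the follow-up
        session.pending_prompt_default = prefill
        return True

    return on_failure
```

A watcher created before `session.clear()` returns False and leaves the prompt untouched, while one created afterwards still queues its prefill.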

No files require special attention.

Important Files Changed

| Filename | Overview |
| --- | --- |
| app/cli/interactive_shell/runtime/session.py | Adds suggest_synthetic_failure_follow_up, _bind_last_synthetic_observation, notify_prompt_changed, and _scenario_id_from_synthetic_label. _SYNTHETIC_SCENARIO_ID_RE is now defined here and also in synthetic_tasks.py (duplication). The blocking observation poll runs in background watcher threads, so it does not stall the event loop. |
| app/cli/interactive_shell/routing/handle_message_with_agent/orchestration/action_executor/background_tasks.py | Adds history_gen_when_watch_started guard and suggest_follow_up flag. Non-timeout path still double-joins output threads (pre-existing), but read_diag is correctly called after the first join at line 153, so stderr is fully captured before diagnosis. |
| app/cli/interactive_shell/prompting/prompt_surface.py | Adds resolve_prompt_placeholder (dynamic ANSI placeholder) and wire_prompt_refresh (event-loop-safe prompt invalidation hook). _build_prompt_session now accepts and uses a session for the callable placeholder. |
| app/cli/interactive_shell/runtime/loop.py | Wires wire_prompt_refresh, passes session to _build_prompt_session, and adds prefilled/placeholder arguments to each prompt_async call. Flow is clean and consistent with the new session methods. |
| app/cli/interactive_shell/routing/handle_message_with_agent/orchestration/action_executor/synthetic_tasks.py | Removes _scenario_id_from_synthetic_suite_name and _try_bind_synthetic_observation; delegates to session.suggest_synthetic_failure_follow_up. Retains its own copy of _SYNTHETIC_SCENARIO_ID_RE (used by run_synthetic_test). |
| app/cli/interactive_shell/runtime/tasks.py | Adds running_count() with lock; straightforward and correct for the hot-path placeholder use. |
| tests/cli/interactive_shell/runtime/test_session.py | New tests cover scenario-ID extraction and follow-up prefill. test_suggest_synthetic_failure_follow_up_sets_pending may block up to ~480 ms if SYNTHETIC_SCENARIOS_DIR is importable but the observation file does not exist in the test environment. |
| tests/cli/interactive_shell/orchestration/test_action_executor.py | New test_start_background_cli_task_skips_follow_up_after_session_reset correctly verifies the history-generation guard by running the deferred watcher after session.clear(). |
| tests/cli/interactive_shell/prompting/test_prompt_surface.py | Comprehensive tests for all placeholder states and combinations. Clean and correct. |

Sequence Diagram

sequenceDiagram
    participant BG as Background Watcher Thread
    participant Session as ReplSession
    participant EventLoop as asyncio Event Loop
    participant PT as prompt_toolkit App

    Note over BG: Synthetic or CLI task exits non-zero
    BG->>Session: suggest_synthetic_failure_follow_up(label)
    Session->>Session: set pending_prompt_default
    Session->>Session: notify_prompt_changed (1st)
    Session-->>EventLoop: call_soon_threadsafe(_apply)
    Session->>Session: _bind_last_synthetic_observation (blocks up to 480ms)
    Session->>Session: notify_prompt_changed (2nd)
    Session-->>EventLoop: call_soon_threadsafe(_apply)
    EventLoop->>PT: _apply - set current_buffer.text
    EventLoop->>PT: pt_app.invalidate
    Note over EventLoop: Next prompt_async iteration
    EventLoop->>Session: take_pending_prompt_default
    Session-->>EventLoop: empty string (consumed by _apply)
    EventLoop->>PT: prompt_async with default and placeholder callable

Reviews (3): Last reviewed commit: "fix(cli): join output streams before rea..." | Re-trigger Greptile

Comment thread app/cli/interactive_shell/runtime/session.py
Comment thread app/cli/interactive_shell/runtime/session.py
…low-up logic

- Updated prompt session initialization to use the session parameter.
- Enhanced synthetic task follow-up to skip unnecessary notifications after session resets.
- Refined scenario ID extraction logic to ensure valid synthetic scenario IDs.
- Added tests to verify new behavior in background CLI task handling and scenario ID extraction.
@Devesh36

Collaborator Author

@greptile-apps review again

Mirror synthetic_tasks watcher: drain pump threads before read_diag so
failure diagnostics are not read from a still-writing SpooledTemporaryFile.

Co-authored-by: Cursor <cursoragent@cursor.com>
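
The ordering that commit message describes (join the pump thread, then read) can be sketched as follows. This is not the code in background_tasks.py: the pump and run_and_diagnose helpers are hypothetical, and the child process is a throwaway Python one-liner used only to produce stderr output and a non-zero exit code.

```python
import subprocess
import sys
import tempfile
import threading


def pump(stream, sink) -> None:
    """Copy a pipe into the spooled file until EOF."""
    for chunk in iter(lambda: stream.read(4096), b""):
        sink.write(chunk)


def run_and_diagnose(cmd):
    sink = tempfile.SpooledTemporaryFile(max_size=1 << 20)
    proc = subprocess.Popen(cmd, stdout=subprocess.DEVNULL, stderr=subprocess.PIPE)
    pump_thread = threading.Thread(target=pump, args=(proc.stderr, sink), daemon=True)
    pump_thread.start()

    code = proc.wait()
    # Drain first: the process exiting does not mean the pump thread has
    # finished copying buffered stderr into the file.
    pump_thread.join()
    proc.stderr.close()

    # Only now is it safe to rewind and read the diagnostics.
    sink.seek(0)
    diagnostics = sink.read().decode(errors="replace")
    sink.close()
    return code, diagnostics


code, diag = run_and_diagnose(
    [sys.executable, "-c", "import sys; sys.stderr.write('boom'); sys.exit(3)"]
)
```

Reading before the join could return a partial file, because the pump thread may still be writing while the reader seeks and reads.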
@Devesh36

Collaborator Author

@greptile-apps review again

@Davidson3556

Contributor

looks good to me

@Devesh36 Devesh36 merged commit 250f2fa into Tracer-Cloud:main Jun 12, 2026
14 checks passed
@github-actions

Contributor

🌊 Merged. @Devesh36 is now permanently woven into git history. No take-backs. 😄



@muddlebee

Collaborator

@Devesh36 did you also check this with /sessions? Are these changes captured when switching between sessions, and are tasks handled as well?

@Devesh36

Collaborator Author

> @Devesh36 did you also check this with /sessions? Are these changes captured when switching between sessions, and are tasks handled as well?

Yeah, tested /sessions, /new, and /resume with a background synthetic test, and everything works. No prefill leaks on session switch, tasks and the placeholder behave correctly, and a failed synthetic test prefills the prompt when you stay on the same session.
