Fix task completion detection to require explicit completion signal by subsy · Pull Request #268 · subsy/ralph-tui

subsy · 2026-02-03T22:35:00Z

Summary

Fixed a bug where tasks were incorrectly marked as completed based solely on exit code 0, rather than requiring an explicit completion signal from the agent.

Changes

- Modified task completion logic in ExecutionEngine to only recognize the explicit <promise>COMPLETE</promise> signal as indicating task completion
- Removed reliance on agentResult.status === 'completed' as a completion indicator
- Added clarifying comments explaining that exit code 0 alone does not indicate task completion, as agents may exit cleanly after asking clarification questions or encountering blockers

Details

Previously, the engine would consider a task complete if either:

- The agent output contained <promise>COMPLETE</promise>, OR
- The agent exited with status code 0

This caused false positives: an agent that exited cleanly (status 0) while still waiting for user input, or after hitting a blocker, was incorrectly marked as complete.

Now only the explicit <promise>COMPLETE</promise> signal is recognized as a valid completion indicator, so a task is marked complete only when the agent says so in its output.
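
The before/after behaviour can be sketched roughly as follows. This is a minimal illustration, not the code from src/engine/index.ts: the `AgentResult` shape, the two function names, and the exact regular expression are assumptions (the PR names `PROMISE_COMPLETE_PATTERN` but this page does not show its definition).

```typescript
// Hypothetical result shape; the real type in ralph-tui may differ.
interface AgentResult {
  status: 'completed' | 'failed' | 'interrupted';
  exitCode: number;
  stdout: string;
}

// Assumed pattern: only the constant's name appears in the PR, not its definition.
const PROMISE_COMPLETE_PATTERN = /<promise>\s*COMPLETE\s*<\/promise>/i;

// Old behaviour as described above: the signal OR a clean exit counted as completion.
function wasTaskCompletedBefore(result: AgentResult): boolean {
  return PROMISE_COMPLETE_PATTERN.test(result.stdout) || result.status === 'completed';
}

// New behaviour: only the explicit signal in the agent output counts.
function isTaskCompleted(result: AgentResult): boolean {
  return PROMISE_COMPLETE_PATTERN.test(result.stdout);
}

// An agent that asks a clarification question and then exits cleanly.
const askedQuestion: AgentResult = {
  status: 'completed',
  exitCode: 0,
  stdout: 'Which database should I target?',
};

// An agent that finishes the work and emits the explicit signal.
const finished: AgentResult = {
  status: 'completed',
  exitCode: 0,
  stdout: 'All tests pass.\n<promise>COMPLETE</promise>',
};

console.log(wasTaskCompletedBefore(askedQuestion)); // the false positive from issue #259
console.log(isTaskCompleted(askedQuestion));
console.log(isTaskCompleted(finished));
```

Under these assumptions, the clean exit without a signal is the only case where the old and new checks disagree.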

Fixes #259

https://claude.ai/code/session_019UgvbDYXAw19kEShx3c5PC

Summary by CodeRabbit

Chores
- Package version updated to 0.7.0
Bug Fixes
- Task completion is now recognised only via explicit completion signals in agent output, improving accuracy of task tracking and iteration status.
Tests
- Added comprehensive tests covering completion detection across varied outputs and edge cases.
New Features
- Added support and tests for a new built‑in template type (beads-rust).

Tasks were being marked complete when agents exited with code 0, even if no actual work was done. This caused issues when agents asked clarification questions and then exited cleanly without receiving answers.

Now only the explicit <promise>COMPLETE</promise> signal marks a task as complete. Exit code 0 alone is not sufficient, as it just means the process ended normally - not that the task was actually finished.

Fixes #259

https://claude.ai/code/session_019UgvbDYXAw19kEShx3c5PC

…sue-259-zrJBc

vercel · 2026-02-03T22:35:05Z

The latest updates on your projects. Learn more about Vercel for GitHub.

1 Skipped Deployment

Project	Deployment	Actions	Updated (UTC)
ralph-tui	Ignored	Preview	Feb 3, 2026 10:44pm

coderabbitai · 2026-02-03T22:35:16Z

Walkthrough

Version bumped to 0.7.0. Task completion detection now requires an explicit <promise>COMPLETE</promise> tag in agent stdout; completion is no longer inferred from the agent's exit status. New tests and a template export were added.

Changes

Cohort / File(s)	Summary
Version Update `package.json`	Bumped package version from `0.6.1` to `0.7.0`.
Task Completion Logic `src/engine/index.ts`	Changed completion detection: require explicit `<promise>COMPLETE</promise>` in stdout; removed fallback that treated `status === 'completed'` as completion.
Completion Detection Tests `src/engine/completion-detection.test.ts`	Added comprehensive test suite validating the PROMISE_COMPLETE_PATTERN and many edge cases for completion detection, ensuring only explicit signals mark tasks complete.
Template tests / export `tests/templates/engine.test.ts`, `src/templates/builtin.*`	Added tests for new built-in template `beads-rust` and exported `BEADS_RUST_TEMPLATE` used by those tests.

Sequence Diagram(s)

sequenceDiagram
    participant Agent as Agent
    participant Engine as Engine
    participant Tracker as Tracker
    participant VCS as VCS
    Agent->>Engine: stdout (may include <promise>COMPLETE</promise>)
    alt stdout contains explicit <promise>COMPLETE</promise>
        Engine->>Tracker: mark task COMPLETE (update metadata)
        Engine->>VCS: commit metadata + outputs
    else no explicit completion signal
        Engine->>Tracker: mark task INCOMPLETE (record status/notes)
        Engine->>VCS: commit metadata-only (if any)
    end
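
The two branches of the diagram could be expressed along these lines. This is only a sketch: the `Tracker` and `Vcs` interfaces, `handleAgentStdout`, and the `COMPLETION_SIGNAL` regex are hypothetical names chosen to mirror the diagram's participants, not the project's actual API. Unlike the diagram, which commits metadata only "if any", the sketch always commits in the second branch to keep it short.

```typescript
// Hypothetical interfaces named after the diagram's participants.
interface Tracker {
  markComplete(taskId: string): void;
  markIncomplete(taskId: string, note: string): void;
}

interface Vcs {
  commit(message: string, includeOutputs: boolean): void;
}

// Assumed signal pattern; the real definition is not shown on this page.
const COMPLETION_SIGNAL = /<promise>\s*COMPLETE\s*<\/promise>/i;

// Returns true when the task was marked complete.
function handleAgentStdout(taskId: string, stdout: string, tracker: Tracker, vcs: Vcs): boolean {
  if (COMPLETION_SIGNAL.test(stdout)) {
    // Explicit signal present: mark complete and commit metadata plus outputs.
    tracker.markComplete(taskId);
    vcs.commit(`complete ${taskId}`, true);
    return true;
  }
  // No explicit signal: keep the task open and commit metadata only.
  tracker.markIncomplete(taskId, 'no explicit completion signal in stdout');
  vcs.commit(`update metadata for ${taskId}`, false);
  return false;
}

// In-memory fakes that record what the engine asked for.
const events: string[] = [];
const tracker: Tracker = {
  markComplete: (id) => { events.push(`complete:${id}`); },
  markIncomplete: (id) => { events.push(`incomplete:${id}`); },
};
const vcs: Vcs = {
  commit: (_message, includeOutputs) => {
    events.push(includeOutputs ? 'commit:full' : 'commit:metadata');
  },
};

handleAgentStdout('task-1', 'Done.\n<promise>COMPLETE</promise>', tracker, vcs);
handleAgentStdout('task-2', 'I need more details before I can continue.', tracker, vcs);
console.log(events);
```

The exit status never reaches the function, so a clean exit without the signal always falls through to the incomplete branch.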

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

Embed tracker templates to fix bundled environment path resolution #267: Adds/exports BEADS_RUST_TEMPLATE and updates template resolution paths; closely related to the template export and tests in this change.

🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 50.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.
Out of Scope Changes check	❓ Inconclusive	The PR includes version bump, core completion detection fix, comprehensive tests, and a new BEADS_RUST_TEMPLATE export unrelated to issue `#259`.	Clarify whether the BEADS_RUST_TEMPLATE export and version bump are intentional scope changes or should be separated into a different PR.

✅ Passed checks (3 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The PR title clearly and specifically describes the main change: fixing task completion detection to require an explicit completion signal.
Linked Issues check	✅ Passed	The PR addresses issue `#259` by modifying completion logic to only accept explicit COMPLETE signals, removing reliance on exit status code 0, which aligns with the desired objective of marking tasks complete only when agents explicitly indicate completion.

codecov · 2026-02-03T22:36:17Z

Codecov Report

❌ Patch coverage is 20.00000% with 4 lines in your changes missing coverage. Please review.
✅ Project coverage is 43.68%. Comparing base (ef3ace0) to head (799ef35).
⚠️ Report is 5 commits behind head on main.

Files with missing lines	Patch %	Lines
src/engine/index.ts	20.00%	4 Missing ⚠️

❌ Your patch status has failed because the patch coverage (20.00%) is below the target coverage (50.00%). You can increase the patch coverage or adjust the target coverage.

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #268      +/-   ##
==========================================
- Coverage   43.69%   43.68%   -0.01%     
==========================================
  Files          94       94              
  Lines       29009    29012       +3     
==========================================
- Hits        12676    12675       -1     
- Misses      16333    16337       +4

Files with missing lines	Coverage Δ
src/engine/index.ts	`48.55% <20.00%> (-0.15%)`	⬇️

- Add completion-detection.test.ts with 24 tests covering:
  - PROMISE_COMPLETE_PATTERN matching behavior
  - Task completion logic (only explicit signal marks complete)
  - Edge cases from issue #259 (metadata-only updates, looping)
- Add tests for BEADS_RUST_TEMPLATE in template engine tests:
  - getBuiltinTemplate returns correct template
  - getTemplateTypeFromPlugin maps beads-rust correctly
  - getTemplateFilename returns beads-rust.hbs
  - loadTemplate loads beads-rust tracker template

https://claude.ai/code/session_019UgvbDYXAw19kEShx3c5PC

…zrJBc Fix task completion detection to require explicit completion signal

claude added 2 commits February 3, 2026 16:17

Merge remote-tracking branch 'origin/main' into claude/investigate-is…

2adb798

…sue-259-zrJBc

claude added 2 commits February 3, 2026 22:36

chore: bump version to 0.7.0

443ba7a

https://claude.ai/code/session_019UgvbDYXAw19kEShx3c5PC

subsy merged commit 9c18a38 into main Feb 3, 2026
8 of 9 checks passed

subsy deleted the claude/investigate-issue-259-zrJBc branch February 3, 2026 22:52

sakaman pushed a commit to sakaman/ralph-tui that referenced this pull request Feb 15, 2026

Merge pull request subsy#268 from subsy/claude/investigate-issue-259-…

845e1f6

…zrJBc Fix task completion detection to require explicit completion signal

coderabbitai bot mentioned this pull request Feb 23, 2026

feat(engine): add verification, cost tracking, and smarter completion #331

Open

7 tasks

coderabbitai bot mentioned this pull request Mar 17, 2026

Auto-refresh TUI task list after task completion #361

Open

5 tasks

subsy merged 4 commits into main from claude/investigate-issue-259-zrJBc
