feat(lifecycle): inbound turn tracking, orphan recovery, and abort coordination#29956
nohat wants to merge 16 commits into openclaw:main
Replace unbounded file-based delivery queue with queryable SQLite message_outbox table. Adds TTL/expiry for stale entries, delivery outcome retention, and one-time legacy file queue import on startup. Closes openclaw#23777, openclaw#16555, openclaw#29128
…pat layer Write-ahead delivery pattern: enqueue outbox entry before sending, ack on success, retry on failure. Continuous outbox worker replaces one-shot recovery. Plugin channels get durable delivery guarantees via v1/v2 adapter compat layer. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…low v2 sendFinal-only plugin adapters (Codex P1+P2 openclaw#29148)
…l and separate ackDelivery errors in recovery
Every inbound message creates a durable turn record in message_turns. Turn worker detects orphaned turns (accepted but never completed after crash) and recovers them. Abort commands mark turns as aborted, preventing re-delivery. Outbox entries are linked to turns for coordinated finalization. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…d premature finalization (Codex P1+P2 openclaw#29149)
…-open on delivery stats
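The write-ahead delivery pattern described in the commits above (enqueue before sending, ack on success, retry on failure, expire stale entries) can be sketched roughly as follows. This is a minimal in-memory illustration, not the PR's SQLite code: the `Outbox` class, `workerPass`, the `delivered` and `expired` status names, and the backoff constants are all assumptions made for the example. Only `queued`, `failed_retryable`, and `failed_terminal` appear in the PR itself.

```typescript
// Hypothetical in-memory stand-in for the message_outbox table.
type OutboxStatus =
  | "queued"
  | "delivered" // assumed name for an acked row
  | "failed_retryable"
  | "failed_terminal"
  | "expired"; // assumed name for a row dropped by TTL

interface OutboxRow {
  id: number;
  payload: string;
  status: OutboxStatus;
  attemptCount: number;
  queuedAt: number;
  nextAttemptAt: number;
}

const BASE_BACKOFF_MS = 1000; // assumed value, for illustration only

class Outbox {
  private rows: OutboxRow[] = [];
  private nextId = 1;
  private ttlMs: number;
  private maxAttempts: number;

  constructor(ttlMs: number, maxAttempts: number) {
    this.ttlMs = ttlMs;
    this.maxAttempts = maxAttempts;
  }

  // Write-ahead step: the row exists before any send is attempted.
  enqueue(payload: string, now: number): OutboxRow {
    const row: OutboxRow = {
      id: this.nextId++,
      payload,
      status: "queued",
      attemptCount: 0,
      queuedAt: now,
      nextAttemptAt: now,
    };
    this.rows.push(row);
    return row;
  }

  // Ack on success.
  ack(row: OutboxRow): void {
    row.status = "delivered";
  }

  // Record a failed attempt; retry with exponential backoff until maxAttempts.
  fail(row: OutboxRow, now: number): void {
    row.attemptCount += 1;
    if (row.attemptCount >= this.maxAttempts) {
      row.status = "failed_terminal";
      return;
    }
    row.status = "failed_retryable";
    row.nextAttemptAt = now + BASE_BACKOFF_MS * 2 ** (row.attemptCount - 1);
  }

  // Expire pending rows older than the TTL, then return the rows that are due.
  due(now: number): OutboxRow[] {
    const isPending = (r: OutboxRow) =>
      r.status === "queued" || r.status === "failed_retryable";
    for (const row of this.rows) {
      if (isPending(row) && now - row.queuedAt > this.ttlMs) {
        row.status = "expired";
      }
    }
    return this.rows.filter((r) => isPending(r) && r.nextAttemptAt <= now);
  }
}

// One pass of a continuous worker: try every due row, then ack or fail it.
function workerPass(
  outbox: Outbox,
  send: (payload: string) => boolean,
  now: number,
): void {
  for (const row of outbox.due(now)) {
    if (send(row.payload)) {
      outbox.ack(row);
    } else {
      outbox.fail(row, now);
    }
  }
}
```

Because the row is written before the send, a crash between the two steps leaves a `queued` row that a later worker pass can pick up, which is the property the one-shot recovery path could not provide on its own.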
Closing this PR because it looks dirty (too many unrelated or unexpected changes). This usually happens when a branch picks up unrelated commits or a merge went sideways. Please recreate the PR from a clean branch.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: e5e4c0de5b
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
    SET status='failed_terminal',
        error_class='terminal',
        terminal_reason='non_final_recovery_skip',
        completed_at=?
Avoid terminalizing skipped non-final recovery entries
recoverPendingDeliveries marks queued tool/block rows as failed_terminal, but the turn worker later treats any failed outbox row as a hard turn failure (outbox.failed > 0 in runTurnPass) instead of replaying the turn. In a crash where a non-final row is persisted but the final reply was never sent, this causes the whole turn to be finalized as failed and the user never gets the final response.
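One way to picture the concern is a turn-finalization decision that does not count rows skipped with `non_final_recovery_skip` as hard failures. The sketch below is only an illustration of that idea under stated assumptions: `decideTurnOutcome`, the `kind` field, the `delivered` status, and the four decision labels are hypothetical and are not taken from `runTurnPass`.

```typescript
// Hypothetical shapes; only the status and terminal_reason strings come from the diff.
interface OutboxEntry {
  kind: "tool" | "block" | "final"; // assumed discriminator for non-final vs final rows
  status: "queued" | "delivered" | "failed_retryable" | "failed_terminal";
  terminalReason?: string;
}

type TurnDecision = "complete" | "fail" | "replay" | "wait";

function decideTurnOutcome(entries: OutboxEntry[]): TurnDecision {
  // Rows terminalized only because recovery skipped them are not real delivery failures.
  const hardFailures = entries.filter(
    (e) =>
      e.status === "failed_terminal" &&
      e.terminalReason !== "non_final_recovery_skip",
  );
  if (hardFailures.length > 0) return "fail";

  // Anything still pending means the outbox worker has more to do.
  const pending = entries.some(
    (e) => e.status === "queued" || e.status === "failed_retryable",
  );
  if (pending) return "wait";

  // No final reply was ever delivered: replay the turn instead of failing it.
  const finalDelivered = entries.some(
    (e) => e.kind === "final" && e.status === "delivered",
  );
  return finalDelivered ? "complete" : "replay";
}
```

In the crash scenario from the comment (a skipped non-final row and no final reply), this decision yields `replay` rather than `fail`, so the user would still get a final response.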
    FROM message_outbox
    WHERE status IN ('queued', 'failed_retryable')
      AND next_attempt_at <= ?
      AND (queued_at < ? OR last_attempt_at IS NOT NULL OR attempt_count > 0)
Recover fresh queued rows when direct-path ack bookkeeping fails
The startup-cutoff filter excludes rows enqueued after startup unless they already have retry metadata, but ackDelivery logs and swallows DB update errors; if that update fails, a successfully sent row can remain queued with attempt_count=0 and last_attempt_at=NULL forever. In that state this predicate keeps skipping the row on every worker pass, so the outbox/turn state never converges until stale-turn cleanup.
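The stuck-row case can be reproduced by restating the predicate as a plain function. In the sketch below, `isPickedUp` mirrors the `WHERE` clause quoted above, while `isPickedUpWithGrace` and its `graceMs` parameter are a hypothetical variant added only to illustrate one possible way out; neither name comes from the PR.

```typescript
// Field names are camelCase stand-ins for the message_outbox columns.
interface Row {
  status: string;
  nextAttemptAt: number;
  queuedAt: number;
  lastAttemptAt: number | null;
  attemptCount: number;
}

function isPending(row: Row): boolean {
  return row.status === "queued" || row.status === "failed_retryable";
}

// Mirrors the quoted predicate: fresh rows are skipped unless they carry retry metadata.
function isPickedUp(row: Row, now: number, startupCutoff: number): boolean {
  return (
    isPending(row) &&
    row.nextAttemptAt <= now &&
    (row.queuedAt < startupCutoff ||
      row.lastAttemptAt !== null ||
      row.attemptCount > 0)
  );
}

// Hypothetical variant: also pick up fresh rows once they have sat for graceMs.
function isPickedUpWithGrace(
  row: Row,
  now: number,
  startupCutoff: number,
  graceMs: number,
): boolean {
  if (isPickedUp(row, now, startupCutoff)) return true;
  return (
    isPending(row) && row.nextAttemptAt <= now && now - row.queuedAt >= graceMs
  );
}

// A row sent after startup whose ack update failed: still queued, no retry metadata.
const stuck: Row = {
  status: "queued",
  nextAttemptAt: 2000,
  queuedAt: 2000,
  lastAttemptAt: null,
  attemptCount: 0,
};
const startupCutoff = 1000;

console.log(isPickedUp(stuck, 10_000_000, startupCutoff)); // false, however long we wait
console.log(isPickedUpWithGrace(stuck, 62_000, startupCutoff, 60_000)); // true
```

Under the variant, a row that was in fact delivered could be sent a second time, so it presumes that some form of deduplication exists downstream.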
Greptile Summary
This PR adds comprehensive durable turn tracking and orphan recovery infrastructure to prevent message loss after gateway crashes. The implementation migrates from file-based to SQLite-backed queuing and introduces coordinated turn lifecycle management.
Key changes:
Design strengths:
Test coverage: Tests updated for new dispatch behavior, outbox integration, and delivery stats tracking
Confidence Score: 4/5
Last reviewed commit: e5e4c0d
Summary
Stack: merge after #29953 (the diff will be correct once #29953 is merged; until then this shows the combined diff)
Adds durable inbound turn tracking so the gateway can detect and recover orphaned turns (e.g., after a crash mid-stream).
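A rough model of the turn lifecycle may help when reading the rest of this description. The sketch below is an in-memory illustration under stated assumptions, not the `message_turns` implementation: `TurnStore`, `findOrphanedTurns`, and the rule that an orphan is any turn accepted before the current process started are all hypothetical choices made for the example.

```typescript
type TurnStatus = "accepted" | "completed" | "aborted";

interface TurnRecord {
  id: string;
  status: TurnStatus;
  acceptedAt: number;
}

// Hypothetical in-memory stand-in for the message_turns table.
class TurnStore {
  private turns: TurnRecord[] = [];

  // Every inbound message gets a record; repeated ids are ignored (dedup).
  accept(id: string, now: number): void {
    if (this.turns.some((t) => t.id === id)) return;
    this.turns.push({ id, status: "accepted", acceptedAt: now });
  }

  complete(id: string): void {
    this.transition(id, "completed");
  }

  // An aborted turn is never returned as recoverable, so it is not re-delivered.
  abort(id: string): void {
    this.transition(id, "aborted");
  }

  // Assumed orphan rule: still "accepted" and accepted before this process started.
  findOrphanedTurns(processStartedAt: number): TurnRecord[] {
    return this.turns.filter(
      (t) => t.status === "accepted" && t.acceptedAt < processStartedAt,
    );
  }

  // Only open turns can change state; completed and aborted are final.
  private transition(id: string, next: TurnStatus): void {
    const turn = this.turns.find((t) => t.id === id);
    if (turn !== undefined && turn.status === "accepted") {
      turn.status = next;
    }
  }
}
```

With three turns accepted before a crash, one completed and one aborted, only the third would be reported as orphaned after restart, and a turn accepted by the new process would be left alone.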
Key differences from old #29149:
- `minAgeMs` parameter removed from `listRecoverableTurns`
Carries forward:
Closes #26764, #29124, #29125, #29127
Related: #28941
Test plan
- `pnpm build` passes
- `pnpm test` passes (turn tracking, recovery, dedup tests)
🤖 Generated with Claude Code