Recurring execution stall: assistant confirms task but performs no actions

## Summary
Recurring operational failure (~1-2 times per month): the assistant confirms a requested action, but execution never starts (no tool calls, no artifacts). When asked later, it reports "no progress".

This is an execution-state bug (false "started"), not a model-quality complaint.

## Environment
- OpenClaw: `2026.3.7`
- Agent/session: `agent:main:main`
- Model path in this case: `openai-codex/gpt-5.3-codex`
- OS: Windows

## Reproduction (from real session log, detailed timeline)
Session file:
`C:\Users\karte\.openclaw\agents\main\sessions\859c8fac-93a6-41f2-af04-3b57519d76a4.jsonl`

Timeline (UTC from JSONL):
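1. `2026-03-09T05:00:18Z` assistant says "I'll check now..." (original: "Проверю сейчас...") in response to a quick check request.
2. `2026-03-09T05:07:50Z` user asks: `проведи глубокий ауди всех свои файлов` ("do a deep audit of all your files").
3. `2026-03-09T05:07:54Z` assistant replies: "Got it. I'll do a deep audit..." (original: "Принял. Проведу глубокий аудит...")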
4. Between `05:07:54Z` and `05:18:52Z` there is no execution evidence for that accepted audit action:
   - no `toolCall`
   - no `toolResult`
   - no artifact path
   - no run id / process id / state update
5. `2026-03-09T05:18:52Z` assistant reports: "No, there is no valid progress." (original: "Нет, валидного прогресса нет.")

Representative message ids in this segment:
- accepted-action reply: `72f8c694`
- no-progress reply: `ae45a24d`
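
The gap in the timeline above could be detected mechanically. The following is a minimal sketch, assuming each JSONL record is an object with `id`, `timestamp`, `role`, and `type` fields; that shape and the helper names (`load_records`, `find_stalls`) are hypothetical and may not match the real session schema.

```python
import json
from datetime import datetime

# Assumption: record types that count as execution evidence.
EXECUTION_TYPES = {"toolCall", "toolResult"}


def parse_ts(value):
    """Parse an ISO-8601 UTC timestamp such as 2026-03-09T05:07:54Z."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def load_records(path):
    """Read one JSON object per non-empty line (hypothetical session layout)."""
    with open(path, encoding="utf-8") as handle:
        return [json.loads(line) for line in handle if line.strip()]


def find_stalls(records, accepted_ids, min_gap_seconds=60.0):
    """Return (accepted_id, next_assistant_id, gap_seconds) for each accepted
    reply that is followed by another assistant message with no execution
    event in between."""
    stalls = []
    for index, record in enumerate(records):
        if record.get("id") not in accepted_ids:
            continue
        started = parse_ts(record["timestamp"])
        for later in records[index + 1:]:
            if later.get("type") in EXECUTION_TYPES:
                break  # execution evidence found, so this is not a stall
            if later.get("role") == "assistant":
                gap = (parse_ts(later["timestamp"]) - started).total_seconds()
                if gap >= min_gap_seconds:
                    stalls.append((record["id"], later.get("id"), gap))
                break
    return stalls
```

Called with the accepted-action id `72f8c694`, a check like this would flag the segment that ends at `ae45a24d`, provided the real records carry comparable fields.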

## Expected behavior
- If execution cannot start for any reason, assistant should fail fast and return `blocked/error` immediately.
- It should not acknowledge "started/doing" unless execution actually started.
- Runtime state transition should be deterministic (`accepted -> running` only after real action event).
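
As an illustration of the transition rule above, here is a minimal sketch of a state holder that only moves from `accepted` to `running` on a real action event. The class, method, and event names are hypothetical and are not taken from OpenClaw's code.

```python
from enum import Enum


class RunState(Enum):
    ACCEPTED = "accepted"
    RUNNING = "running"
    BLOCKED = "blocked"


# Assumption: event types that count as a "real action event".
ACTION_EVENTS = {"toolCall", "processStart", "stateUpdate"}


class TaskRun:
    """Tracks one accepted task; chat replies alone never change its state."""

    def __init__(self):
        self.state = RunState.ACCEPTED
        self.reason = None

    def on_event(self, event_type):
        # Only a real action event promotes accepted -> running.
        if self.state is RunState.ACCEPTED and event_type in ACTION_EVENTS:
            self.state = RunState.RUNNING
        return self.state

    def fail_fast(self, reason):
        # Execution could not start: report blocked right away, with a reason.
        if self.state is RunState.ACCEPTED:
            self.state = RunState.BLOCKED
            self.reason = reason
        return self.state

    def may_claim_started(self):
        # Gate for "started/doing" replies.
        return self.state is RunState.RUNNING
```

With a gate like `may_claim_started()`, the reply layer could refuse to emit a "started" message while the run is still only `accepted`.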

## Actual behavior
- Assistant acknowledges execution in natural language.
- No action starts.
- The user has to poll manually; the assistant later admits there is no progress.

## Why this matters
This creates hidden idle periods and breaks orchestration trust: chat state diverges from runtime truth.

## Suggested fix direction
1. Enforce execution gate: block "started/doing" replies unless there is a fresh artifact (`toolCall`, process start, `runId`, state file update).
2. Add watchdog: if no action event appears within N seconds after acceptance, auto-convert to `blocked/error` with concrete reason.
3. Expose the accepted/running/blocked state clearly in telemetry and the state machine so the UI/chat cannot drift from the runtime.
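
The watchdog in item 2 could look roughly like the sketch below. It assumes a polling design with an injectable clock; `AcceptanceWatchdog` and its methods are hypothetical names, and the N-second timeout is a parameter the maintainers would choose.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class AcceptanceWatchdog:
    """Converts a silent 'accepted' task into blocked/error after a timeout."""

    timeout_seconds: float
    clock: Callable[[], float] = time.monotonic  # injectable for tests
    accepted_at: Optional[float] = None
    last_action_at: Optional[float] = None

    def accept(self):
        # Called when the assistant acknowledges a task.
        self.accepted_at = self.clock()
        self.last_action_at = None

    def record_action(self):
        # Called on any real action event (tool call, process start, ...).
        self.last_action_at = self.clock()

    def check(self) -> Tuple[str, Optional[str]]:
        # Returns (state, reason); meant to be polled by the runtime.
        if self.accepted_at is None:
            return ("idle", None)
        if self.last_action_at is not None:
            return ("running", None)
        elapsed = self.clock() - self.accepted_at
        if elapsed >= self.timeout_seconds:
            reason = f"no action event within {self.timeout_seconds:g}s of acceptance"
            return ("blocked/error", reason)
        return ("accepted", None)
```

Polling `check()` from the runtime loop would surface the stall after N seconds instead of waiting for the user to ask.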

I can provide additional traces from prior dates with the same pattern if needed.

