Prevent fake-progress messages with a runtime pre-send policy layer #50690

@addelh

Summary

OpenClaw currently allows low-value fake-progress messages to reach end users before meaningful execution is visible.

Examples:

  • "on it"
  • "proceeding"
  • "working on it"
  • "checking now"
  • "starting now"
  • "I’m choosing/picking the next slice now"

In practice this creates a family of failures:

  1. ack without execution
  2. duplicate ack leakage in one turn
  3. adjacent-turn ack spam
  4. post-start silence
  5. commentary leaking into user-visible delivery
  6. burst delivery of multiple low-value updates

I’ve been debugging this locally in a real OpenClaw deployment and have reproduced all of the above in direct chats and Telegram topics. The current prompt/workspace/watchdog mitigations help detect these failures, but they do not prevent them at the engine/runtime level.

This issue proposes a runtime-enforced pre-send policy layer that blocks fake-progress messages unless they are backed by real execution state or meaningful progress.

Why this matters

Right now the system can still do things like:

  • send "On it" before a real execution path exists
  • emit commentary like "I’m on it..." plus a final user-facing "On it..." in the same turn
  • send another near-duplicate acknowledgement a minute later with no progress in between
  • start a run, then go quiet without surfacing a blocker/progress/completion update
  • generate multiple progress-style messages at different internal timestamps but flush them together later

That breaks user trust fast. The bug is not just wording. It is that the runtime currently has no strong outbound policy gate tied to execution state.

Desired outcome

Before any assistant text is delivered to a user-facing channel, OpenClaw should be able to answer:

  • what type of message is this?
  • is this message allowed to be user-visible?
  • if it is an acknowledgement, is it backed by a real execution path?
  • if a previous acknowledgement already happened, has there been meaningful progress/blocker/completion since then?
  • is this commentary leaking into a user-facing surface?

If a message fails any of these checks, the runtime should block or suppress it before it is delivered.

Proposal

1. Add an outbound message interception layer

Add a pre-send policy hook in the outbound delivery pipeline.

This hook should have access to:

  • candidate outbound text
  • message role/type
  • session id / thread id / topic id
  • recent session history
  • recent tool calls
  • execution state
  • task/state metadata if available
  • channel/surface info

This must be runtime-enforced, not prompt-only.
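
As a rough illustration of the hook described above, here is a minimal TypeScript sketch. Every name in it (`PreSendContext`, `PreSendPolicy`, `runPreSendPolicies`, `deliver`) is hypothetical and not taken from the OpenClaw codebase; the context carries only a subset of the fields listed above, and `send` stands in for whatever the real transport call is.

```typescript
// Hypothetical shapes for illustration; none of these names exist in OpenClaw.
type OutboundKind =
  | "ack"
  | "progress"
  | "blocker"
  | "completion"
  | "informational"
  | "commentary_internal";

interface PreSendContext {
  text: string;              // candidate outbound text
  kind: OutboundKind;        // message role/type
  sessionId: string;
  threadId?: string;
  channel: string;           // channel/surface info, e.g. "telegram"
  recentToolCalls: string[]; // tool names called recently in this session
  executionId?: string;      // present only when execution state exists
}

type PreSendDecision =
  | { action: "allow" }
  | { action: "block"; rule: string };

type PreSendPolicy = (ctx: PreSendContext) => PreSendDecision;

// Run every policy in order; the first "block" decision wins.
function runPreSendPolicies(
  ctx: PreSendContext,
  policies: PreSendPolicy[],
): PreSendDecision {
  for (const policy of policies) {
    const decision = policy(ctx);
    if (decision.action === "block") return decision;
  }
  return { action: "allow" };
}

// Deliver only when the policy chain allows it.
function deliver(
  ctx: PreSendContext,
  policies: PreSendPolicy[],
  send: (text: string) => void,
): PreSendDecision {
  const decision = runPreSendPolicies(ctx, policies);
  if (decision.action === "allow") send(ctx.text);
  return decision;
}
```

The point of the sketch is the placement: the decision is made in the delivery path itself, so a model that ignores its prompt still cannot get a blocked message out.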

2. Classify outbound assistant messages into typed categories

At minimum:

  • ack
  • progress
  • blocker
  • completion
  • informational
  • commentary_internal

Key rule:

  • commentary_internal should not be user-deliverable by default.
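
The categories and the key rule could be expressed roughly as follows. This is a sketch under assumptions: the default-deliverability table and the `promoted` override are hypothetical, and how a message actually gets classified is left out entirely.

```typescript
// The six categories from the list above, as a closed union.
const MESSAGE_KINDS = [
  "ack",
  "progress",
  "blocker",
  "completion",
  "informational",
  "commentary_internal",
] as const;

type MessageKind = (typeof MESSAGE_KINDS)[number];

// Assumed defaults: everything except internal commentary may reach users,
// still subject to the other checks (for example, ack needs execution state).
const USER_DELIVERABLE_BY_DEFAULT: Record<MessageKind, boolean> = {
  ack: true,
  progress: true,
  blocker: true,
  completion: true,
  informational: true,
  commentary_internal: false,
};

// `promoted` models an explicit policy decision to surface a kind that is
// hidden by default; it is a hypothetical mechanism.
function isUserDeliverable(
  kind: MessageKind,
  promoted: ReadonlySet<MessageKind> = new Set<MessageKind>(),
): boolean {
  return USER_DELIVERABLE_BY_DEFAULT[kind] || promoted.has(kind);
}
```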

3. Block ack-like messages unless valid execution state exists

For any outbound message classified as ack, require something like:

  • execution_path
  • execution_id
  • started_at or other fresh execution timestamp

Valid execution paths include:

  • live local exec job
  • spawned subagent / ACP session
  • scheduled follow-up
  • direct bounded work already started in the same turn and represented in runtime state

If these are missing, block the send.
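
A minimal sketch of that check, assuming execution state is available as a small record. The type names, the `missing` reason strings, and the 60-second freshness default are all assumptions made for illustration.

```typescript
// Hypothetical labels for the valid execution paths listed above.
type ExecutionPath =
  | "local_exec"
  | "subagent"
  | "acp_session"
  | "scheduled_followup"
  | "same_turn_work";

interface ExecutionRef {
  executionPath: ExecutionPath;
  executionId: string;
  startedAt: number; // epoch milliseconds
}

type AckCheck = { ok: true } | { ok: false; missing: string };

// An ack passes only if path, id, and a fresh timestamp are all present.
function checkAck(
  exec: Partial<ExecutionRef> | undefined,
  now: number,
  maxAgeMs: number = 60000, // assumed freshness window
): AckCheck {
  if (!exec) return { ok: false, missing: "execution_state" };
  if (!exec.executionPath) return { ok: false, missing: "execution_path" };
  if (!exec.executionId) return { ok: false, missing: "execution_id" };
  if (exec.startedAt === undefined) return { ok: false, missing: "started_at" };
  if (now - exec.startedAt > maxAgeMs) return { ok: false, missing: "fresh_started_at" };
  return { ok: true };
}
```

Returning the missing prerequisite, rather than a bare boolean, gives the logging layer something concrete to record for each blocked send.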

4. Prevent commentary leakage

Commentary such as:

  • "I’m on it"
  • "proceeding"
  • "starting now"
  • "checking now"
  • "I’m choosing the next slice now"

should never be deliverable to user-facing channels unless intentionally promoted by policy.

This is especially important because one real repro produced both:

  • commentary: I’m on it. I’m choosing the next bounded Hannibal slice now...
  • final answer: On it. I’m picking up the next bounded Hannibal slice now.

That created two user-visible fake-progress messages from one turn.

5. Suppress repeated acknowledgements without intervening progress

Maintain per-session short-window state such as:

  • last ack timestamp
  • last progress timestamp
  • last blocker timestamp
  • last completion timestamp

Rule:

  • if another ack-like message is attempted within a configurable short window
  • and no meaningful progress/blocker/completion happened since the last ack
  • reject it

This should prevent:

  • adjacent-turn ack spam
  • repeated "on it" variants
  • low-value acknowledgement loops
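
The rule above reduces to a small pure function over the per-session timestamps. The sketch below assumes epoch-millisecond timestamps and a caller-supplied window; the names are hypothetical.

```typescript
// Per-session short-window state; undefined means "never happened".
interface AckWindowState {
  lastAckAt?: number;
  lastProgressAt?: number;
  lastBlockerAt?: number;
  lastCompletionAt?: number;
}

// True when a new ack-like message should be rejected as a repeat.
function shouldRejectRepeatAck(
  state: AckWindowState,
  now: number,
  windowMs: number,
): boolean {
  if (state.lastAckAt === undefined) return false; // first ack in this session
  const withinWindow = now - state.lastAckAt <= windowMs;
  const lastMeaningfulAt = Math.max(
    state.lastProgressAt ?? -Infinity,
    state.lastBlockerAt ?? -Infinity,
    state.lastCompletionAt ?? -Infinity,
  );
  const progressedSinceAck = lastMeaningfulAt > state.lastAckAt;
  return withinWindow && !progressedSinceAck;
}
```

For example, with a five-minute window, a second ack two minutes after the first is rejected unless a progress, blocker, or completion timestamp falls between them.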

6. Add progress-debt enforcement after an allowed ack

If an ack is allowed and execution has genuinely started, create a runtime obligation:

  • within X minutes, produce either:
    • progress
    • blocker
    • completion

If the deadline expires:

  • raise a runtime fault
  • optionally auto-send blocker/fault
  • suppress further ack-like messages from that session until resolved

This addresses "real start followed by silence".
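
One way to model the obligation is a tracker keyed by session, swept periodically. This is only a sketch: `ProgressDebtTracker` and its methods are hypothetical, the timeout is supplied by the caller, and what "raise a runtime fault" means in practice is left to whoever consumes the result of `sweep`.

```typescript
interface ProgressDebt {
  executionId: string;
  deadlineAt: number; // epoch milliseconds
  faulted: boolean;
}

class ProgressDebtTracker {
  private readonly debts = new Map<string, ProgressDebt>();
  private readonly timeoutMs: number;

  constructor(timeoutMs: number) {
    this.timeoutMs = timeoutMs;
  }

  // Call when an ack passed policy and execution has genuinely started.
  onAckAllowed(sessionId: string, executionId: string, now: number): void {
    this.debts.set(sessionId, {
      executionId,
      deadlineAt: now + this.timeoutMs,
      faulted: false,
    });
  }

  // Progress, blocker, or completion settles the debt and lifts suppression.
  onMeaningfulUpdate(sessionId: string): void {
    this.debts.delete(sessionId);
  }

  // Marks expired debts as faulted and returns the newly faulted session ids.
  sweep(now: number): string[] {
    const newlyFaulted: string[] = [];
    this.debts.forEach((debt, sessionId) => {
      if (!debt.faulted && now > debt.deadlineAt) {
        debt.faulted = true;
        newlyFaulted.push(sessionId);
      }
    });
    return newlyFaulted;
  }

  // Further ack-like messages stay suppressed while a fault is unresolved.
  isAckSuppressed(sessionId: string): boolean {
    return this.debts.get(sessionId)?.faulted === true;
  }
}
```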

7. Enforce atomic ordering for start + ack

Required ordering:

  1. create execution state
  2. record execution id/path
  3. only then allow ack-like output

Do not allow:

  • generating ack text first
  • establishing execution state later, or not at all

That ordering mistake is the root cause of many of these failures.
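
The required ordering can be made structural: the ack function takes the registry as input and refuses to send when no record exists. The sketch below is an assumption about shape only; `ExecutionRegistry`, `sendAckIfStarted`, and the `exec-N` id format are invented for illustration.

```typescript
interface StartedExecution {
  executionId: string;
  executionPath: string;
  startedAt: number; // epoch milliseconds
}

class ExecutionRegistry {
  private readonly bySession = new Map<string, StartedExecution>();
  private counter = 0;

  // Steps 1 and 2: create execution state and record its id/path.
  start(sessionId: string, executionPath: string, now: number): StartedExecution {
    this.counter += 1;
    const started: StartedExecution = {
      executionId: `exec-${this.counter}`,
      executionPath,
      startedAt: now,
    };
    this.bySession.set(sessionId, started);
    return started;
  }

  current(sessionId: string): StartedExecution | undefined {
    return this.bySession.get(sessionId);
  }
}

// Step 3: ack-like output is released only if a record already exists.
function sendAckIfStarted(
  registry: ExecutionRegistry,
  sessionId: string,
  ackText: string,
  send: (text: string) => void,
): boolean {
  if (registry.current(sessionId) === undefined) return false;
  send(ackText);
  return true;
}
```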

8. Optional rewrite/drop policy

For blocked ack-like messages, support:

  • drop
  • rewrite
  • downgrade_to_internal_commentary

Default should be to drop low-value ack spam, not rewrite it into prettier fluff.
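
A small sketch of how the three actions might resolve, with drop as the default. The outcome shape and the optional `rewrite` callback are assumptions; in this sketch a rewrite request with no rewriter supplied falls back to dropping the message.

```typescript
type BlockedAckAction = "drop" | "rewrite" | "downgrade_to_internal_commentary";

interface BlockedAckOutcome {
  userVisibleText: string | null;    // what, if anything, still reaches the user
  internalCommentary: string | null; // what is kept on the internal side only
}

function resolveBlockedAck(
  text: string,
  action: BlockedAckAction = "drop",
  rewrite?: (original: string) => string,
): BlockedAckOutcome {
  switch (action) {
    case "rewrite":
      // No rewriter means there is nothing better to say, so nothing is sent.
      return { userVisibleText: rewrite ? rewrite(text) : null, internalCommentary: null };
    case "downgrade_to_internal_commentary":
      return { userVisibleText: null, internalCommentary: text };
    case "drop":
    default:
      return { userVisibleText: null, internalCommentary: null };
  }
}
```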

9. Add structured validation for progress/completion

For progress and completion messages, require evidence fields where possible.

Examples:

  • progress associated with changed files / execution id / next step / tool result
  • completion associated with verification / proof bundle / success markers

This reduces the chance that "progress" is just better-written status theater.
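
As a sketch of what such validation might look like, the evidence could be carried as optional structured fields alongside the message. The field names mirror the examples above but are hypothetical, and the "at least one non-empty field" threshold is an assumption; a real policy may want something stricter.

```typescript
interface ProgressEvidence {
  executionId?: string;
  changedFiles?: string[];
  nextStep?: string;
  toolResult?: string;
}

interface CompletionEvidence {
  verification?: string;
  proofBundle?: string;
  successMarkers?: string[];
}

// Treats undefined, empty strings, and empty arrays as "no evidence".
function isNonEmpty(value: string | string[] | undefined): boolean {
  return value !== undefined && value.length > 0;
}

function hasProgressEvidence(e: ProgressEvidence): boolean {
  return (
    isNonEmpty(e.executionId) ||
    isNonEmpty(e.changedFiles) ||
    isNonEmpty(e.nextStep) ||
    isNonEmpty(e.toolResult)
  );
}

function hasCompletionEvidence(e: CompletionEvidence): boolean {
  return (
    isNonEmpty(e.verification) ||
    isNonEmpty(e.proofBundle) ||
    isNonEmpty(e.successMarkers)
  );
}
```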

10. Improve incremental delivery semantics

Legitimate progress messages should flush individually, not batch until turn end.

This may require transport/runtime changes so separate progress updates create immediate delivery boundaries instead of being buffered and dumped later.

Observability

Please add structured logs/metrics for blocked or rewritten outbound messages.

Per blocked message, capture at least:

  • session id
  • channel/surface
  • classified type
  • matched rule
  • missing prerequisite
  • message excerpt
  • timestamp

Useful counters might include:

  • ack_blocked_no_execution
  • ack_blocked_repeat_without_progress
  • commentary_blocked_user_visible
  • progress_debt_timeout
  • message_rewritten_by_policy
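
The fields and counters above could be captured with something like the following. This is a sketch: the record shape, the in-memory counter class, the 80-character excerpt limit, and the choice to reuse the counter name as the matched rule are all assumptions, not an existing OpenClaw logging API.

```typescript
type PolicyCounterName =
  | "ack_blocked_no_execution"
  | "ack_blocked_repeat_without_progress"
  | "commentary_blocked_user_visible"
  | "progress_debt_timeout"
  | "message_rewritten_by_policy";

interface BlockedMessageRecord {
  sessionId: string;
  channel: string;
  classifiedType: string;
  matchedRule: PolicyCounterName;
  missingPrerequisite: string | null;
  messageExcerpt: string;
  timestamp: string; // ISO 8601
}

class PolicyCounters {
  private readonly counts = new Map<PolicyCounterName, number>();

  increment(counter: PolicyCounterName): number {
    const next = (this.counts.get(counter) ?? 0) + 1;
    this.counts.set(counter, next);
    return next;
  }

  get(counter: PolicyCounterName): number {
    return this.counts.get(counter) ?? 0;
  }
}

// Builds one structured log record and bumps the matching counter.
function recordBlockedMessage(
  counters: PolicyCounters,
  input: {
    sessionId: string;
    channel: string;
    classifiedType: string;
    matchedRule: PolicyCounterName;
    missingPrerequisite: string | null;
    text: string;
    nowMs: number;
  },
  excerptLength: number = 80,
): BlockedMessageRecord {
  counters.increment(input.matchedRule);
  return {
    sessionId: input.sessionId,
    channel: input.channel,
    classifiedType: input.classifiedType,
    matchedRule: input.matchedRule,
    missingPrerequisite: input.missingPrerequisite,
    messageExcerpt: input.text.slice(0, excerptLength), // avoid logging full bodies
    timestamp: new Date(input.nowMs).toISOString(),
  };
}
```

Each record serializes cleanly with `JSON.stringify`, so it can be emitted as one structured log line per blocked message.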

Test cases

Must-pass negative cases

  1. Ack without execution is blocked

    • outbound: On it. I’m taking this next.
    • no execution state exists
    • expected: no user-visible send
  2. Commentary leakage is blocked

    • commentary says I’m on it...
    • final answer exists too
    • expected: commentary not delivered
  3. Duplicate ack in one turn is blocked

    • one assistant turn contains two ack-like user-visible texts
    • expected: second one blocked or whole turn rejected
  4. Adjacent-turn ack spam is blocked

    • ack
    • another ack 2 minutes later
    • no progress/blocker/completion in between
    • expected: second ack blocked
  5. Post-start silence raises a runtime fault

    • valid ack with execution state
    • no progress/blocker/completion within timeout
    • expected: fault raised, further ack-like messages suppressed

Must-pass positive cases

  1. Real progress is allowed

    • execution exists
    • progress update contains evidence / next step
    • expected: delivered
  2. Completion is allowed

    • completion includes verification/proof
    • expected: delivered
  3. Explanatory discussion is not falsely blocked

    • user asks about the phrase "on it"
    • assistant discusses it analytically
    • expected: not treated as ack spam
  4. Incremental flush works

    • emit three legitimate progress updates over time
    • expected: three time-separated deliveries, not a burst dump at turn end

Scope

This should apply consistently across:

  • direct chats
  • group topics/threads
  • cron sessions
  • subagents
  • ACP sessions
  • completion events
  • all user-facing channels

This should not be solved only for Telegram or only for topics.

Non-goals

This is not solved by:

  • adding docs only
  • changing prompts only
  • adding only post-hoc watchdog alerts
  • relying on the model to "behave better"

The fix needs to live in the runtime delivery path.

Deliverables

  • runtime pre-send policy layer
  • execution-state-aware ack blocking
  • commentary isolation for user-facing delivery
  • repeated-ack suppression
  • progress-debt timeout/faulting
  • tests for the scenarios above
  • short architecture note explaining the interception point, classification logic, and policy decisions

Success criteria

OpenClaw should no longer be able to send low-value fake-progress acknowledgements unless they are backed by real execution state.

In short:

  • no ack without execution
  • no commentary leakage
  • no repeated ack spam
  • no silent post-start drift
  • no bursty fake updates pretending to be real progress
