Bug: Telegram 'typing' indicator persists due to unhandled API rate limit failover loop #27360

## Bug Description
The Telegram `sendChatAction('typing')` indicator sometimes persists indefinitely after the agent has sent its final reply. This makes the assistant appear to be stuck or in an infinite loop from the user's perspective.

## Root Cause Analysis
Live diagnostics indicate that this issue is not caused by a simple hanging background process. The likely root cause is an unhandled state transition within the agent session manager when the LLM provider returns API rate limit errors.

**Log Evidence:**
The logs clearly show a cascade of `FailoverError: ⚠️ API rate limit reached. Please try again later.` and `FailoverError: No available auth profile for ... (all in cooldown or unavailable).` errors.

Sample log entries:
```json
{"subsystem":"agent/embedded","1":"embedded run agent end: runId=... isError=true error=\"⚠️ API rate limit reached. Please try again later.\""}
{"subsystem":"diagnostic","1":"lane task error: lane=main durationMs=... error=\"FailoverError: ⚠️ API rate limit reached. Please try again later.\""}
```
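To gauge how often the cascade occurs, the log can be scanned for rate limit entries per subsystem. The following is a minimal sketch, assuming the log is JSON Lines with the message stored under the `"1"` key, as in the samples above. The function name `countRateLimitEntries` is hypothetical and not part of OpenClaw.

```typescript
// Sample lines copied from this report; elided values ("...") are left as-is.
const sampleLines: string[] = [
  String.raw`{"subsystem":"agent/embedded","1":"embedded run agent end: runId=... isError=true error=\"⚠️ API rate limit reached. Please try again later.\""}`,
  String.raw`{"subsystem":"diagnostic","1":"lane task error: lane=main durationMs=... error=\"FailoverError: ⚠️ API rate limit reached. Please try again later.\""}`,
];

// Hypothetical helper: counts rate limit messages per subsystem.
function countRateLimitEntries(lines: string[]): Map<string, number> {
  const counts = new Map<string, number>();
  for (const line of lines) {
    let parsed: unknown;
    try {
      parsed = JSON.parse(line);
    } catch {
      continue; // skip lines that are not valid JSON
    }
    if (typeof parsed !== "object" || parsed === null) continue;
    const entry = parsed as Record<string, unknown>;

    // Assumption: the human-readable message lives under the "1" key.
    const rawMessage = entry["1"];
    const message = typeof rawMessage === "string" ? rawMessage : "";
    if (!message.includes("API rate limit reached")) continue;

    const rawSubsystem = entry["subsystem"];
    const subsystem = typeof rawSubsystem === "string" ? rawSubsystem : "unknown";
    counts.set(subsystem, (counts.get(subsystem) ?? 0) + 1);
  }
  return counts;
}

const rateLimitCounts = countRateLimitEntries(sampleLines);
```

A count that keeps growing for one `runId` or lane without a final "end" entry would support the hung-session hypothesis.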

**Mechanism:**
1. An agent makes a call to an LLM provider (e.g., `google-gemini-cli`).
2. The provider returns a rate limit error.
3. The OpenClaw gateway enters a failover/retry loop, attempting to use backup models (`qwen-portal`, etc.).
4. This retry loop does not seem to terminate the parent session cleanly. The session remains in an 'active' or 'running' state internally.
5. Because the session is never marked as 'finished', the `TypingManager` (or equivalent) for the Telegram channel never receives the signal to stop sending `sendChatAction`.
6. The 'typing' indicator remains stuck until the gateway is manually restarted, which clears the hung session.
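Steps 4 to 6 suggest that the typing loop's lifetime is tied to a session signal that never fires. One way to make the loop independent of that signal is to stop it in a `finally` block around the run. The sketch below is illustrative only: `TypingKeepalive`, `runTurn`, and the `sendTyping` callback are hypothetical names, not OpenClaw's actual `TypingManager` API. The 4-second interval is an assumption based on Telegram chat actions expiring after roughly 5 seconds.

```typescript
// Hypothetical class: refreshes the typing indicator until stopped.
class TypingKeepalive {
  private timer: ReturnType<typeof setInterval> | undefined = undefined;
  private readonly sendTyping: () => void;
  private readonly intervalMs: number;

  constructor(sendTyping: () => void, intervalMs: number = 4000) {
    this.sendTyping = sendTyping;
    this.intervalMs = intervalMs;
  }

  start(): void {
    if (this.timer !== undefined) return; // already running
    this.sendTyping(); // show the indicator immediately
    this.timer = setInterval(() => this.sendTyping(), this.intervalMs);
  }

  stop(): void {
    if (this.timer === undefined) return;
    clearInterval(this.timer);
    this.timer = undefined;
  }

  get active(): boolean {
    return this.timer !== undefined;
  }
}

// Hypothetical wrapper: the indicator is stopped whether the run succeeds or throws.
async function runTurn<T>(typing: TypingKeepalive, run: () => Promise<T>): Promise<T> {
  typing.start();
  try {
    return await run();
  } finally {
    // Reached on success and on any thrown error, e.g. a FailoverError.
    typing.stop();
  }
}
```

With this shape, a rejected run still clears the timer, so the indicator lapses on its own shortly after the last refresh.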

## Steps to Reproduce
The bug is intermittent and hard to reproduce on demand because it requires triggering a real API rate limit.

1. Configure multiple LLM providers for failover.
2. Perform actions that rapidly consume API tokens (e.g., many parallel sub-agents, frequent complex cron jobs) to trigger a rate limit error from the primary provider.
3. Observe the Telegram chat.
4. When the agent fails to respond due to the rate limit, the 'typing' indicator may get stuck.
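For a deterministic reproduction in a test harness, a stub provider that always reports a rate limit could stand in for the real one. This is a sketch under stated assumptions: `RateLimitError`, `Provider`, and `makeRateLimitedProvider` are hypothetical names, the `429` status is assumed rather than taken from the logs, and how such a stub would be registered with the gateway is not covered here.

```typescript
// Hypothetical error type carrying an assumed HTTP 429 status.
class RateLimitError extends Error {
  readonly status = 429;
  constructor(provider: string) {
    super(`rate limit reached for ${provider}`);
    this.name = "RateLimitError";
  }
}

// Hypothetical minimal provider shape for illustration.
type Provider = {
  name: string;
  complete: (prompt: string) => Promise<string>;
};

// Builds a provider that rejects its first `failFirst` calls with a rate limit
// error. The default (Infinity) never succeeds, forcing failover every time.
function makeRateLimitedProvider(
  name: string,
  failFirst: number = Infinity,
): Provider & { calls: () => number } {
  let calls = 0;
  return {
    name,
    complete: async (prompt: string): Promise<string> => {
      calls += 1;
      if (calls <= failFirst) throw new RateLimitError(name);
      return `echo: ${prompt}`;
    },
    calls: () => calls,
  };
}
```

Configuring every provider in the failover chain with such a stub should exhaust all retries on the first message, which is the condition under which the indicator is reported to stick.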

## Expected Behavior
When an agent run fails due to a terminal error like rate limiting (after all retries are exhausted), the session should be cleanly marked as 'error' or 'finished', and the `sendChatAction('typing')` loop for that turn must be terminated.
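The expected behavior can be sketched as a bounded failover loop that always leaves the session in a terminal state and always fires a completion callback. All names here (`Session`, `runWithFailover`, `onSettled`) are hypothetical and chosen for illustration. The actual OpenClaw session manager may be structured differently.

```typescript
type SessionState = "running" | "finished" | "error";

interface Session {
  state: SessionState;
  lastError?: string;
}

type Attempt = () => Promise<string>;

// Tries each candidate at most once, so the loop is bounded by attempts.length.
// onSettled runs exactly once, whatever the outcome (e.g. to stop the typing loop).
async function runWithFailover(
  session: Session,
  attempts: Attempt[],
  onSettled: () => void,
): Promise<string | undefined> {
  session.state = "running";
  try {
    for (const attempt of attempts) {
      try {
        const reply = await attempt();
        session.state = "finished";
        return reply;
      } catch (err) {
        // Record the failure and fall through to the next candidate.
        session.lastError = err instanceof Error ? err.message : String(err);
      }
    }
    // All candidates exhausted: mark the session as terminally failed.
    session.state = "error";
    return undefined;
  } finally {
    onSettled();
  }
}
```

The key property is that no code path leaves `session.state` as `"running"` once the function returns, and the callback cannot be skipped by an exception.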

## Actual Behavior
The session appears to hang in a retry loop, preventing the typing indicator from being cleared.

## Impact
- Severely degraded user experience.
- Makes the assistant appear unreliable and broken.
- May lead to resource leaks if many sessions get stuck in this state.

*This issue was diagnosed and submitted by an OpenClaw agent on behalf of a user.*

