Summary
When a Telegram voice message (OGG/Opus) is received, the raw binary audio data is embedded directly into the session context as `text/plain`. This causes massive token inflation: the binary data is tokenized into hundreds of thousands of garbage tokens.
Impact
A single 13-second voice note creates a ~440 KB session entry. When tokenized, this can produce 200,000–600,000 tokens of binary garbage, exceeding Claude's 200k-token context limit and causing silent delivery failures: the agent gets a 400 error, while the user sees a typing indicator but never receives a response.
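The inflation is easy to estimate: raw bytes decoded as text rarely form dictionary words, so BPE tokenizers fall back to roughly one token per byte or two. A back-of-the-envelope sketch (the 1.5 bytes/token figure is an assumption, not a measurement of Claude's tokenizer):

```python
def estimate_garbage_tokens(payload_bytes: int, bytes_per_token: float = 1.5) -> int:
    """Rough estimate: binary-as-text tokenizes at ~1-2 bytes per token,
    versus ~4 characters per token for ordinary English prose."""
    return int(payload_bytes / bytes_per_token)

# The 448,051-byte session entry from this report, at an assumed 1.5 bytes/token:
print(estimate_garbage_tokens(448_051))  # ~300k tokens, well past a 200k limit
```

At that rate a single inlined voice note blows the context budget on its own, which matches the 400k–640k totals in the logs below.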
Evidence
Session log showing repeated `prompt is too long` errors across a single day:
07:07 UTC → 501,890 tokens (max 200,000)
07:08 UTC → 482,720 tokens
08:23 UTC → 639,302 tokens
09:16 UTC → 410,635 tokens
The user message entry for a 13-second voice note:
- Session entry size: 448,051 bytes (438 KB)
- Content includes: the transcript (correct) plus the raw OGG binary embedded as `<file name="...ogg" mime="text/plain">`
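A pre-flight guard on the ingestion side could have caught this: OGG containers start with the magic bytes `OggS`, and binary payloads in general are dense in non-printable bytes. A hypothetical check (function name and threshold are illustrative, not OpenClaw's API):

```python
def looks_binary(data: bytes, threshold: float = 0.30) -> bool:
    """Flag payloads that should never be inlined as text/plain."""
    if data.startswith(b"OggS"):  # OGG container capture pattern
        return True
    if not data:
        return False
    sample = data[:4096]
    # Count bytes outside tab/newline/CR and printable ASCII.
    nonprintable = sum(1 for b in sample if b < 9 or (13 < b < 32) or b > 126)
    return nonprintable / len(sample) > threshold

print(looks_binary(b"OggS" + b"\x00" * 100))   # True  -> store as file reference
print(looks_binary(b"plain transcript text"))  # False -> safe to inline
```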
Expected Behavior
Voice messages should include only:
- The transcript text
- A file reference/path (not the binary content)
The raw audio binary should never be inlined as text in the session prompt.
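In other words, the ingestion path should emit a reference, not the payload. A minimal sketch of the intended shape (field names are assumptions, not the actual session schema):

```python
def build_voice_entry(transcript: str, audio_path: str) -> dict:
    """Session entry for an inbound voice note: the transcript is the
    prompt text; the audio is carried only as a path reference."""
    return {
        "role": "user",
        "content": transcript,          # what the model actually reads
        "attachments": [{
            "path": audio_path,         # reference only, binary stays on disk
            "mime": "audio/ogg",        # correct MIME type, not text/plain
        }],
    }

entry = build_voice_entry("thirteen seconds of speech", "/tmp/voice_note.ogg")
# The entry's size is bounded by the transcript, not by the audio payload.
```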
Environment
- OpenClaw version: 2026.1.30
- Node: v22.22.0
- Channel: Telegram (long-polling)
- Model: anthropic/claude-opus-4-5
- TTS config: `messages.tts.auto: "inbound"`
Workaround
- Enable `contextPruning` with `mode: "cache-ttl"` and `hardClear.enabled: true` to trim old tool results
- Auto-compaction helps, but cannot fix a single user message that exceeds the model limit
- Session reset (`/new`) when the context is bloated
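For reference, the pruning settings from the workaround would look roughly like this in the config file (keys are taken from the workaround above; the exact nesting is an assumption):

```json
{
  "contextPruning": {
    "mode": "cache-ttl",
    "hardClear": { "enabled": true }
  }
}
```

Note this only trims old tool results; it cannot shrink the oversized voice-note entry itself.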
Related
- Telegram: voice message silently dropped when caption exceeds limit #6068 (Telegram voice caption overflow)