[Feature]: Feature Request: before_assistant_persist hook to prevent session corruption from oversized responses #21598

### Summary

Add a `before_assistant_persist` plugin hook to intercept oversized assistant messages before they corrupt the session transcript.

### Problem to solve

In multi-model setups, a large-context model (e.g. Gemini 3 Pro with 1M context) can generate responses that exceed the receiving model's context window. These oversized responses are persisted verbatim into the session transcript with no size check. On the next turn, the smaller model hits "Input is too long" and the session deadlocks. Even `/compact` cannot execute, because the compaction request itself exceeds the limit. There is currently no hook at the critical moment (after generation, before persistence) to prevent this. The only recovery is manual file surgery or a full session reset with total context loss.

### Proposed solution

   Add a new plugin hook `before_assistant_persist` that fires inside `guardedAppend` (after `sanitizeToolCallInputs`, before `originalAppend`) for assistant-role messages.         
                                                                                                                                                                                     
   Hook signature:                                                                                                                                                                   
                                                                                                                                                                                     
   ```typescript                                                                                                                                                                     
   type PluginHookBeforeAssistantPersistEvent = {                                                                                                                                    
     message: AgentMessage;                                                                                                                                                          
     estimatedTokens?: number;                                                                                                                                                       
     sessionTokens?: number;                                                                                                                                                         
     contextWindowTokens?: number;                                                                                                                                                   
   };                                                                                                                                                                                
                                                                                                                                                                                     
   type PluginHookBeforeAssistantPersistResult = {                                                                                                                                   
     message?: AgentMessage;  // modified message to persist (undefined = no change)                                                                                                 
     drop?: boolean;          // if true, skip persistence entirely                                                                                                                  
   };                                                                                                                                                                                
   ```                                                                                                                                                                               
                                                                                                                                                                                     
   This allows plugins to: (1) truncate oversized responses before they corrupt the transcript, (2) offload the full content to a file and persist only a summary plus a reference, (3) enforce dynamic context budgets based on the receiving model's limits, and (4) protect smaller-context models in multi-model configurations.
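As one illustration of use (1), a plugin handler might look like the sketch below. It assumes a minimal `AgentMessage` shape with a string `content` field and a rough four-characters-per-token estimate; neither comes from OpenClaw itself, and `capOversizedAssistant`, `estimateTokens`, and `maxFraction` are hypothetical names.

```typescript
// Hypothetical minimal message shape; the real AgentMessage type is not shown in this issue.
type AgentMessage = { role: string; content: string };

type PluginHookBeforeAssistantPersistEvent = {
  message: AgentMessage;
  estimatedTokens?: number;
  sessionTokens?: number;
  contextWindowTokens?: number;
};

type PluginHookBeforeAssistantPersistResult = {
  message?: AgentMessage;
  drop?: boolean;
};

// Assumption: roughly 4 characters per token. A real plugin would use a proper tokenizer.
const CHARS_PER_TOKEN = 4;

function estimateTokens(text: string): number {
  return Math.ceil(text.length / CHARS_PER_TOKEN);
}

// Cap a single assistant message at a fraction of the receiving model's context window.
function capOversizedAssistant(
  event: PluginHookBeforeAssistantPersistEvent,
  maxFraction = 0.25,
): PluginHookBeforeAssistantPersistResult {
  const windowTokens = event.contextWindowTokens;
  if (windowTokens === undefined) return {}; // no known limit: persist unchanged

  const tokens = event.estimatedTokens ?? estimateTokens(event.message.content);
  const maxTokens = Math.floor(windowTokens * maxFraction);
  if (tokens <= maxTokens) return {}; // within budget: persist unchanged

  // Keep the head of the response and append a visible truncation notice.
  const notice = `\n\n[truncated by plugin: ~${tokens} tokens exceeded cap of ${maxTokens}]`;
  const keepChars = Math.max(0, maxTokens * CHARS_PER_TOKEN - notice.length);
  return {
    message: {
      ...event.message,
      content: event.message.content.slice(0, keepChars) + notice,
    },
  };
}
```

Returning an empty result leaves the message untouched, which matches the "undefined = no change" semantics in the proposed result type.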

### Alternatives considered

_No response_

### Impact

- **Affected:** Any user running multi-model agent configurations where sub-agents use larger context windows than the main agent (e.g. Gemini 1M + Claude 200k).
- **Severity:** Session-killing. Once triggered, the session is permanently deadlocked with no automated recovery path.
- **Frequency:** Edge case, but guaranteed to occur when a large-context sub-agent produces a verbose response that exceeds the main model's effective limit.
- **Consequence:** Complete loss of session context and history, requiring manual transcript file surgery or a full session reset. For long-running agent sessions with accumulated context (days or weeks of work), this is devastating. The only current mitigation is manually monitoring response sizes, which defeats the purpose of autonomous agents.

### Evidence/examples

_No response_

### Additional information

   ## English Translation                                                                                                                                                            
                                                                                                                                                                                     
   ### Summary

   Feature request: add a `before_assistant_persist` hook that fires after the AI generates its response but before the assistant message is written to the session transcript. This allows plugins to inspect, truncate, or transform oversized assistant replies before they corrupt the session.
                                                                                                                                                                                     
   ### The Problem — A Real Disaster                                                                                                                                                 
                                                                                                                                                                                     
   I run a multi-model setup: Claude Opus 4.6 as my main agent, with Gemini 3 Pro (1M context) as a sub-agent for research tasks.

   One day I asked Gemini to produce a commercial strategy report. Gemini has a 1M-token context window, so it generated an absolutely massive response without hesitation. That response was written directly into the session transcript.

   On the next turn, Claude tried to load the transcript and immediately hit a 400 error: `"Input is too long"`. Session dead. `/compact` couldn't even execute, because the compaction request itself exceeded the context limit. Complete deadlock. The session never recovered.

   **The core issue:** when a large-context model (Gemini 1M) generates a response that exceeds the *receiving* model's context window (Claude, whose effective limit behind the API proxy is lower still), there is no mechanism to prevent that response from being persisted verbatim into the transcript. Once written, the damage is done: the session becomes unrecoverable through normal means.
                                                                                                                                                                                     
   ### Why Existing Mechanisms Don't Help

   | Mechanism | Why it fails |
   |---|---|
   | **Auto-compaction** | Triggers on context overflow errors, but API proxy error messages (e.g. `"Input is too long"`) don't match `isContextOverflowError()`. Even if they did, compaction itself needs to read the oversized transcript, which also exceeds the limit. Deadlock. |
   | **`capToolResultSize`** | Only caps `toolResult` messages. Assistant messages pass through `guardedAppend` with no size check. |
   | **Session pruning** | Prunes old tool results in memory. Doesn't touch assistant messages. |
   | **`before_agent_start` hook** | Fires before the AI runs. Can trigger preemptive compaction for accumulated history, but cannot prevent a single oversized response from being written. |
   | **`agent_end` hook** | Fires after the response is already persisted. Too late. |
   | **`tool_result_persist` hook** | Only intercepts tool results, not assistant messages. |

   There is simply no hook at the critical moment: **after generation, before persistence**.
                                                                                                                                                                                     
   ### Proposed Solution
                                                                                                                                                                                     
   A new plugin hook: `before_assistant_persist`                                                                                                                                     
                                                                                                                                                                                     
   ```typescript                                                                                                                                                                     
   type PluginHookBeforeAssistantPersistEvent = {                                                                                                                                    
     message: AgentMessage;                                                                                                                                                          
     estimatedTokens?: number;                                                                                                                                                       
     sessionTokens?: number;                                                                                                                                                         
     contextWindowTokens?: number;                                                                                                                                                   
   };                                                                                                                                                                                
                                                                                                                                                                                     
   type PluginHookBeforeAssistantPersistResult = {                                                                                                                                   
     message?: AgentMessage;                                                                                                                                                         
     drop?: boolean;                                                                                                                                                                 
   };                                                                                                                                                                                
   ```                                                                                                                                                                               
                                                                                                                                                                                     
   **Where it fits:** Inside `installSessionToolResultGuard` → `guardedAppend`, where assistant messages are processed (after `sanitizeToolCallInputs`, before `originalAppend`).
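To make the insertion point concrete, here is a rough sketch of how a guarded append could dispatch the hook. It is not taken from the OpenClaw source: `makeGuardedAppend`, the handler list, the synchronous handler signature, and the minimal `AgentMessage` shape are all assumptions made for illustration, and the token fields of the event are left unset because how the core would compute them is not specified here.

```typescript
// Hypothetical minimal message shape; the real AgentMessage type is not shown in this issue.
type AgentMessage = { role: string; content: string };

type PluginHookBeforeAssistantPersistEvent = {
  message: AgentMessage;
  estimatedTokens?: number;
  sessionTokens?: number;
  contextWindowTokens?: number;
};

type PluginHookBeforeAssistantPersistResult = {
  message?: AgentMessage;
  drop?: boolean;
};

// Assumption: handlers are synchronous, mirroring a synchronous append.
type BeforeAssistantPersistHandler = (
  event: PluginHookBeforeAssistantPersistEvent,
) => PluginHookBeforeAssistantPersistResult | undefined;

function makeGuardedAppend(
  originalAppend: (message: AgentMessage) => void,
  handlers: BeforeAssistantPersistHandler[],
): (message: AgentMessage) => void {
  return function guardedAppend(message: AgentMessage): void {
    // Only assistant-role messages go through the hook.
    if (message.role !== "assistant") {
      originalAppend(message);
      return;
    }
    // In the real code path, sanitizeToolCallInputs would already have run here.
    let current = message;
    for (const handler of handlers) {
      const result = handler({ message: current });
      if (result?.drop) return; // drop: skip persistence entirely
      if (result?.message) current = result.message; // replace the message to persist
    }
    originalAppend(current);
  };
}
```

Running handlers in order and feeding each one the previous handler's output is one possible design; the issue does not say how multiple plugins should compose.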
                                                                                                                                                                                     
   ### What This Enables                                                                                                                                                             
                                                                                                                                                                                     
   1. **Cap oversized responses**: Truncate assistant messages that would brick the session
   2. **Extract and offload**: Save the full response to a file, persist only a summary + reference
   3. **Dynamic budget enforcement**: Calculate the remaining context budget and trim accordingly
   4. **Cross-model safety**: Protect smaller-context models from larger-context model outputs
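The "extract and offload" case could be sketched as follows, using only Node.js built-ins. Everything here is illustrative: `offloadOversizedAssistant`, the offload directory, the file naming scheme, and the minimal `AgentMessage` shape are hypothetical, and a plain head-of-text preview stands in for a real summary.

```typescript
import { createHash } from "node:crypto";
import { mkdirSync, mkdtempSync, readdirSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Hypothetical minimal message shape; the real AgentMessage type is not shown in this issue.
type AgentMessage = { role: string; content: string };

type PluginHookBeforeAssistantPersistEvent = {
  message: AgentMessage;
  estimatedTokens?: number;
  sessionTokens?: number;
  contextWindowTokens?: number;
};

type PluginHookBeforeAssistantPersistResult = {
  message?: AgentMessage;
  drop?: boolean;
};

const PREVIEW_CHARS = 2_000; // assumption: size of the excerpt kept in the transcript

// Save the full response to disk and persist only a preview plus a file reference.
function offloadOversizedAssistant(
  event: PluginHookBeforeAssistantPersistEvent,
  offloadDir: string,
  maxTokens: number,
): PluginHookBeforeAssistantPersistResult {
  const text = event.message.content;
  // Assumption: roughly 4 characters per token when no estimate is supplied.
  const tokens = event.estimatedTokens ?? Math.ceil(text.length / 4);
  if (tokens <= maxTokens) return {}; // small enough: persist unchanged

  mkdirSync(offloadDir, { recursive: true });
  // A content hash gives a stable, collision-resistant file name.
  const digest = createHash("sha256").update(text).digest("hex").slice(0, 16);
  const filePath = join(offloadDir, `assistant-${digest}.md`);
  writeFileSync(filePath, text, "utf8");

  const preview = text.slice(0, PREVIEW_CHARS);
  return {
    message: {
      ...event.message,
      content: `${preview}\n\n[full response (~${tokens} tokens) saved to ${filePath}]`,
    },
  };
}

// Usage: a long report is written to a temporary directory and replaced by a short stub.
const dir = mkdtempSync(join(tmpdir(), "assistant-offload-"));
const result = offloadOversizedAssistant(
  { message: { role: "assistant", content: "report ".repeat(10_000) } },
  dir,
  1_000,
);
console.log(readdirSync(dir), result.message?.content.length);
```

The transcript then carries a bounded stub, while the main agent can still read the full report from disk with its usual file tools if it needs the detail.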
                                                                                                                                                                                     
   ### Why a Plugin Hook (Not a Core Fix)                                                                                                                                            
                                                                                                                                                                                     
   The "right" truncation threshold depends on the deployment: proxy provider limits, model mix, safety margins. A hook lets each deployment define its own policy. A cap hardcoded in core would either be too aggressive, truncating legitimate long responses, or too lenient, failing to prevent session corruption.
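A per-deployment policy of this kind might be expressed as a small configuration object plus a budget calculation. The sketch below is an assumption about what such a policy could contain; `DeploymentPolicy`, its field names, and the example numbers (including the 150k proxy limit) are hypothetical and not part of OpenClaw.

```typescript
// Hypothetical policy shape: each deployment chooses its own values.
type DeploymentPolicy = {
  modelContextTokens: number; // advertised context window of the receiving model
  proxyLimitTokens?: number; // lower effective limit imposed by an API proxy, if any
  safetyMargin: number; // fraction of the window held back, between 0 and 1
  maxShareOfRemaining: number; // fraction of the remaining budget one response may use
};

// Largest assistant message (in tokens) this policy allows for the current session size.
function maxAssistantTokens(policy: DeploymentPolicy, sessionTokens: number): number {
  const effectiveWindow = Math.min(
    policy.modelContextTokens,
    policy.proxyLimitTokens ?? Number.POSITIVE_INFINITY,
  );
  const usable = Math.floor(effectiveWindow * (1 - policy.safetyMargin));
  const remaining = Math.max(0, usable - sessionTokens);
  return Math.floor(remaining * policy.maxShareOfRemaining);
}

// Example: a 200k-token model behind a proxy that only accepts 150k tokens.
const proxiedClaude: DeploymentPolicy = {
  modelContextTokens: 200_000,
  proxyLimitTokens: 150_000,
  safetyMargin: 0.1,
  maxShareOfRemaining: 0.5,
};
const budget = maxAssistantTokens(proxiedClaude, 60_000);
console.log(budget); // 37500
```

A `before_assistant_persist` handler could compare the event's `estimatedTokens` against this budget and truncate, offload, or drop accordingly, with no threshold baked into core.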
                                                                                                                                                                                     
   ### ⚠️ Warning                                                                                                                                                                    
                                                                                                                                                                                     
   We do not intend to leave this problem unsolved. If an official hook is not provided, we will patch `guardedAppend` in `pi-embedded-CWm3BvmA.js` directly and implement our own hook. Of course, a clean official hook is far preferable: nobody wants to re-apply patches after every update.
                                                                                                                                                                                     
   But we will not sit and watch our sessions die.                                                                                                                                   
                                                                                                                                                                                     
   ### Environment
                                                                                                                                                                                     
   - OpenClaw: 2026.2.9                                                                                                                                                              
   - Main model: Claude Opus 4.6 via API proxy                                                                                                                                       
   - Sub-agent: Gemini 3 Pro (1M context)                                                                                                                                            
   - Proxy error: `"Input is too long"` (not matched by `isContextOverflowError()`)
