Summary
OpenClaw already provides call-level tool hooks such as before_tool_call / after_tool_call, but model-side hooks appear to be higher-level. In practice, llm_input / llm_output seem to represent an overall prompt attempt rather than each real provider model call. For tool-using turns, this makes it hard to trace the actual execution sequence and payloads. It would be very helpful to add non-breaking call-level model hooks, such as before_model_call / after_model_call, so each real model invocation can be observed individually.
Problem to solve
A single agent turn may involve multiple real model calls:
- one model call to decide which tools to use
- one or more tool calls
- another model call to produce the final answer
Today, tool calls can be observed individually, but model calls cannot. As a result:
- multiple real model calls may be collapsed into one higher-level event
- tool-loop sequencing is hard to reconstruct accurately
- observed model input/output may differ from the actual provider-facing request/response
- it is hard to distinguish “tool-selection” model calls from “final-answer” model calls
This makes accurate tracing, debugging, and observability integrations difficult.
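To make the gap concrete, here is a hedged sketch of the event stream a tool-using turn would produce if call-level model hooks existed. All event names and shapes below mirror this proposal and are illustrative, not an existing OpenClaw API:

```typescript
// Illustrative only: event names mirror this proposal, not an existing OpenClaw API.
type TraceEvent =
  | { kind: "before_model_call" | "after_model_call"; callId: string }
  | { kind: "before_tool_call" | "after_tool_call"; toolCallId: string; tool: string };

// A tool-using turn as it would appear with call-level model hooks:
// model call #1 (tool selection) -> tool call -> model call #2 (final answer).
const turn: TraceEvent[] = [
  { kind: "before_model_call", callId: "m1" },
  { kind: "after_model_call", callId: "m1" },
  { kind: "before_tool_call", toolCallId: "t1", tool: "web_search" },
  { kind: "after_tool_call", toolCallId: "t1", tool: "web_search" },
  { kind: "before_model_call", callId: "m2" },
  { kind: "after_model_call", callId: "m2" },
];

// With per-call events, counting real model invocations is trivial;
// a single prompt-level llm_input/llm_output pair collapses both into one.
const modelCalls = turn.filter((e) => e.kind === "after_model_call").length;
console.log(modelCalls); // 2
```

With only prompt-level hooks, the two model invocations above would surface as one `llm_input`/`llm_output` pair, losing the ordering relative to the tool call.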
Proposed solution
Add a new non-breaking pair of model-call lifecycle hooks:
- before_model_call
- after_model_call
These would complement the existing:
- before_tool_call
- after_tool_call
and should ideally fire at the boundary where the real provider request is assembled and sent.
Suggested fields:
- runId
- sessionId
- provider
- model
- api
- callId
- requestPayload
- responsePayload
- error
- durationMs
The key goal is for each event to correspond to one real model invocation, not one overall prompt attempt.
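As a minimal sketch, the event payload might look like the following, assuming a TypeScript hook surface. The field names follow the suggestions above; none of this is an existing OpenClaw API:

```typescript
// Hypothetical event shape for the proposed hooks; field names follow the
// suggestions in this proposal and are not an existing OpenClaw API.
interface ModelCallEvent {
  runId: string;
  sessionId: string;
  provider: string;          // e.g. "openai", "anthropic"
  model: string;             // e.g. "gpt-4o"
  api: string;               // e.g. "chat.completions"
  callId: string;            // unique per real provider invocation
  requestPayload?: unknown;  // exact provider-facing request
  responsePayload?: unknown; // exact provider response, if the call succeeded
  error?: { message: string };
  durationMs?: number;       // filled in by after_model_call
}

// before_model_call would receive the event without response/duration;
// after_model_call would receive the completed event.
function summarize(e: ModelCallEvent): string {
  const status = e.error ? `error: ${e.error.message}` : `ok in ${e.durationMs ?? 0}ms`;
  return `${e.provider}/${e.model} call ${e.callId} (${status})`;
}

const summary = summarize({
  runId: "r1",
  sessionId: "s1",
  provider: "openai",
  model: "gpt-4o",
  api: "chat.completions",
  callId: "m1",
  durationMs: 420,
});
console.log(summary); // openai/gpt-4o call m1 (ok in 420ms)
```

Because `callId` is unique per real invocation, an observability integration could pair each `before_model_call` with its `after_model_call` and emit one span per provider request.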
Alternatives considered
1. Reconstruct real model calls from llm_input / llm_output
This is not reliable because final provider payloads may depend on transcript sanitization, turn validation, provider-specific formatting, and retry/repair logic.
2. Keep using only llm_input / llm_output
This is sufficient for coarse observability, but not for accurate tracing of multi-call tool loops.
3. Change the semantics of existing llm_input / llm_output
This seems riskier for backward compatibility and would make the current hook semantics less clear.
Impact
This would improve:
- tracing of multi-step tool loops
- provider-payload-level debugging
- observability / telemetry integrations
- failure investigation and replay
- consistency between model-call and tool-call lifecycle tracing
It would also provide a cleaner and more symmetric observability model: tool calls already have call-level hooks, and model calls would have them too.
Evidence/examples
No response
Additional information
No response