Bug Report: Message pipeline has two major performance bottlenecks (auth 6.2s, fixed overhead 16.4s)
Environment
- OpenClaw version: 2026.4.29
- Model: MiniMax-M2.7
- Network: WiFi (GPON) — baseline 0.1-0.14s Feishu API latency
Description
During message pipeline profiling, two major bottlenecks were identified:
1. Authentication: 6.2 seconds (per message)
Every single message requires re-authentication to the LLM provider. During this 6.2s window, the model does no processing at all — it's purely waiting for auth to complete.
Impact: ~15-20% of total message time is wasted on authentication.
2. Fixed overhead: 16.4 seconds (system-prompt + core-plugin-tools)
- core-plugin-tools: 7.8s
- system-prompt: 8.6s
- Combined: 16.4s of fixed overhead that does not scale with message content
This overhead is incurred regardless of message length, complexity, or session warmth.
Impact: Even for a simple "hi" message, auth + fixed overhead alone impose a minimum of ~22.6s (6.2s + 16.4s) of processing time.
Observed Timings (WiFi baseline, Feishu API at 0.1-0.14s)
| Pipeline Stage | Duration |
| --- | --- |
| auth | 6.2s |
| core-plugin-tools | 7.8s |
| system-prompt | 8.6s |
| stream-setup | 8.7s |
| model-resolution | minimal |
| Total baseline | ~45s |
Root Cause Hypothesis
- Auth bottleneck: token/credential refresh happens on every request instead of being cached and reused; there is no connection pooling or token caching layer.
- Fixed overhead: the core-plugin-tools and system-prompt timings suggest per-message initialization that loads plugins and context from scratch each time, rather than maintaining warm state between messages.
Expected Behavior
- Auth should be cached: authentication tokens should be reused across messages; if a token is still valid, don't re-authenticate. Target: < 0.5s per message for auth.
- System prompt should be cached: for active sessions, the system prompt should be computed once and cached in memory, not rebuilt on every message. Target: < 1s for cached sessions.
- Plugin initialization should be deferred: core plugins should be initialized once on startup, not on every message.
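The deferred-initialization behavior described above can be sketched as a lazy, process-lifetime registry. This is a minimal illustration only; `PluginRegistry`, `handle_message`, and the plugin names are hypothetical stand-ins, not OpenClaw APIs.

```python
class PluginRegistry:
    """Hypothetical sketch: load core plugins once per process, not per message."""

    def __init__(self):
        self._plugins = None
        self.load_count = 0  # tracks how many times the expensive load actually ran

    def _load_core_plugins(self):
        # Stand-in for the ~7.8s core-plugin-tools initialization.
        self.load_count += 1
        return {"tools": ["search", "files"]}

    def get(self):
        # Lazy one-time init: the first call pays the cost, later calls reuse it.
        if self._plugins is None:
            self._plugins = self._load_core_plugins()
        return self._plugins


registry = PluginRegistry()

def handle_message(text):
    plugins = registry.get()  # warm after the first message
    return f"handled {text!r} with {len(plugins['tools'])} tools"

handle_message("hi")
handle_message("hello again")
print(registry.load_count)  # 1: plugins were loaded once despite two messages
```

With this shape, the 7.8s cost moves out of the per-message path entirely (or is paid at most once, on the first message after startup).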
Potential Impact
Current: ~45s minimum per message
Target: < 5s for simple messages (by eliminating auth cache miss + reducing fixed overhead)
Suggested Fixes
For auth (6.2s → <0.5s):
- Implement token caching with TTL-based refresh
- Use connection keep-alive for LLM API calls
- Batch auth operations where possible
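A TTL-based token cache along the lines suggested above could look like the following sketch. All names (`TokenCache`, `fetch_token`) and the 3600s lifetime are assumptions for illustration, not measured OpenClaw behavior.

```python
import time

class TokenCache:
    """Hypothetical sketch: cache an auth token and refresh it before expiry."""

    def __init__(self, fetch, ttl_seconds=3600, refresh_margin=60):
        self._fetch = fetch            # expensive auth call (the 6.2s step)
        self._ttl = ttl_seconds
        self._margin = refresh_margin  # refresh slightly before expiry
        self._token = None
        self._expires_at = 0.0

    def get(self, now=None):
        now = time.monotonic() if now is None else now
        # Re-auth only on first use or when inside the refresh margin.
        if self._token is None or now >= self._expires_at - self._margin:
            self._token = self._fetch()
            self._expires_at = now + self._ttl
        return self._token


calls = 0

def fetch_token():
    global calls
    calls += 1
    return f"token-{calls}"

cache = TokenCache(fetch_token)
cache.get(now=0.0)     # first call: real auth
cache.get(now=100.0)   # cache hit: no re-auth
cache.get(now=3595.0)  # inside refresh margin: refreshed proactively
print(calls)           # 2: one initial fetch plus one proactive refresh
```

The proactive refresh margin avoids ever handing out a token that expires mid-request, which is what makes the < 0.5s per-message target realistic: steady-state messages only hit the in-memory check.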
For fixed overhead (16.4s → <2s):
- Implement session-level system prompt caching
- Defer plugin initialization to startup, not per-message
- Add session warm-up step that pre-computes common context
Additional Context
The 4G→WiFi migration already reduced Feishu API latency from 1.79s → 0.14s (92% improvement), but total message time didn't improve proportionally because auth (6.2s) and fixed overhead (16.4s) remain as dominant factors.