
Message pipeline has two major performance bottlenecks (auth 6.2s, fixed overhead 16s) #76107

@hongfangsong

Environment

  • OpenClaw version: 2026.4.29
  • Model: MiniMax-M2.7
  • Network: WiFi (GPON) — baseline 0.1-0.14s Feishu API latency

Description

During message pipeline profiling, two major bottlenecks were identified:

1. Authentication: 6.2 seconds (per message)

Every single message requires re-authentication to the LLM provider. During this 6.2s window, the model does no processing at all — it's purely waiting for auth to complete.

Impact: ~15-20% of total message time is wasted on authentication.

2. Fixed overhead: 16.4 seconds (system-prompt + core-plugin-tools)

  • core-plugin-tools: 7.8s
  • system-prompt: 8.6s
  • Combined: 16.4s of fixed overhead that does not scale with message content

This overhead is incurred regardless of message length, complexity, or session warmth.

Impact: even a trivial "hi" message pays a minimum of ~22.6s (6.2s auth + 16.4s fixed overhead) before any model work begins.

Observed Timings (WiFi baseline, Feishu API at 0.1-0.14s)

Pipeline Stage       Duration
auth                 6.2s
core-plugin-tools    7.8s
system-prompt        8.6s
stream-setup         8.7s
model-resolution     minimal
Total baseline       ~45s

Root Cause Hypothesis

  1. Auth bottleneck: Token/credential refresh happens on every request instead of being cached/reused. No connection pooling or token caching layer.

  2. Fixed overhead: the core-plugin-tools and system-prompt timings suggest per-message initialization that loads plugins and rebuilds context from scratch on each message, rather than maintaining warm state between messages.

Expected Behavior

  1. Auth should be cached: Authentication tokens should be reused across messages. If a token is valid, don't re-authenticate. Target: < 0.5s per message for auth.

  2. System prompt should be cached: For active sessions, the system prompt should be computed once and cached in memory, not rebuilt on every message. Target: < 1s for cached sessions.

  3. Plugin initialization should be deferred: Core plugins should be initialized once on startup, not on every message.

Potential Impact

Current: ~45s minimum per message
Target: < 5s for simple messages (by eliminating auth cache miss + reducing fixed overhead)

Suggested Fixes

For auth (6.2s → <0.5s):

  • Implement token caching with TTL-based refresh
  • Use connection keep-alive for LLM API calls
  • Batch auth operations where possible
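A minimal sketch of the TTL-based token caching suggested above: auth only runs on a cache miss or after the TTL expires. `fetch_token` is a placeholder for the real provider auth call, and the TTL value is illustrative, not a known provider limit:

```python
import time
import threading

class TokenCache:
    """Reuse an auth token until its TTL lapses; refresh only on miss."""

    def __init__(self, fetch_token, ttl_seconds=3300):
        self._fetch = fetch_token        # placeholder for the real auth call
        self._ttl = ttl_seconds          # e.g. refresh a 1h token ~5 min early
        self._token = None
        self._expires_at = 0.0
        self._lock = threading.Lock()    # avoid duplicate concurrent refreshes

    def get(self):
        with self._lock:
            now = time.monotonic()
            if self._token is None or now >= self._expires_at:
                self._token = self._fetch()          # the ~6.2s path, paid rarely
                self._expires_at = now + self._ttl
            return self._token
```

With this in place, the per-message auth cost collapses to a dictionary lookup plus a lock acquisition, well under the <0.5s target.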

For fixed overhead (16.4s → <2s):

  • Implement session-level system prompt caching
  • Defer plugin initialization to startup, not per-message
  • Add session warm-up step that pre-computes common context
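Session-level system-prompt caching could look like the sketch below: build the prompt once per (session, config version) and reuse it until the configuration changes. `build_system_prompt` is an assumed builder function for illustration, not OpenClaw's real interface:

```python
# Cache keyed by session and config version, so a config change
# naturally invalidates the cached prompt for that session.
_prompt_cache = {}

def get_system_prompt(session_id, config_version, build_system_prompt):
    key = (session_id, config_version)
    if key not in _prompt_cache:
        # The expensive (~8.6s) build path runs once per session/config.
        _prompt_cache[key] = build_system_prompt()
    return _prompt_cache[key]
```

The same keying scheme extends to the warm-up idea: a session-start hook can pre-populate the cache so even the first message skips the build.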

Additional Context

The 4G→WiFi migration already reduced Feishu API latency from 1.79s → 0.14s (92% improvement), but total message time didn't improve proportionally because auth (6.2s) and fixed overhead (16.4s) remain as dominant factors.
