[Bug]: OpenAI-completions prompt_cache_key regression — caching worked in 2026.3.x, broken in 2026.5.x

### Bug type

Regression (worked before, now fails)

### Beta release blocker

No

### Summary

prompt_cache_key is no longer emitted on openai-completions requests in 2026.5.x; caching worked correctly on 2026.3.2 (confirmed on same hardware with same config).

### Steps to reproduce

1. Deploy OpenClaw with an openai-completions provider configured with compat.supportsPromptCacheKey: true and cacheRetention: "long".
2. Send repeated identical prompts through OpenClaw → oMLX (or any completions backend with prefix caching).
3. Observe cached_tokens in backend responses — always 0 on 2026.5.x.
4. Downgrade to 2026.3.2 with identical config and hardware.
5. Repeat same prompts — cached_tokens correctly populate from request 2 onward.

### Expected behavior

As observed in 2026.3.2: outgoing openai-completions requests include prompt_cache_key, and the backend reports cached_tokens > 0 on repeated requests with identical prefixes.

### Actual behavior

On 2026.5.x: prompt_cache_key is absent from outgoing requests. Backend reports cached_tokens: 0 on every request. Downgrading to 2026.3.2 restores caching immediately with no config changes.

### OpenClaw version

2026.5.7

### Operating system

Linux (Docker container/portainer on UGREEN NAS)

### Install method

docker

### Model

omlx/local_model (Qwen3.6-35B-A3B-RotorQuant-MLX-4bit via oMLX)

### Provider / routing chain

openclaw → oMLX (openai-completions, http://cerebro-mac:8080/v1)

### Additional provider/model setup details

Provider api: openai-completions
compat.supportsPromptCacheKey: true
cacheRetention: "long" (set at defaults, model, and per-model override levels)
contextInjection: "continuation-skip"
API keys and URLs redacted.

Same config file used on both 2026.3.2 (working) and 2026.5.7 (broken).
oMLX backend confirmed working: direct repeated requests to the same endpoint 
produce cached_tokens > 0 from request 2 onward, bypassing OpenClaw entirely.

### Logs, screenshots, and evidence

```shell
oMLX dashboard on 2026.5.7:
  Total Prefill Tokens: 748,811
  Cached Tokens: 0
  Cache Efficiency: 0.0%

Direct requests to same backend (bypassing OpenClaw):
  Request 1: cached_tokens = 0
  Request 2: cached_tokens = 71680
  Request 3: cached_tokens = 71680

Outgoing request keys observed from OpenClaw on 2026.5.7:
  model, messages, stream, max_completion_tokens, tools, reasoning_effort, metadata
  (prompt_cache_key absent)

2026.3.2 tested on same UGREEN NAS hardware: caching works correctly.
No config changes between versions.
```

### Impact and severity

Affected: any user of openai-completions with a prefix-caching-capable backend (oMLX, llama.cpp, etc.)
Severity: high — defeats prefix caching entirely, causing full prefill on every request
Frequency: 100% reproducible on 2026.5.7, never occurs on 2026.3.2
Consequence: significantly increased latency and compute cost per request;
  on local hardware this is the difference between ~3s and ~60s TTFT for long contexts.

### Additional information

Last known good version: 2026.3.2
First known bad version: 2026.5.x (exact first-bad version not tested between 2026.3.2 and 2026.5.7)
No workaround found on 2026.5.x short of downgrading.
Related: #69272, PR #69411 — those addressed the transport condition; 
this regression suggests something in that chain changed between 2026.3.2 and 2026.5.x.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[Bug]: OpenAI-completions prompt_cache_key regression — caching worked in 2026.3.x, broken in 2026.5.x #81281

Bug type

Beta release blocker

Summary

Steps to reproduce

Expected behavior

Actual behavior

OpenClaw version

Operating system

Install method

Model

Provider / routing chain

Additional provider/model setup details

Logs, screenshots, and evidence

Impact and severity

Additional information

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Uh oh!

[Bug]: OpenAI-completions prompt_cache_key regression — caching worked in 2026.3.x, broken in 2026.5.x #81281

Description

Bug type

Beta release blocker

Summary

Steps to reproduce

Expected behavior

Actual behavior

OpenClaw version

Operating system

Install method

Model

Provider / routing chain

Additional provider/model setup details

Logs, screenshots, and evidence

Impact and severity

Additional information

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions