What version of Codex CLI is running?
codex-cli 0.128.0
What subscription do you have?
ChatGPT Plus
Which model were you using?
gpt-5.5 (seed response resolved to gpt-5.5-2026-04-23)
What platform is your computer?
Microsoft Windows NT 10.0.26200.0 x64
What terminal emulator and version are you using (if applicable)?
PowerShell 7.6.1 on native Windows, not WSL.
What issue are you seeing?
Codex remote compaction can create an invalid Responses API history for
store=true requests: it drops a reasoning item but can retain the dependent
server-assigned assistant message.
This is a provider-independent invariant violation: under store=true, the
Responses API itself rejects orphaned assistant messages, and Codex's remote
compaction filter can construct payloads that trigger this rejection.
For store=true Responses requests, this creates an invalid replay shape:
assistant message with a server-assigned id, but without its required preceding reasoning item
For stored Responses requests, a server-assigned assistant message can depend on
the preceding server-assigned reasoning item. If Codex asks for
reasoning.encrypted_content, that reasoning item is the replayable form needed
to keep the stored conversation consistent.
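To make the rejected shape concrete, here is a minimal Python sketch of a pre-flight check over a Responses `input` list. The function names are hypothetical, and "required preceding reasoning item" is approximated as "the immediately preceding item has type reasoning". That approximation is an assumption for illustration, not the API's documented rule.

```python
def is_server_assistant(item):
    # A server-assigned assistant message carries an id (msg_...).
    return (
        item.get("type") == "message"
        and item.get("role") == "assistant"
        and bool(item.get("id"))
    )


def find_orphaned_assistant_messages(items):
    # Assumption: the pairing is modelled as "directly preceded by a reasoning item".
    orphans = []
    previous = None
    for item in items:
        preceded = previous is not None and previous.get("type") == "reasoning"
        if is_server_assistant(item) and not preceded:
            orphans.append(item["id"])
        previous = item
    return orphans


paired = [
    {"type": "reasoning", "id": "rs_example"},
    {"type": "message", "role": "assistant", "id": "msg_example"},
    {"role": "user", "content": [{"type": "input_text", "text": "Continue."}]},
]
orphan = paired[1:]  # same history with the reasoning item dropped

print(find_orphaned_assistant_messages(paired))  # []
print(find_orphaned_assistant_messages(orphan))  # ['msg_example']
```

The two inputs correspond to the paired replay and the orphan replay in the reproduction that follows.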
I verified the same invalid replay shape against OpenAI-hosted /v1/responses
with store=true:
OpenAI-hosted Responses API
request model id: gpt-5.5
seed response model: gpt-5.5-2026-04-23
seed response: 200
output contained server-assigned reasoning item
output contained server-assigned assistant message
paired replay, input=[reasoning, assistant, user]: 200
orphan replay, input=[assistant, user]: 400
message: Item 'msg_[redacted]' of type 'message' was provided without its required 'reasoning' item: 'rs_[redacted]'.
The same orphan replay succeeds with store=false, which is why normal
OpenAI-hosted Codex usage does not surface it today: Codex only enables
store=true on the Azure path. I am hitting this in practice on Azure-hosted
Responses, where Codex sends store=true, and I currently run a local fork to
work around it. The API-level reproduction above confirms that the same orphan
replay also fails against api.openai.com when store=true is used.
This report is based on two observations: the store=true Responses replay
invariant is independently reproducible with /v1/responses, and Codex's
remote compaction filter is asymmetric. I do not know whether the current
compact endpoint naturally emits this shape in common cases. In my short probe
on 2026-05-02, /responses/compact returned a user message plus a compaction
item, not an assistant message. The bug is that Codex explicitly allows
assistant messages from remote compact output while dropping reasoning items, so
any compact output that contains reasoning + assistant can become an invalid
store=true replay.
What steps can reproduce the bug?
1. API-level store=true invariant check
This does not require Azure.
Run this with OPENAI_API_KEY set:
import json
import os
import re
import urllib.error
import urllib.request

MODEL = os.environ.get("OPENAI_MODEL", "gpt-5.5")
BASE_URL = os.environ.get("OPENAI_BASE_URL", "https://api.openai.com/v1")
API_KEY = os.environ["OPENAI_API_KEY"]


def post(path, payload):
    # POST a JSON payload and return (HTTP status, parsed JSON body).
    req = urllib.request.Request(
        f"{BASE_URL}/{path}",
        data=json.dumps(payload).encode(),
        headers={"Authorization": f"Bearer {API_KEY}", "Content-Type": "application/json"},
        method="POST",
    )
    try:
        with urllib.request.urlopen(req, timeout=120) as res:
            return res.status, json.loads(res.read())
    except urllib.error.HTTPError as err:
        # Error responses are expected to carry a JSON body as well.
        return err.code, json.loads(err.read())


def user(text):
    return {"role": "user", "content": [{"type": "input_text", "text": text}]}


# Seed request: produces a server-assigned reasoning item and assistant message.
seed_status, seed = post("responses", {
    "model": MODEL,
    "input": "Answer with exactly: store-true-seed",
    "store": True,
    "reasoning": {"effort": "medium"},
    "include": ["reasoning.encrypted_content"],
})
assert seed_status == 200, seed

reasoning = next((i for i in seed["output"] if i.get("type") == "reasoning" and i.get("id")), None)
assistant = next((i for i in seed["output"] if i.get("type") == "message" and i.get("role") == "assistant" and i.get("id")), None)
assert reasoning is not None, f"no server-assigned reasoning item in seed output: {seed['output']}"
assert assistant is not None, f"no server-assigned assistant message in seed output: {seed['output']}"

# Paired replay: reasoning item and assistant message are sent together.
paired_status, paired = post("responses", {
    "model": MODEL,
    "store": True,
    "input": [reasoning, assistant, user("Continue after paired replay.")],
})
assert paired_status == 200, paired

# Orphan replay: the assistant message is sent without its reasoning item.
orphan_status, orphan = post("responses", {
    "model": MODEL,
    "store": True,
    "input": [assistant, user("Continue after orphan replay.")],
})

# Redact server item ids before printing the error message.
message = orphan.get("error", {}).get("message", "")
message = re.sub(r"(msg|rs)_[A-Za-z0-9]+", r"\1_[redacted]", message)

print("request model id:", MODEL)
print("seed response model:", seed.get("model"))
print("seed response:", seed_status)
print("paired replay:", paired_status)
print("orphan replay:", orphan_status)
print("orphan error:", message)
Observed:
request model id: gpt-5.5
seed response model: gpt-5.5-2026-04-23
seed response: 200
paired replay: 200
orphan replay: 400
orphan error: Item 'msg_[redacted]' of type 'message' was provided without its required 'reasoning' item: 'rs_[redacted]'.
The error message above is verbatim from the API response except for redacted
server item ids.
2. Codex client-side transformation
This is the Codex-side invariant break and does not require Azure either.
In codex-rs/core/src/compact_remote.rs, process_compacted_history() filters
remote compact output:
compacted_history.retain(should_keep_compacted_history_item);
The filter keeps assistant messages:
ResponseItem::Message { role, .. } if role == "assistant" => true,
but drops reasoning items:
ResponseItem::Reasoning { .. } => false,
Permalink against upstream/main at 35aaa5d9fc, fetched 2026-05-02
(codex/codex-rs/core/src/compact_remote.rs, lines 241 to 282 in 35aaa5d):
    compacted_history.retain(should_keep_compacted_history_item);
    insert_initial_context_before_last_real_user_or_summary(compacted_history, initial_context)
}

/// Returns whether an item from remote compaction output should be preserved.
///
/// Called while processing the model-provided compacted transcript, before we
/// append fresh canonical context from the current session.
///
/// We drop:
/// - `developer` messages because remote output can include stale/duplicated
///   instruction content.
/// - non-user-content `user` messages (session prefix/instruction wrappers),
///   while preserving real user messages and persisted hook prompts.
///
/// This intentionally keeps:
/// - `assistant` messages (future remote compaction models may emit them)
/// - `user`-role warnings and compaction-generated summary messages because
///   they parse as `TurnItem::UserMessage`.
fn should_keep_compacted_history_item(item: &ResponseItem) -> bool {
    match item {
        ResponseItem::Message { role, .. } if role == "developer" => false,
        ResponseItem::Message { role, .. } if role == "user" => {
            matches!(
                crate::event_mapping::parse_turn_item(item),
                Some(TurnItem::UserMessage(_) | TurnItem::HookPrompt(_))
            )
        }
        ResponseItem::Message { role, .. } if role == "assistant" => true,
        ResponseItem::Message { .. } => false,
        ResponseItem::Compaction { .. } => true,
        ResponseItem::Reasoning { .. }
        | ResponseItem::LocalShellCall { .. }
        | ResponseItem::FunctionCall { .. }
        | ResponseItem::ToolSearchCall { .. }
        | ResponseItem::FunctionCallOutput { .. }
        | ResponseItem::ToolSearchOutput { .. }
        | ResponseItem::CustomToolCall { .. }
        | ResponseItem::CustomToolCallOutput { .. }
        | ResponseItem::WebSearchCall { .. }
        | ResponseItem::ImageGenerationCall { .. }
        | ResponseItem::Other => false,
So a compact output like:
Reasoning(rs_...)
Message(role="assistant", id=msg_...)
can become:
Message(role="assistant", id=msg_...)
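The same transformation can be modelled outside the Rust code base. The sketch below is a simplified Python stand-in for the filter, operating on dict-shaped items. `should_keep` is a hypothetical name, and the `user` branch is simplified: the real filter also inspects message content through parse_turn_item.

```python
def should_keep(item):
    # Simplified stand-in for should_keep_compacted_history_item.
    kind = item.get("type")
    if kind == "message":
        role = item.get("role")
        if role == "assistant":
            return True   # assistant messages are kept
        if role == "user":
            return True   # simplification: the real filter checks the content
        return False      # developer and any other roles are dropped
    # Compaction items are kept; reasoning and tool items are dropped.
    return kind == "compaction"


compact_output = [
    {"type": "reasoning", "id": "rs_example", "encrypted_content": "..."},
    {"type": "message", "role": "assistant", "id": "msg_example"},
]

kept = [item for item in compact_output if should_keep(item)]
print([(i["type"], i["id"]) for i in kept])  # [('message', 'msg_example')]
```

Only the assistant message survives, which is the orphan shape that store=true replays reject.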
A minimal unit test in compact_remote.rs can make the current invariant break
visible:
#[test]
fn remote_compaction_filter_does_not_orphan_assistant_message() {
    use codex_protocol::models::{ContentItem, ResponseItem};

    let assistant = ResponseItem::Message {
        id: Some("msg_test".to_string()),
        role: "assistant".to_string(),
        content: vec![ContentItem::OutputText {
            text: "summary".to_string(),
        }],
        phase: None,
    };
    let mut compacted_history = vec![
        ResponseItem::Reasoning {
            id: "rs_test".to_string(),
            summary: Vec::new(),
            content: None,
            encrypted_content: Some("encrypted".to_string()),
        },
        assistant,
    ];

    compacted_history.retain(should_keep_compacted_history_item);

    assert!(
        !matches!(
            compacted_history.as_slice(),
            [ResponseItem::Message {
                id: Some(_),
                role,
                ..
            }] if role == "assistant"
        ),
        "assistant message survived without its reasoning predecessor: {compacted_history:?}"
    );
}
After applying either fix listed under "Expected behavior", the test should
pass, making it a suitable regression check. On the current filter at
35aaa5d9fc, I confirmed that it fails with:
assistant message survived without its reasoning predecessor: [Message { id: Some("msg_test"), role: "assistant", content: [OutputText { text: "summary" }], phase: None }]
What is the expected behavior?
Codex should preserve the Responses item-pairing invariant when shaping remote
compaction output.
Either of these would avoid the invalid store=true replay:
- Keep ResponseItem::Reasoning when retaining the dependent assistant item.
- Drop server-assigned assistant messages whose required reasoning predecessor
  was removed.
Additional information
I also checked current upstream/main as of 2026-05-02.
Current Codex sets the Responses request store value like this:
store: provider.is_azure_responses_endpoint()
Permalink against upstream/main at 35aaa5d9fc, fetched 2026-05-02
(codex/codex-rs/core/src/client.rs, line 889 in 35aaa5d):
store: provider.is_azure_responses_endpoint(),
Impact:
- The root cause is a client-side invariant violation in
process_compacted_history(). OpenAI-hosted Responses API rejects the same
orphaned assistant replay when store=true is used, so this is not an
Azure-only validation rule.
- The failure surfaces today for Azure-hosted Responses endpoints because Codex
sets store=true only when provider.is_azure_responses_endpoint() is true.
- Any future Codex path or provider that uses store=true can hit the same
  invariant violation unless Codex preserves the reasoning/assistant pair.
Suggested fix:
After filtering remote compact output, enforce that retained server-assigned
assistant messages still have their required preceding reasoning item.
For example, I would add a small helper like:
compacted_history.retain(should_keep_compacted_history_item);
remove_orphaned_assistant_messages(&mut compacted_history); // new helper
Alternatively, retain Reasoning items from remote compact output when retaining
dependent assistant messages.
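To show the intended behavior of the first approach, here is a minimal Python model of the proposed helper. It reuses the name remove_orphaned_assistant_messages from the snippet above, but the body is only a guess at the logic, working on dict-shaped items rather than ResponseItem values. "Required predecessor" is again approximated as "the retained item directly before it is a reasoning item", which is an assumption.

```python
def is_server_assistant(item):
    # A server-assigned assistant message carries an id (msg_...).
    return (
        item.get("type") == "message"
        and item.get("role") == "assistant"
        and bool(item.get("id"))
    )


def remove_orphaned_assistant_messages(items):
    # Drop server-assigned assistant messages whose retained predecessor
    # is not a reasoning item; everything else passes through unchanged.
    result = []
    for item in items:
        has_reasoning = bool(result) and result[-1].get("type") == "reasoning"
        if is_server_assistant(item) and not has_reasoning:
            continue
        result.append(item)
    return result


user_item = {"role": "user", "content": [{"type": "input_text", "text": "Continue."}]}
filtered = [
    {"type": "message", "role": "assistant", "id": "msg_example"},  # orphaned
    user_item,
]

repaired = remove_orphaned_assistant_messages(filtered)
print([i.get("role") for i in repaired])  # ['user']
```

Assistant messages without a server-assigned id pass through untouched in this model, since the rejection reported above concerns server-assigned items. In the Rust code the equivalent pass would run over compacted_history right after the existing retain call.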
I would be happy to open a PR with either approach if maintainers have a
preference.