Skip to content

Persist and prewarm agent tasks per thread#17978

Merged
adrian-openai merged 9 commits into
mainfrom
dev/adrian/codex/agent-task-state-prewarm
Apr 19, 2026
Merged

Persist and prewarm agent tasks per thread#17978
adrian-openai merged 9 commits into
mainfrom
dev/adrian/codex/agent-task-state-prewarm

Conversation

@adrian-openai

Copy link
Copy Markdown
Contributor

Summary

  • persist registered agent tasks in the session state update stream so the thread can reuse them
  • prewarm task registration once identity registration succeeds, while keeping startup failures best-effort
  • isolate the session-side task lifecycle into a dedicated module so AgentIdentityManager and RegisteredAgentTask do not leak across as many core layers

Testing

  • cargo test -p codex-core startup_agent_task_prewarm
  • cargo test -p codex-core cached_agent_task_for_current_identity_clears_stale_task
  • cargo test -p codex-core record_initial_history_

Comment thread codex-rs/app-server/tests/suite/v2/client_metadata.rs Outdated
Comment thread codex-rs/app-server/tests/suite/v2/client_metadata.rs Outdated
Comment thread codex-rs/protocol/src/protocol.rs
@adrian-openai

Copy link
Copy Markdown
Contributor Author

Stack navigation for this slice:

@adrian-openai adrian-openai force-pushed the dev/adrian/codex/agent-identity-register-task branch from b65f7d4 to dfc5e05 Compare April 16, 2026 18:11
adrian-openai added a commit that referenced this pull request Apr 16, 2026
## Summary

Stack PR3 for feature-gated agent identity support.

This PR adds per-thread agent task registration behind
`features.use_agent_identity`. Tasks are minted on the first real user
turn and cached in thread runtime state for later turns.

## Stack

- PR1: #17385 - add
`features.use_agent_identity`
- PR2: #17386 - register agent
identities when enabled
- PR3: #17387 - this PR, original
task registration slice
- PR3.1: #17978 - persist and
prewarm registered tasks per thread
- PR4: #17980 - use `AgentAssertion`
downstream when enabled

## Validation

Covered as part of the local stack validation pass:

- `just fmt`
- `cargo test -p codex-core --lib agent_identity`
- `cargo test -p codex-core --lib agent_assertion`
- `cargo test -p codex-core --lib websocket_agent_task`
- `cargo test -p codex-api api_bridge`
- `cargo build -p codex-cli --bin codex`

## Notes

The full local app-server E2E path is still being debugged after PR
creation. The current branch stack is directionally ready for review
while that follow-up continues.
Base automatically changed from dev/adrian/codex/agent-identity-register-task to main April 16, 2026 21:30
@adrian-openai adrian-openai force-pushed the dev/adrian/codex/agent-task-state-prewarm branch 2 times, most recently from ea5ec0a to 56f4b38 Compare April 17, 2026 03:35
@adrian-openai adrian-openai force-pushed the dev/adrian/codex/agent-task-state-prewarm branch from 56f4b38 to 4c0010f Compare April 17, 2026 04:01
Comment on lines +27 to +28
// These tests start full app-server processes; keep headroom for concurrent debug startup.
const DEFAULT_READ_TIMEOUT: std::time::Duration = std::time::Duration::from_secs(30);

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we doing this only for debug? Would like to understand better what/if the code change here is causing longer run loop or something.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch! This has been updated.

@nicksteele-oai

Copy link
Copy Markdown

@codex review

@chatgpt-codex-connector

Copy link
Copy Markdown
Contributor

Codex Review: Didn't find any major issues. Chef's kiss.

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

@adrian-openai adrian-openai force-pushed the dev/adrian/codex/agent-task-state-prewarm branch from 3dcdb8d to 67cff31 Compare April 17, 2026 17:20
}

pub(crate) async fn task_matches_current_binding(&self, task: &RegisteredAgentTask) -> bool {
pub(crate) async fn task_matches_current_identity(&self, task: &RegisteredAgentTask) -> bool {

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is going to be a catch 22 situation - probably will be answered in a later PR and I will come back to comment on this. What happens if there is a mismatch on the agent runtime id between the session vs the task? Agent identity has ttl that can expire mid session and be reacquired?

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay I am back - my concern still stands and will leave a PR level comment.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Discussed directly - generally this should not happen in the current implementation.

Comment on lines +22 to +29
fn latest_persisted_agent_task(
rollout_items: &[RolloutItem],
) -> Option<Option<SessionAgentTask>> {
rollout_items.iter().rev().find_map(|item| match item {
RolloutItem::SessionState(update) => Some(update.agent_task.clone()),
_ => None,
})
}

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not 100% sure if we want to actually persist agent task as a part of the rollout? Any downside for this to be in memory and limited the usage to the current session and on future resume we generate a new one or conceptually it is incorrect?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Discussed in person - we want the task to be persisted so that we can keep permissions the same across the entirety of the rollout, and not start from scratch every so often.

RolloutItem::EventMsg(_) | RolloutItem::SessionMeta(_) => {}
RolloutItem::EventMsg(_)
| RolloutItem::SessionMeta(_)
| RolloutItem::SessionState(_) => {}

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ties to if we want to store sessionState at all.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still think we do :)


// record_initial_history can emit events. We record only after the SessionConfiguredEvent is emitted.
sess.record_initial_history(initial_history).await;
sess.start_agent_identity_registration();

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DO we want to await on this?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[codex] We intentionally do not await this at startup. record_initial_history(...) needs to run first so resumed SessionState can restore/clear the persisted task before prewarm can cache a new one; after that, startup registration is best-effort prewarm. The actual user-turn path awaits ensure_agent_task_registered() in session/turn.rs, emits an error for that thread if registration fails, and retries on the next turn rather than blocking session startup or shutting down other threads.

@shijie-oai shijie-oai left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Main question is that do we need task agent stored within the rollout? I do not see the downside (and to be fair in most situation) to recreate the task identity when we resume (i.e. huge window span).

@efrazer-oai

Copy link
Copy Markdown
Contributor

Main question is that do we need task agent stored within the rollout? I do not see the downside (and to be fair in most situation) we would need to recreate the task identity when we resume (i.e. huge window span).

I think I pushed for this originally, but I think you'd have better intuition here @shijie-oai. Core idea was to keep the task lifecycle very close to a session lifecycle (e.g. we can have multiple tasks per rollout now), but i'm g either way.

@shijie-oai shijie-oai left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not seeing a clear failure handling if an agent or task validation is outdated when checking against backend.

If you are agent identity only (for enterprise use case), it is okay to hard fail as we need to strictly enforce defined TTL. But if you are chatgpt auth and we assign agent identity on the fly, I wonder what is the default TTL for that agent identity is and how we should best recover. Maybe that agent identity has the same life cycle as chatgpt refresh token?

@adrian-openai adrian-openai force-pushed the dev/adrian/codex/agent-task-state-prewarm branch from e6f7ce3 to 0000662 Compare April 18, 2026 03:16
@adrian-openai adrian-openai merged commit e5b52a3 into main Apr 19, 2026
35 of 36 checks passed
@adrian-openai adrian-openai deleted the dev/adrian/codex/agent-task-state-prewarm branch April 19, 2026 22:45
@github-actions github-actions Bot locked and limited conversation to collaborators Apr 19, 2026
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants