[codex] Speed up external agent session imports#26637
Conversation
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 583174aca7
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| let metadata = ThreadMetadataPatch { | ||
| title, | ||
| preview: first_user_message.clone(), | ||
| model_provider: Some(model_provider), | ||
| created_at: Some(now), | ||
| updated_at: Some(now), | ||
| source: Some(source.clone()), | ||
| thread_source: Some(None), | ||
| agent_nickname: Some(source.get_nickname()), | ||
| agent_role: Some(source.get_agent_role()), | ||
| agent_path: Some(source.get_agent_path().map(Into::into)), | ||
| cwd: Some(cwd), | ||
| cli_version: Some(env!("CARGO_PKG_VERSION").to_string()), | ||
| first_user_message, | ||
| memory_mode: Some(memory_mode), | ||
| ..Default::default() |
There was a problem hiding this comment.
Preserve git metadata for imported sessions
When the imported session's cwd is inside a Git repository, the old start_thread_with_options path went through ThreadMetadataSync::for_create, which collects and persists the repo Git info. This direct ThreadStore path builds the metadata patch manually but never sets git_info, so newly imported threads lose their gitInfo in thread/list/thread/read even though normal imported rollouts used to expose it.
Useful? React with 👍 / 👎.
| let metadata = ThreadMetadataPatch { | ||
| title, | ||
| preview: first_user_message.clone(), |
There was a problem hiding this comment.
Keep imported titles in rollout-backed reads
When importing a session with a custom/AI title on installations where the SQLite state DB is unavailable or later rebuilt from rollouts, setting only the observed title updates SQLite but skips the rollout name index. The old path used the explicit name update, which wrote the legacy name, so these imported threads lose their title in rollout-backed reads/lists.
Useful? React with 👍 / 👎.
| } | ||
| } | ||
| } | ||
| if let Err(err) = record_completed_session_imports(&self.codex_home, completed_imports) { |
There was a problem hiding this comment.
Record completed imports before finishing the batch
For large imports, completed sessions stay only in memory until the whole stream finishes; if the app exits or this background task is cancelled after some persist_session calls succeed but before this ledger write, those source versions are not marked imported. The next detection/import will treat them as new and create duplicate imported threads, whereas the previous path recorded each completed session immediately.
Useful? React with 👍 / 👎.
|
|
||
| use crate::config_manager::ConfigManager; | ||
|
|
||
| const SESSION_IMPORT_CONCURRENCY: usize = 5; |
There was a problem hiding this comment.
AGENTS.md Change size guidance caps non-mechanical changes at 800 changed lines; this commit changes about 1,171 lines across import, ledger, parsing, and app-server behavior. Please split a coherent stage, such as parser/ledger batching separately from direct ThreadStore import.
Useful? React with 👍 / 👎.
| .expect("record completed imports"); | ||
|
|
||
| let ledger = super::load_import_ledger(&codex_home).expect("ledger"); | ||
| assert_eq!(ledger.records.len(), 1); |
There was a problem hiding this comment.
Use pretty assertions in the new test module
AGENTS.md Test assertions asks Rust tests to import and use pretty_assertions::assert_eq for clearer diffs. This new test module uses the standard macro throughout, so failures on ledger records will be harder to read than the repo convention expects.
Useful? React with 👍 / 👎.
Why
Importing large external-agent session histories currently starts a full live Codex thread for every imported session. This initializes unrelated runtime systems and repeats expensive transcript, metadata, hashing, and ledger work.
On a 50-session, 238 MiB fixture, the existing path took roughly 70 seconds to complete the import and 77 seconds end to end.
What changed
ThreadStoreinstead of starting full live threads.codex-coreand avoid changes to the publicThreadStoretrait.Performance
For the same 50-session, 238 MiB fixture:
All 50 sessions imported successfully with no warnings or contention signals.
Validation
just test -p codex-external-agent-sessionsjust test -p codex-app-server external_agent_config_import