Agent jobs (spawn_agents_on_csv) + progress UI#10935

Merged
daveaitel-openai merged 30 commits into main from feat/swarmmode-squash
Feb 24, 2026

Conversation

@daveaitel-openai
Contributor

Summary

  • Add agent job support: spawn a batch of sub-agents from CSV, auto-run, auto-export, and store results in SQLite.
  • Simplify workflow: remove run/resume/get-status/export tools; spawn is deterministic and completes in one call.
  • Improve exec UX: stable, single-line progress bar with ETA; suppress sub-agent chatter in exec.

Why

Enables map-reduce style workflows over arbitrarily large repos using the existing Codex orchestrator. This addresses review feedback about overly complex job controls and non-deterministic monitoring.

Demo (progress bar)

./codex-rs/target/debug/codex exec \
  --enable collab \
  --enable sqlite \
  --full-auto \
  --progress-cursor \
  -c agents.max_threads=16 \
  -C /Users/daveaitel/code/codex \
  - <<'PROMPT'
Create /tmp/agent_job_progress_demo.csv with columns: path,area and 30 rows:
path = item-01..item-30, area = test.

Then call spawn_agents_on_csv with:
- csv_path: /tmp/agent_job_progress_demo.csv
- instruction: "Run `python - <<'PY'` to sleep a random 0.3–1.2s, then output JSON with keys: path, score (int). Set score = 1."
- output_csv_path: /tmp/agent_job_progress_demo_out.csv
PROMPT

Review feedback addressed

  • Auto-start jobs on spawn; removed run/resume/status/export tools.
  • Auto-export on success.
  • More descriptive tool spec + clearer prompts.
  • Avoid deadlocks on spawn failure; pending/running handled safely.
  • Progress bar no longer scrolls; stable single-line redraw.
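
The stable single-line redraw can be sketched roughly like this (a hypothetical illustration, not the PR's actual code; function names are assumptions):

```rust
// Hypothetical sketch of stable single-line progress: render the whole line
// as a string, then redraw in place with '\r' instead of printing fresh
// lines that would scroll the terminal.
fn render_progress(done: usize, total: usize, eta_secs: u64) -> String {
    let total = total.max(1); // avoid division by zero on an empty job
    let pct = done * 100 / total;
    format!("[{pct:3}%] {done}/{total} items, ETA {eta_secs}s")
}

fn draw_progress(done: usize, total: usize, eta_secs: u64) {
    // '\r' returns the cursor to column 0 so the next draw overwrites this one.
    eprint!("\r{}", render_progress(done, total, eta_secs));
}
```

Keeping rendering separate from I/O also makes the line format easy to test.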

Tests

  • cd codex-rs && cargo test -p codex-exec
  • cd codex-rs && cargo build -p codex-cli

@github-actions
Contributor

github-actions bot commented Feb 6, 2026

All contributors have signed the CLA ✍️ ✅
Posted by the CLA Assistant Lite bot.

Collaborator

@jif-oai jif-oai left a comment


A few global points:

  • We need more integration tests
  • We don't have a mechanism to recover a sub-agent that crashes. In those cases, the job will stay as "Running" forever I think
  • Implementation is way cleaner than before. Thanks for this
  • Is it on purpose that this is only supported by codex exec for now? I think this might also be relevant for the app-server but it can come in a follow-up

csv_path: String,
instruction: String,
id_column: Option<String>,
job_name: Option<String>,
Collaborator


This does not seem to be used anywhere... so I'm not sure this is interesting

}

fn normalize_concurrency(requested: Option<usize>, max_threads: Option<usize>) -> usize {
let requested = requested.unwrap_or(64).max(1);
Collaborator


  • Can we make the 64 a const somewhere?
  • In a follow-up we should make this configurable
  • 64 is gigantic. Any normal user would instantaneously get rate limited
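
The suggested change might look something like this (a sketch with assumed names; the PR's later commits lowered the default to 16 and made it configurable):

```rust
// Sketch: hoist the magic number into a named const and cap the default low
// enough that an ordinary user doesn't get rate limited out of the box.
const DEFAULT_AGENT_JOB_CONCURRENCY: usize = 16;

fn normalize_concurrency(requested: Option<usize>, max_threads: Option<usize>) -> usize {
    let requested = requested.unwrap_or(DEFAULT_AGENT_JOB_CONCURRENCY).max(1);
    match max_threads {
        // Clamp to the configured thread ceiling, never below 1.
        Some(max) => requested.min(max.max(1)),
        None => requested,
    }
}
```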

ToolSpec::Function(ResponsesApiTool {
name: "report_agent_job_result".to_string(),
description:
"Worker-only tool to report a result for an agent job item. Main agents should not call this."
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This description is pretty strange from an agent point of view. If the main agent should not call this, we should just not give access to the tool to the main agent. Should be quite easy to do
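
The fix the reviewer is describing amounts to gating registration on the session role rather than the tool description; a minimal sketch (types and names are illustrative, not the PR's code):

```rust
// Sketch: worker-only tools are simply never registered for the main agent,
// instead of relying on the description to discourage calls.
enum SessionRole {
    Main,
    Worker,
}

fn visible_tools(role: &SessionRole) -> Vec<&'static str> {
    let mut tools = vec!["spawn_agents_on_csv"];
    if matches!(role, SessionRole::Worker) {
        // The main agent never sees this tool at all.
        tools.push("report_agent_job_result");
    }
    tools
}
```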

ToolSpec::Function(ResponsesApiTool {
name: "report_agent_job_result".to_string(),
description:
"Worker-only tool to report a result for an agent job item. Main agents should not call this."
Collaborator


What if the worker does not call this tool? We could just use a structured output if we want something more deterministic

builder.register_handler("close_agent", collab_handler);
}

if config.collab_tools {
Collaborator


I just enabled collab globally so feel free to create a new feature flag if you want

@@ -0,0 +1,2 @@
ALTER TABLE agent_jobs
Collaborator


No need for a dedicated migration... this is not merged yet so just put everything in the same migration
Having tons of migrations just makes things harder to follow IMO

.unwrap_or_else(|| format!("row-{row_index}"));
if !seen_ids.insert(item_id.clone()) {
item_id = format!("{item_id}-{row_index}");
seen_ids.insert(item_id.clone());
Collaborator


this can still violate uniqueness if you have duplicate item_id (unlikely but possible)
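
A collision-proof variant keeps retrying until the candidate is genuinely new, rather than assuming one suffix is enough; a sketch under assumed names:

```rust
use std::collections::HashSet;

// Sketch: loop until insertion actually succeeds, so even "{id}-{row_index}"
// colliding with an existing id cannot violate uniqueness.
fn unique_item_id(seen: &mut HashSet<String>, base: &str, row_index: usize) -> String {
    let mut candidate = if base.is_empty() {
        format!("row-{row_index}")
    } else {
        base.to_string()
    };
    let mut suffix = row_index;
    while !seen.insert(candidate.clone()) {
        // Keep extending until the id is unused.
        candidate = format!("{candidate}-{suffix}");
        suffix += 1;
    }
    candidate
}
```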

let row_object = headers
.iter()
.zip(row.iter())
.map(|(header, value)| (header.clone(), Value::String(value.clone())))
Collaborator


are we 100% sure header is unique? I'm not sure the validation will never leak. A few tests would be nice here
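
The validation being asked for is an up-front duplicate-header check, so the later header-to-value zip can't silently drop a column; a minimal sketch (names assumed):

```rust
use std::collections::HashSet;

// Sketch: reject the CSV before building row objects if two columns share a
// header, since a map keyed by header would keep only one of them.
fn validate_headers(headers: &[String]) -> Result<(), String> {
    let mut seen = HashSet::new();
    for header in headers {
        if !seen.insert(header.as_str()) {
            return Err(format!("duplicate CSV header: {header}"));
        }
    }
    Ok(())
}
```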

job_id = ?
AND item_id = ?
AND status = ?
AND (assigned_thread_id IS NULL OR assigned_thread_id = ?)
Collaborator


Why this? It sounds racy: if report_agent_job_item_result accepts reports while assigned_thread_id is still NULL, any caller can claim an item before the worker thread is set. That can misattribute results in the Running -> set_thread window.

@etraut-openai etraut-openai added the oai PRs contributed by OpenAI employees label Feb 9, 2026
@daveaitel-openai
Contributor Author

Thanks for the thoughtful review, really appreciated. I went through each point and addressed them as follows:

  1. “We need more integration tests.”
    Agreed. I added integration coverage for the agent job flow (spawn, report, export).
    Commit: 904d050
  2. “Job can get stuck running forever if a worker fails.”
    Fixed. We now mark spawn failures as failed immediately (so they don’t linger in Running), and we added a stale‑running reaper so jobs can’t hang indefinitely. The final job result includes a failure summary so it’s obvious which rows failed and why.
    Commits: 79e19fe, b457217
    Follow‑up improvement: the timeout is now configurable per job (max_runtime_seconds) and via config (agents.job_max_runtime_seconds).
    Commits: 8eef34d, 8177ffa
  3. “Exec‑only for now? app‑server later?”
    That’s intentional for this PR. I left app‑server support as a follow‑up to keep scope contained.
  4. “Spawn args struct seems unused.”
    It is used. SpawnAgentsOnCsvArgs is the deserialized input for spawn_agents_on_csv in codex-rs/core/src/tools/handlers/agent_jobs.rs. The handler reads csv_path, instruction, id_column, job_name, output_schema, max_runtime_seconds, etc. These fields directly drive CSV parsing, job metadata, output schema storage, and runtime limits. No code change needed here.
  5. “Make 64 a const; make it configurable; 64 is too high.”
    Done. Extracted a constant, lowered default to 16, and made it configurable.
    Commits: 79e19fe, 96a645e
  6. “report_agent_job_result shouldn’t be visible to main agent.”
    Done. The tool is now worker‑only (gated by session source), so it’s not exposed to the main agent.
    Commit: 9442a57
  7. “What if the worker doesn’t call this tool? Maybe structured output?”
    We decided not to add a fallback. Instead, we document that a missing report is treated as a failure.
    Commit: 7bac3af
  8. “Collab tools enabled globally.”
    No change needed. Existing gating handles the worker‑only tool exposure.
  9. “Rename AgentJobsHandler → BatchJob.”
    Done. Renamed to BatchJobHandler.
    Commit: 8c58fb3
  10. “Progress emitter init time is wrong.”
    Fixed emit timing so “last emit” and “start” aren’t tied together.
    Commit: 79e19fe
  11. “spawn_agents_on_csv needs docs.”
    Added doc comment and improved tool description/semantics.
    Commit: 8c58fb3
  12. “Missing state DB should be fatal.”
    Done. It’s now a fatal error rather than RespondToModel.
    Commit: 984a369
  13. “Migrations: fold 0010 into 0009.”
    Done. Auto‑export folded into the base migration.
    Commit: 174861c
  14. “Duplicate item_id / header uniqueness.”
    Added unique header validation + robust dedupe for item IDs.
    Commit: 79e19fe
  15. “Thread assignment race in report.”
    Fixed the SQL race around assigned_thread_id.
    Commit: 4d2f379
  16. “Tool descriptions unclear.”
    Updated descriptions and clarified reporting semantics.
    Commits: fae51af, 7bac3af

Additional fixes since the review (from local failures while testing)

  • SQLite state DB default for subagents
    Added agents.sqlite_home + CODEX_SQLITE_HOME and default to a temp dir when in workspace‑write mode (otherwise CODEX_HOME). State DB now uses this, so subagents don’t fail with “sqlite state db unavailable” in full‑auto runs.
    Commit: ceac6a0
  • Fix agent_jobs insert error
    The agent_jobs INSERT now has the correct number of values (was 14 for 15 columns).
    Commit: ceac6a0
  • spawn_agents_on_csv export guard
    Ensures output CSV is exported even if the tool doesn’t return an output path.
    Commit: ceac6a0

Collaborator

@jif-oai jif-oai left a comment


Thanks for the prompt work on the previous review. We are close and this is going in a way better direction IMO

pub(crate) const DEFAULT_AGENT_JOB_MAX_RUNTIME_SECONDS: Option<u64> = None;

pub const CONFIG_TOML_FILE: &str = "config.toml";
const SQLITE_HOME_ENV: &str = "CODEX_SQLITE_HOME";
Collaborator


I think this part should live in the codex-state crate but feel free to challenge. No very strong opinion tbh

csv_path: String,
instruction: String,
id_column: Option<String>,
job_name: Option<String>,
Collaborator


My point here was not the whole struct but just job_name. I'm not sure it's relevant to keep if it isn't surfaced anywhere. Would be cool later to be able to resume from it, though.

input_csv_path.with_file_name(format!("{stem}.agent-job-{job_suffix}.csv"))
}

fn parse_csv(content: &str) -> Result<(Vec<String>, Vec<Vec<String>>), String> {
Collaborator


This is pretty cool but I would either:

  • Check if there are any existing crates for this if we don't want to re-invent the wheel
  • If not or nothing that suits our need, I would extract this in a small crate in codex-utils-... as I'm quite sure others will need it one day (+ you can put this in a small self-contained PR)
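
The reason CSV deserves a crate is that even a single record needs quote handling; a stdlib-only sketch of one record's worth of RFC 4180-style parsing (an illustration, not the PR's parse_csv):

```rust
// Sketch: split one CSV record into fields, honoring quoted fields that
// contain commas and "" as an escaped quote. A real crate (e.g. `csv`)
// also handles embedded newlines, BOMs, and streaming.
fn parse_record(line: &str) -> Vec<String> {
    let mut fields = Vec::new();
    let mut field = String::new();
    let mut in_quotes = false;
    let mut chars = line.chars().peekable();
    while let Some(c) = chars.next() {
        match c {
            '"' if in_quotes && chars.peek() == Some(&'"') => {
                chars.next(); // "" inside quotes is an escaped quote
                field.push('"');
            }
            '"' => in_quotes = !in_quotes,
            ',' if !in_quotes => fields.push(std::mem::take(&mut field)),
            _ => field.push(c),
        }
    }
    fields.push(field);
    fields
}
```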

FunctionCallError::RespondToModel(format!("failed to create agent job: {err}"))
})?;

db.mark_agent_job_running(job_id.as_str())
Collaborator


You first mark it as running but then later you have

let options = build_runner_options(&session, &turn, requested_concurrency).await?;

so this means that if build_runner_options fails, the job stays running forever

pub row_json: Value,
}

#[derive(Debug)]
Collaborator


You can just derive FromRow as well (from SQLX) so that you don't need the try_from_row impl

}
}

#[derive(Debug)]
Collaborator


Same comment for the FromRow (and same comment everywhere actually)

pub(crate) instruction: String,
pub(crate) auto_export: i64,
pub(crate) max_runtime_seconds: Option<i64>,
pub(crate) output_schema_json: Option<String>,
Collaborator


This might be a follow-up, but this output_schema_json bugs me a bit. We should just turn it into a JSON Schema (https://json-schema.org/) and use structured output. That will do constrained sampling and ensure the schema is always respected. Ok for me not to do it in this PR, but please add it to a backlog somewhere (you can assign it to me if you don't want to do it).
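
For the demo's {path, score} rows, the structured-output constraint the reviewer is suggesting could look like this hypothetical JSON Schema (an illustration only):

```json
{
  "type": "object",
  "properties": {
    "path": { "type": "string" },
    "score": { "type": "integer" }
  },
  "required": ["path", "score"],
  "additionalProperties": false
}
```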

.await
{
Ok(thread_id) => thread_id,
Err(CodexErr::AgentLimitReached { .. }) => {
Collaborator


This means we're going to loop while waiting for an agent spot to become available, right? That can take a lot of time if the sub-agents are handled somewhere else. I don't have a much better solution though.

.list_agent_job_items(job_id, Some(codex_state::AgentJobItemStatus::Running), None)
.await?;
for item in running_items {
if is_item_stale(&item, runtime_timeout) {
Collaborator


Would be good to also try to kill the job just in case, to make sure we don't increase contention on the max number of running agents.
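
The staleness test itself is simple; a sketch with assumed field names (timestamps in seconds, budget optional):

```rust
// Sketch: an item still marked Running past the job's runtime budget is
// considered stale and should be reaped (and its worker killed, per the
// comment above, so it stops holding an agent slot).
fn is_item_stale(started_at_secs: u64, now_secs: u64, timeout_secs: Option<u64>) -> bool {
    match timeout_secs {
        Some(timeout) => now_secs.saturating_sub(started_at_secs) > timeout,
        None => false, // no runtime budget configured: never reaped
    }
}
```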


if config.agent_jobs_tools {
let agent_jobs_handler = Arc::new(BatchJobHandler);
builder.push_spec(create_spawn_agents_on_csv_tool());
Collaborator


OOC do you expect recursive jobs? Otherwise we could drop these for depth > 0 (the opposite of spawn_agents_on_csv)

Collaborator

@jif-oai jif-oai left a comment


Ok for me with the following comments:

  1. Enjoy the merge with main. Ping me if you want a sanity check after
  2. As follow-ups make sure to keep track of:
    a. Use of a crate for CSV handling or extract in a dedicated crate
    b. Discuss with the TUI and App team to see how can we render this feature
    c. Add the documentation somewhere here https://github.com/openai/developers-website. You can ask @dkundel-openai for help
    d. try to use structured output for the enforcement of the schema
    e. find a solution to limit the looping of AgentLimitReached
    f. as this contains a DB migration, make sure an alpha gets cut and that alpha is used by VSCE and the app

When view_image returns an input_image, also enqueue it as a user message so nested tool calls (like js_repl) make the image available to the next model request. Log a warning if no active turn is present.
If cargo_bin(codex) fails, derive a nearby codex path from current_exe and use it for codex_linux_sandbox_exe. This keeps sandboxed test helpers working across build layouts.
@daveaitel-openai daveaitel-openai enabled auto-merge (squash) February 24, 2026 19:27
Rewrite the fallback for locating the codex binary to satisfy clippy::collapsible_if while preserving the existing behavior.
@daveaitel-openai
Contributor Author

I have read the CLA Document and I hereby sign the CLA

github-actions bot added a commit that referenced this pull request Feb 24, 2026
Thread tool call source through ToolInvocation so view_image only injects pending image input for js_repl calls. Update router/tests/handlers to carry the new field.
Replace eprint! with eprintln! for newline output and collapse the columns guard to satisfy clippy::print_with_newline and clippy::collapsible_if.
Wrap agent job progress stats in a struct and replace the newline-only eprint with eprintln to satisfy clippy::too_many_arguments and clippy::print_with_newline.
@daveaitel-openai
Contributor Author

/merge

@daveaitel-openai daveaitel-openai merged commit dcab401 into main Feb 24, 2026
57 of 61 checks passed
@daveaitel-openai daveaitel-openai deleted the feat/swarmmode-squash branch February 24, 2026 21:00
@github-actions github-actions bot locked and limited conversation to collaborators Feb 24, 2026

Labels

oai PRs contributed by OpenAI employees


3 participants