
Add <steer> metadata so model knows what they are #1

Open
snickell wants to merge 4 commits into main from seth/steer-metadata

@snickell snickell commented Apr 6, 2026

Mark steers with <steer>

Giving the model the same vocabulary the user has improves communication and makes contextually appropriate responses more likely. A model that lacks contextual information cannot apply its intelligence to the situation.

Codex currently sees "steers" and ordinary user messages as the same thing in the model-visible transcript, with no way to differentiate.

When it receives a steer today, nothing in the context tells it that it should address the steer and KEEP WORKING. As a result, it frequently addresses the steer (replies to the user, changes course slightly) and then ends its turn. This is frustrating if it's something you want it to keep working on for, say, the next hour.

As a user, this makes steers feel "risky", because N% of the time they interrupt the current work by accident. You end up mentally weighing if the risk of it stopping work is worth the nit.

This PR fixes openai#11062.

This PR

  1. prepends a <steer></steer> control message fragment prior to sending the user's message
  2. adds a basic test: steer_input_queues_model_visible_marker_before_user_message
  3. adds a brief prompt instruction on how to interpret a control message:
    When a <steer> block appears, treat the next user message as steering for your current task. Handle it, then continue your current task. Do not end your turn unless explicitly requested.
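
The ordering in step 1 can be sketched as follows. This is a minimal illustration, not the code in this PR: `TranscriptItem`, `STEER_MARKER`, `queue_steer`, and `queue_user_message` are hypothetical names chosen for the example. The point is only that the marker is queued as its own item directly before the user's text, and that the user's text itself is left unmodified.

```rust
/// Hypothetical transcript item for illustration; not the real Codex types.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum TranscriptItem {
    /// Control fragment that the model sees but the user did not type.
    Marker(String),
    /// Text exactly as the user typed it.
    User(String),
}

/// The bare marker described in this PR (constant name is an assumption).
pub const STEER_MARKER: &str = "<steer></steer>";

/// Queue a mid-turn steer: the marker first, then the untouched user text.
pub fn queue_steer(pending: &mut Vec<TranscriptItem>, text: &str) {
    pending.push(TranscriptItem::Marker(STEER_MARKER.to_string()));
    pending.push(TranscriptItem::User(text.to_string()));
}

/// Queue an ordinary prompt: no marker, so the model can tell the two apart.
pub fn queue_user_message(pending: &mut Vec<TranscriptItem>, text: &str) {
    pending.push(TranscriptItem::User(text.to_string()));
}
```

Calling `queue_steer(&mut pending, "use tabs")` on an empty queue would leave two items in it, the marker followed by the user message, while `queue_user_message` would leave only one.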

Other approaches tried as workarounds

I've tried addressing this via AGENTS.md instructions, but the fundamental issue is that Codex can't tell the difference between a steer and a regular prompt (where it should return control to the dialog after a response).

That has had some success, but it increases the likelihood that a non-steer will launch Codex into a long-running task. Fundamentally, Codex has no real way of guessing which is which (a human fed the same info would struggle too).

snickell added 2 commits April 5, 2026 23:55
Steered messages currently look identical to ordinary user messages in the model-visible transcript. That leaves the model guessing whether a mid-turn input should hand control back to the user or continue the active task, which is exactly the failure described in openai#11062.

Emit a model-visible <steer> fragment immediately before steered input, exclude that marker from memory generation, and add a terse base-prompt rule that treats the following user message as same-task steering. Keep the actual user text as a normal user message so the metadata stays separate from user content.

Add coverage for marker detection, pending-input ordering, and mailbox delivery while steered input is in flight.
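
As a rough sketch of the marker detection and memory exclusion described in this commit, assuming a simplified transcript of text items: `Item`, `is_steer_marker`, and `memory_inputs` are hypothetical names for illustration and are not taken from the Codex codebase. The sketch treats an item as a marker only when its whole text is the marker, so user text that merely mentions `<steer></steer>` is kept.

```rust
/// The bare marker (constant name is an assumption for this example).
pub const STEER_MARKER: &str = "<steer></steer>";

/// Hypothetical, simplified transcript item holding only model-visible text.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Item {
    pub text: String,
}

/// True only when the entire item, ignoring surrounding whitespace,
/// is the steer marker. Text that merely contains the marker is not a match.
pub fn is_steer_marker(text: &str) -> bool {
    text.trim() == STEER_MARKER
}

/// Items that would be passed on to memory generation: everything
/// except steer markers, in the original order.
pub fn memory_inputs(transcript: &[Item]) -> Vec<&Item> {
    transcript
        .iter()
        .filter(|item| !is_steer_marker(&item.text))
        .collect()
}
```

Filtering at the point where memory inputs are collected keeps the marker in the model-visible transcript while stopping control metadata from being stored as if the user had written it.
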
Use a bare <steer></steer> marker instead of a nested scope payload.

The steer fragment only needs to tell the model that the following user message is same-task steering. Keeping it empty makes the marker more direct, keeps it in-family with the existing fragment system, and avoids pretending there is a richer steer schema than the implementation actually uses.
@snickell snickell changed the title Add <steer> metadata so Codex knows what they are Add <steer> metadata so model knows what they are Apr 6, 2026
@snickell snickell marked this pull request as ready for review April 6, 2026 10:20
snickell added 2 commits April 6, 2026 00:24
Keep the steer follow-up diff tight.

Use a local import so the pending-input call stays on one simple line after rustfmt, and restore the nearby memory-exclusion comment to track upstream wording more closely instead of carrying avoidable prose churn.
Move the steer rule under the task-execution criteria list and tighten the wording to the final form.

The new text is shorter, more direct, and matches the intended behavior: handle the steer, continue the current task, and do not end the turn unless explicitly requested.

Development

Successfully merging this pull request may close these issues.

Steering broken - codex stops ongoing work and steers off completely
