feat(agent-handoff): inject identity marker on mid-session agent change#4003
feat(agent-handoff): inject identity marker on mid-session agent change#4003PeterPonyu wants to merge 9 commits into
Conversation
There was a problem hiding this comment.
1 issue found across 7 files
Confidence score: 4/5
- This PR is likely safe to merge, with only a mild-to-moderate risk from one concrete issue rather than a functional blocker.
- In
src/plugin/chat-message.ts,recordAgentObservationwrites session data into a module-levelMapthat is not cleared in production cleanup paths, which can cause gradual memory growth over time. - Severity is moderate (4/10) and does not indicate an immediate breakage, but the high confidence (8/10) makes it worth addressing soon to avoid longer-term stability impact.
- Pay close attention to
src/plugin/chat-message.ts- ensure agent-handoff session entries are removed during normal session teardown.
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="src/plugin/chat-message.ts">
<violation number="1" location="src/plugin/chat-message.ts:198">
P2: Unbounded map growth in agent-handoff state: `recordAgentObservation` stores per-session entries in a module-level Map, but `clearAgentObservation` is never called in production code during session cleanup.</violation>
</file>
Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.
| ): Promise<void> => { | ||
| if (input.agent) { | ||
| setSessionAgent(input.sessionID, input.agent) | ||
| const priorAgent = recordAgentObservation(input.sessionID, input.agent) |
There was a problem hiding this comment.
P2: Unbounded map growth in agent-handoff state: recordAgentObservation stores per-session entries in a module-level Map, but clearAgentObservation is never called in production code during session cleanup.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At src/plugin/chat-message.ts, line 198:
<comment>Unbounded map growth in agent-handoff state: `recordAgentObservation` stores per-session entries in a module-level Map, but `clearAgentObservation` is never called in production code during session cleanup.</comment>
<file context>
@@ -194,6 +195,13 @@ export function createChatMessageHandler(args: {
): Promise<void> => {
if (input.agent) {
setSessionAgent(input.sessionID, input.agent)
+ const priorAgent = recordAgentObservation(input.sessionID, input.agent)
+ if (priorAgent) {
+ output.parts.unshift({
</file context>
There was a problem hiding this comment.
1 issue found across 2 files (changes from recent commits).
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="src/plugin/chat-message.test.ts">
<violation number="1" location="src/plugin/chat-message.test.ts:885">
P3: This test does not actually prove the marker is prepended ahead of existing parts; it uses an empty `parts` array, so an append regression would still pass.</violation>
</file>
Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.
There was a problem hiding this comment.
1 issue found across 4 files (changes from recent commits).
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="src/features/agent-handoff/state.ts">
<violation number="1" location="src/features/agent-handoff/state.ts:4">
P2: `clearAgentObservation` is exported but never called outside tests; module-level per-session Maps grow unbounded and lack lifecycle cleanup</violation>
</file>
Tip: Review your code locally with the cubic CLI to iterate faster.
| import { getAgentConfigKey } from "../../shared/agent-display-names" | ||
|
|
||
| const lastObservedAgent = new Map<string, string>() | ||
| const lastObservedMessageID = new Map<string, string>() |
There was a problem hiding this comment.
P2: clearAgentObservation is exported but never called outside tests; module-level per-session Maps grow unbounded and lack lifecycle cleanup
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At src/features/agent-handoff/state.ts, line 4:
<comment>`clearAgentObservation` is exported but never called outside tests; module-level per-session Maps grow unbounded and lack lifecycle cleanup</comment>
<file context>
@@ -1,25 +1,43 @@
import { getAgentConfigKey } from "../../shared/agent-display-names"
const lastObservedAgent = new Map<string, string>()
+const lastObservedMessageID = new Map<string, string>()
/**
</file context>
Tip: Review your code locally with the cubic CLI to iterate faster.
|
@cubic-dev-ai please re-review — both prior P2/P3 findings should now be resolved:
Touched-area suites: 49/49 pass. |
@PeterPonyu I have started the AI code review. It will take a few minutes to complete. |
There was a problem hiding this comment.
No issues found across 8 files
Confidence score: 5/5
- Automated review surfaced no issues in the provided summaries.
- No files require special attention.
Auto-approved: The handoff marker is purely additive with thorough test coverage (41 passing tests), uses messageID dedup to avoid dual-fire bugs, normalizes agent names to prevent false transitions, and cleans up state on session delete, so there is no path for regressions in existing behavior.
|
[sisyphus-bot] PR sweep status check on
Approved but blocked on merge conflicts. Please rebase onto current Assigning |
|
Merged Conflict resolution — both files had small import-section conflicts that were also dup-import accidents from the original PR:
Local verification:
Agent-handoff identity-marker behavior is unchanged; the merge only updated context for the new chat-message structure on dev. |
|
[sisyphus-bot] Hi PeterPonyu. 🙏 Thanks for the agent-handoff identity-marker work. Picking this back up from the 5/16 approval. The PR is medium-scope (422 additions, 8 files) and shows CONFLICTING against current The intent (inject identity marker on mid-session agent change so the new agent's context is unambiguous) is still valid. Could you rebase against current Once it's clean and the composition is clear, I'll come back for a focused review. |
4a75e06 to
94a3482
Compare
|
Rebased onto current The handoff marker is now wrapped via output.parts.unshift({
id: `prt_${randomUUID()}`,
sessionID: input.sessionID,
messageID: input.messageID,
...createInternalAgentTextPart(
renderHandoffMarker({ prior: priorAgent, current: input.agent }),
),
})This keeps the two concerns separate but composed:
So Suites pass locally: |
94a3482 to
52d6951
Compare
…n internal markers Design: the same OMO_INTERNAL_INITIATOR_MARKER token serves two semantically distinct contexts that MUST stay separate. VISIBLE markers (createInternalAgentTextPart, no synthetic flag): PR code-yeongyu#4003 agent-handoff identity markers — the agent MUST read its new identity from the chat stream. If these were synthetic, the LLM would be blind to its own role change and fail to switch identities. HIDDEN markers (createInternalAgentContinuationTextPart, synthetic: true): PR code-yeongyu#4223 internal retry/compaction prompts — invisible to the LLM as user input so they do not pollute the agent's perceived conversation. Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent) Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
eb25d29 to
2bfad49
Compare
|
Rebased onto current upstream/dev and composed with #4223's Handoff marker is now wrapped via
Ready for merge decision. |
|
recheck |
When the user Tab-switches the active agent (e.g. sisyphus →
hephaestus), opencode does update input.agent for the next turn, but
the LLM keeps introducing itself by the prior name because chat history
still contains assistant turns authored under the old persona.
This commit re-anchors the new agent on the very first turn after a
transition by prepending an <identity-handoff> marker as a text part:
<identity-handoff prior="sisyphus" current="hephaestus">
You are now hephaestus.
Prior assistant turns in this conversation were authored by
sisyphus; read them as handoff context, not as your own past output.
Apply the persona, tools, and policies configured for hephaestus
from this turn forward.
</identity-handoff>
Design notes:
- New src/features/agent-handoff/ keeps its own lastObservedAgent map
rather than reusing setSessionAgent / getSessionAgent. The latter is
first-write-wins (used by 7+ callers for "this session's primary
agent" identity) — overloading it would break those.
- Comparison is normalized via getAgentConfigKey, so display variants
like "Hephaestus - Deep Agent" don't trigger false transitions.
- Marker is unshifted to parts[0] so the LLM sees the identity reset
before any user content on this turn.
- Pre-existing /start-work integration test caught a real state-leak
across describes; file-level afterEach now resets the handoff map.
Tests: 41/41 pass across the new feature module + the existing
chat-message integration suite (with the added agent-handoff describe
block covering first-turn / same-agent / transition / display-variant
normalization / consecutive-transitions cases).
bun run typecheck: pass. bun run build: pass.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ction
Live opencode test surfaced a schema error: when the unshift in
chat-message.ts pushed a bare {type:"text", text:"..."} into
output.parts on an agent transition, opencode 1.14.49's
session.prompt.createUserMessage rejected the request with "Missing
key at [id]/[sessionID]/[messageID]" and the message never saved.
The unit tests passed because the test harness builds the same loose
shape, but the real opencode runtime validates against the full
TextPart schema:
{ id: string, sessionID: string, messageID: string, type: "text", text: string }
Fix:
- Marker injection now includes id (prt_<uuid>), sessionID, messageID.
- ChatMessageInput exposes messageID (already in opencode's hook
contract; our narrower type didn't carry it through).
- If messageID is undefined (test mocks, edge cases) the transition is
still RECORDED (so subsequent turns won't re-fire) but the visible
marker is skipped — we can't construct a valid part without it and
we'd rather no-op than crash.
- createMockInput now defaults messageID: "msg_test" so the existing
test suite continues to exercise the injection path.
- New tests verify part shape (id/sessionID/messageID present) and the
messageID-missing skip path.
bun run build: pass. agent-handoff + chat-message suites: 43/43 pass.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Live opencode diagnostic captured what was causing the double-marker
on every turn: opencode fires chat.message twice per user message,
3-5 ms apart, with the same sessionID and messageID but different
input.agent values:
call 1: agent = the user-selected agent (Hephaestus after Tab)
call 2: agent = the session-default agent (always Sisyphus -
Ultraworker on this build), an internal mechanism
Both firings hit recordAgentObservation, so the stale call kept
flipping lastObservedAgent and every subsequent turn looked like a
transition in both directions.
Fix: dedup by messageID. The first call per messageID is canonical
and records normally; subsequent calls with the same messageID are
no-ops (no observation update, no marker). If messageID is undefined
(test mocks, edge cases), every call is treated as a fresh turn and
dedup is bypassed.
- recordAgentObservation now takes an optional messageID arg and
tracks lastObservedMessageID per session.
- clearAgentObservation and _resetAllForTests both clear the new map.
- chat-message.ts passes input.messageID through.
- createMockInput now generates a fresh messageID per call by default
so existing tests model distinct turns; tests that need to simulate
the dual-fire pattern can pass the same messageID explicitly.
- New unit tests for state.ts (5) and one end-to-end integration test
in chat-message.test.ts exercise the dual-fire scenario directly.
bun run build: pass. agent-handoff + chat-message suites: 49/49 pass.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ngthen prepend test
Two cubic-dev-ai findings. Identified by cubic.
1. Unbounded map growth in agent-handoff state. recordAgentObservation
stores per-session entries in module-level Maps (lastObservedAgent,
lastObservedMessageID), but clearAgentObservation was only called
from tests. In a long-running opencode process that creates and
discards many sessions, those maps would grow without bound.
Fix: call clearAgentObservation(sessionID) from the session.deleted
handler in src/plugin/event.ts, alongside the existing per-session
cleanup helpers (clearSessionAgent, clearSessionModel,
clearSessionPromptParams, etc.).
2. The "marker injected as first part" integration test used an empty
parts array, so an append regression (output.parts.push instead of
unshift) would still satisfy parts[0] === marker.
Fix: seed the output with one existing text part ("hi, this is my
message") and assert both that the marker landed at index 0 AND
that the original user part is preserved at index 1. An append
regression would now fail both halves.
bun run build: pass. agent-handoff + chat-message suites: 49/49 pass.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…n internal markers Design: the same OMO_INTERNAL_INITIATOR_MARKER token serves two semantically distinct contexts that MUST stay separate. VISIBLE markers (createInternalAgentTextPart, no synthetic flag): PR code-yeongyu#4003 agent-handoff identity markers — the agent MUST read its new identity from the chat stream. If these were synthetic, the LLM would be blind to its own role change and fail to switch identities. HIDDEN markers (createInternalAgentContinuationTextPart, synthetic: true): PR code-yeongyu#4223 internal retry/compaction prompts — invisible to the LLM as user input so they do not pollute the agent's perceived conversation. Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent) Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
…nt markup corruption
38dfb6b to
637be75
Compare
|
Closing as too-large feature work with no surgical version. This PR is a 480-line / 10-file net-new feature (agent identity marker on mid-session agent change). It doesn't fix a specific filed issue. Will re-propose if maintainer signals interest. The XML-escape security follow-up (originally a75f064 in #4310) is a tiny standalone fix that depends on this feature's foundation existing first — also deferred. |
Summary
When a user Tab-switches the active agent mid-session (e.g. sisyphus → hephaestus), opencode does update
input.agentfor the next turn, but the LLM keeps introducing itself by the prior name because chat history still contains assistant turns authored under the old persona. This PR re-anchors the new agent on the first turn after a transition by injecting an<identity-handoff>marker as the leading text part.What's new
src/features/agent-handoff/module with:marker.ts— pure renderer of the handoff textstate.ts— per-sessionlastObservedAgentmap;recordAgentObservation(sessionID, currentAgent)returns the prior agent on a real transition, undefined otherwisechat-message.tscallsrecordAgentObservationright after the existingsetSessionAgentcapture. On a real transition, the marker is unshifted toparts[0]so the LLM sees the identity reset before any user content this turn.Example marker rendered into the user message:
Design notes
setSessionAgent/getSessionAgent. Those are first-write-wins and 7+ callers (delegate-task, call-omo-agent, background-task, prometheus-md-only, no-sisyphus-gpt, …) depend on that as "the session's primary agent identity". Overloading them would silently change behavior in those paths. Handoff detection needs last-observed semantics, which is a different concept — kept in its own map.getAgentConfigKey. Display variants like\"Hephaestus - Deep Agent\"resolve to the same key as\"hephaestus\", so renames or UI display variations don't trip false transitions.Tests
src/features/agent-handoff/cover marker rendering and the state machine (first turn, same-agent, transitions, display-variant normalization, session scoping, clear).src/plugin/chat-message.test.tsexercise the wiring through the real handler — first turn doesn't fire, same agent doesn't fire, sisyphus→hephaestus fires correctly, display variants don't trip it, two consecutive changes emit two separate markers./start-workintegration test caught a real state-leak across describes; the file-levelafterEachnow resets the handoff map.Totals: 41/41 pass across touched suites.
bun run typecheck: pass.bun run build: pass.Out of scope
/switch_agentslash command or an agent-self-switch tool — the SDK doesn't expose a way for a plugin tool to swap the active agent of an in-flight session; that would need an upstream opencode change addingclient.session.setAgent(...). The existingcall_omo_agentanddelegate-tasktools already cover the cross-agent delegation use case via subagents.Test plan
🤖 Generated with Claude Code
Summary by cubic
Injects an
<identity-handoff>marker on mid-session agent switches, prepending it to the next user turn so the model resets identity. Handles opencode’s dual-fire and schema rules to avoid duplicate or invalid parts.New Features
src/features/agent-handoff/withrenderHandoffMarkerand per-session tracking viarecordAgentObservation/clearAgentObservation.chat-message.tsdetects real transitions afterupdateSessionAgentand unshiftscreateInternalAgentTextPart(renderHandoffMarker(...))intooutput.parts[0].getAgentConfigKeyso display variants don’t trigger false switches.Bug Fixes
id/sessionID/messageID; ifmessageIDis missing, skip injection.messageIDto ignore the stale second call in the dual-fire pattern.session.deletedto prevent leaks.bun.lockto useoh-my-opencode-*4.4.0 optional binaries for CI.Written for commit 637be75. Summary will update on new commits. Review in cubic