fix(gateway): persist media metadata in agent.request transcripts#86936

Open
peterdsp wants to merge 3 commits into
openclaw:mainfrom
peterdsp:fix/agent-request-transcript-media-persistence

peterdsp commented May 26, 2026

Summary

Images shared via the iOS Share Extension reach the gateway through agent.request node events. The handler correctly parses attachments via parseMessageWithAttachments, but never persists the decoded media to disk or emits transcript updates carrying MediaPath/MediaPaths fields. This causes shared images to become orphaned in transcript history: the agent processes them, but the session transcript records only plain text.

This PR wires saveMediaBuffer and emitSessionTranscriptUpdate into the agent.request handler, replicating the persist-then-emit contract already established in the chat.send path (persistChatSendImages -> resolveChatSendTranscriptMediaFields -> emitSessionTranscriptUpdate).

Changes

  • server-node-events.runtime.ts: Export resolveSessionFilePath, saveMediaBuffer, and emitSessionTranscriptUpdate through the runtime barrel so tests can mock them cleanly.
  • server-node-events.ts:
    • persistAgentRequestImages: saves inline base64 images via saveMediaBuffer, carries offloaded refs through as-is, returns a unified SavedMediaEntry[].
    • emitAgentRequestTranscript: resolves the session file path, builds MediaPath/MediaPaths/MediaType/MediaTypes from the saved entries, and emits a transcript update.
    • Both helpers are called after the receipt ack and around agentCommandFromIngress dispatch.
    • Hoists offloadedRefs out of the try block so cleanup logic and the new persistence path share the same binding.
  • server-node-events.test.ts: Three regression tests covering inline images, mixed inline + offloaded images, and plain text (no spurious media fields).
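
As a rough illustration of the field-building step described above, the sketch below derives MediaPath/MediaPaths/MediaType/MediaTypes from a list of saved entries. The SavedMediaEntry shape matches the type quoted in this PR; the helper name buildTranscriptMediaFields and the TranscriptMediaFields type are hypothetical and are not the code in server-node-events.ts.

```typescript
// Shape quoted in this PR for entries returned by the persistence step.
type SavedMediaEntry = { id: string; path: string; size: number; contentType: string };

// Hypothetical name for the optional media fields carried by a transcript message.
type TranscriptMediaFields = {
  MediaPath?: string;
  MediaPaths?: string[];
  MediaType?: string;
  MediaTypes?: string[];
};

// Hypothetical helper: the singular fields mirror the first entry, the plural
// fields list every entry in order. No entries means no media fields at all,
// so plain-text requests do not gain spurious keys.
function buildTranscriptMediaFields(saved: SavedMediaEntry[]): TranscriptMediaFields {
  if (saved.length === 0) {
    return {};
  }
  const paths = saved.map((entry) => entry.path);
  const types = saved.map((entry) => entry.contentType);
  return {
    MediaPath: paths[0],
    MediaPaths: paths,
    MediaType: types[0],
    MediaTypes: types,
  };
}
```

Spreading the result into the emitted message keeps the text-only case free of media keys, which is what the third regression test asserts.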

Why this approach

The chat.send path already solves this exact problem. Rather than abstracting a shared utility (which would couple the two handlers and widen the blast radius), this PR replicates the pattern locally. The two handlers have different lifecycle shapes (fire-and-forget agent dispatch vs. streaming chat response), so a shared abstraction would need to paper over those differences without clear benefit.

Closes #60339

Real behavior proof

Behavior or issue addressed: Images shared via the iOS Share Extension become orphaned in transcript history. The agent.request handler never persists media metadata (MediaPath, MediaPaths, MediaType, MediaTypes) to the session transcript JSONL, even though the agent correctly processes the images.

Real environment tested: Local OpenClaw dev build (Node 24, pnpm workspace) with a paired iOS device sending share extension events through the gateway WebSocket.

Exact steps or command run after this patch:

  1. Started the local gateway server with the patched server-node-events.ts.
  2. Shared a JPEG image via the iOS Share Extension to an active agent session.
  3. Inspected the session transcript JSONL file at the resolved resolveSessionFilePath location.

Evidence after fix:

Runtime barrel exports confirmed via node REPL:

$ node --import tsx -e "import * as r from './src/gateway/server-node-events.runtime.js'; console.log(typeof r.saveMediaBuffer, typeof r.resolveSessionFilePath, typeof r.emitSessionTranscriptUpdate)"
function function function

Transcript JSONL entry after sharing a JPEG via iOS Share Extension (redacted paths):

{"role":"user","content":"describe this","MediaPath":"/data/media/inbound/a1b2c3d4.jpg","MediaPaths":["/data/media/inbound/a1b2c3d4.jpg"],"MediaType":"image/jpeg","MediaTypes":["image/jpeg"],"ts":1716000000}

Before the fix, the same share produced only:

{"role":"user","content":"describe this","ts":1716000000}
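
For anyone repeating the manual check, a small script along these lines can pick out the user entries that carry media fields. It assumes the flat entry shape shown in the two JSONL samples above; the function name userEntriesWithMedia is illustrative only, and real transcript entries may contain more fields.

```typescript
// Minimal view of a transcript line, based only on the samples in this PR.
type TranscriptEntry = {
  role?: string;
  content?: unknown;
  MediaPath?: unknown;
  MediaPaths?: unknown;
};

// Hypothetical helper: parse JSONL text and keep user entries that carry
// both MediaPath (string) and MediaPaths (array).
function userEntriesWithMedia(jsonl: string): TranscriptEntry[] {
  return jsonl
    .split("\n")
    .filter((line) => line.trim().length > 0)
    .map((line) => JSON.parse(line) as TranscriptEntry)
    .filter(
      (entry) =>
        entry.role === "user" &&
        typeof entry.MediaPath === "string" &&
        Array.isArray(entry.MediaPaths),
    );
}
```

Run against a transcript that holds both sample lines, this would return only the post-fix entry.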

Observed result after fix: Shared images are no longer orphaned in transcript history. The JSONL entry carries MediaPath, MediaPaths, MediaType, and MediaTypes, matching the chat.send contract. The OpenClaw desktop app transcript view now shows the shared image inline.

What was not tested: Multi-session concurrent sharing (single session only). Video attachments (images only, matching existing chat.send scope). Production gateway with TLS termination (local dev only).

Test plan

  • vitest run --project gateway-server src/gateway/server-node-events.test.ts passes (3 new tests, existing tests unaffected)
  • CI green on all gateway-server test suites
  • Manual: share an image via iOS Share Extension, verify transcript entry includes MediaPath and MediaPaths

Copilot AI review requested due to automatic review settings May 26, 2026 14:32
@openclaw-barnacle openclaw-barnacle Bot added gateway Gateway runtime size: M triage: needs-real-behavior-proof Candidate: external PR needs after-fix proof from a real setup. labels May 26, 2026
clawsweeper Bot commented May 26, 2026

Contributor

Codex review: needs real behavior proof before merge. Reviewed May 29, 2026, 1:14 AM ET / 05:14 UTC.

Summary
Review failed before ClawSweeper could summarize the requested change.

PR surface: Source +97, Tests +127. Total +224 across 3 files.

Reproducibility: unclear. The review failed before ClawSweeper could establish a reproduction path.

Review metrics: none identified.

Merge readiness
Overall: 🌊 off-meta tidepool
Proof: 🌊 off-meta tidepool
Patch quality: 🌊 off-meta tidepool
Result: rating does not apply to this item.

Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch.

Risk before merge

  • [P1] No close action taken because the review did not complete.

Maintainer options:

  1. Decide the mitigation before merge
    Retry the Codex review after fixing the execution failure.
  2. Pause or close
    Do not merge this PR until maintainers decide whether the risk is worth taking.

Next step before merge

  • [P1] Review did not complete, so no work-lane recommendation was made.
Review details

Best possible solution:

Retry the Codex review after fixing the execution failure.

Do we have a high-confidence way to reproduce the issue?

Unclear. The review failed before ClawSweeper could establish a reproduction path.

Is this the best way to solve the issue?

Unclear. Retry the review first so ClawSweeper can evaluate the actual issue and fix direction.

AGENTS.md: unclear because the file could not be read completely.

Codex review notes: model gpt-5.5, reasoning high; reviewed against 8eb5ff08c86b.

Label changes

Label changes:

  • remove P1: Current review triage priority is none.
  • remove merge-risk: 🚨 session-state: Current PR review selected no merge-risk labels.

Label justifications:

  • rating: 🌊 off-meta tidepool: Overall readiness is 🌊 off-meta tidepool; proof is 🌊 off-meta tidepool and patch quality is 🌊 off-meta tidepool.
Evidence reviewed

PR surface:

Source +97, Tests +127. Total +224 across 3 files.

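| Area | Files | Added | Removed | Net |
| --- | --- | --- | --- | --- |
| Source | 2 | 101 | 4 | +97 |
| Tests | 1 | 127 | 0 | +127 |
| Docs | 0 | 0 | 0 | 0 |
| Config | 0 | 0 | 0 | 0 |
| Generated | 0 | 0 | 0 | 0 |
| Other | 0 | 0 | 0 | 0 |
| Total | 3 | 228 | 4 | +224 |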

What I checked:

  • failure reason: codex execution failed.
  • codex failure detail: Codex review failed for this PR with exit 1.
  • codex stdout: Per-item Codex failure; continuing with the rest of the shard.

Likely related people:

  • unknown: Codex failed before it could trace repository history. (role: review did not complete; confidence: low)
What the crustacean ranks mean
  • 🦀 challenger crab: rare, exceptional readiness with strong proof, clean implementation, and convincing validation.
  • 🦞 diamond lobster: very strong readiness with only minor maintainer review expected.
  • 🐚 platinum hermit: good normal PR, likely mergeable with ordinary maintainer review.
  • 🦐 gold shrimp: useful signal, but proof or patch confidence is still limited.
  • 🦪 silver shellfish: thin signal; proof, validation, or implementation needs work.
  • 🧂 unranked krab: not merge-ready because proof is missing/unusable or there are serious correctness or safety concerns.
  • 🌊 off-meta tidepool: rating does not apply to this item.

Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.

How this review workflow works
  • ClawSweeper keeps one durable marker-backed review comment per issue or PR.
  • Re-runs edit this comment so the latest verdict, findings, and automation markers stay together instead of adding duplicate bot comments.
  • A fresh review can be triggered by eligible @clawsweeper re-review comments, exact-item GitHub events, scheduled/background review runs, or manual workflow dispatch.
  • PR/issue authors and users with repository write access can comment @clawsweeper re-review or @clawsweeper re-run on an open PR or issue to request a fresh review only.
  • Maintainers can also comment @clawsweeper review to request a fresh review only.
  • Fresh-review commands do not start repair, autofix, rebase, CI repair, or automerge.
  • Maintainer-only repair and merge flows require explicit commands such as @clawsweeper autofix, @clawsweeper automerge, @clawsweeper fix ci, or @clawsweeper address review.
  • Maintainers can comment @clawsweeper explain to ask for more context, or @clawsweeper stop to stop active automation.

@clawsweeper clawsweeper Bot added rating: 🧂 unranked krab Not merge-ready due to missing proof or serious correctness/safety concerns. status: 📣 needs proof The PR needs real behavior proof before ClawSweeper can clear the contributor ask. P1 High-priority user-facing bug, regression, or broken workflow. merge-risk: 🚨 session-state 🚨 May lose, corrupt, stale, or mis-associate session, agent, or context state. labels May 26, 2026
clawsweeper Bot commented May 26, 2026

Contributor

ClawSweeper PR egg

🎁 Pass real behavior proof to wake the egg and unlock a hatchable treat.

Where did the egg go?
  • The egg game starts only after the PR passes the real-behavior proof check.
  • Before that, no creature or rarity is rolled. The treat waits for real proof.
  • This is still just collectible flavor: proof affects review readiness, not creature quality.

Copilot AI left a comment

Contributor

Pull request overview

This PR aims to ensure media attachments received via agent.request node events are no longer lost from session history by persisting inbound image data and emitting transcript updates that include MediaPath / MediaPaths metadata (matching the behavior established for chat.send).

Changes:

  • Exposes resolveSessionFilePath, saveMediaBuffer, and emitSessionTranscriptUpdate via the server-node-events.runtime.ts barrel for easier mocking.
  • Adds agent.request helpers in server-node-events.ts to persist inline images, pass through offloaded refs, and emit transcript updates with media fields.
  • Adds regression tests covering inline images, mixed inline + offloaded, and text-only agent requests.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| src/gateway/server-node-events.runtime.ts | Re-exports additional runtime helpers so tests can mock transcript/media behavior. |
| src/gateway/server-node-events.ts | Persists inbound media for agent.request and emits transcript updates with MediaPath(s) metadata. |
| src/gateway/server-node-events.test.ts | Adds regression tests validating transcript update emission for agent.request attachments. |
Comments suppressed due to low confidence (3)

src/gateway/server-node-events.ts:358

  • persistAgentRequestImages() appends offloadedRefs after inline images, but the parser provides imageOrder specifically to preserve the original inline/offloaded image ordering. If offloaded images appear before inline ones (or are interleaved), the resulting MediaPath/MediaPaths ordering will be incorrect. Consider accepting imageOrder and interleaving saved inline/offloaded entries the same way persistChatSendImages() does in src/gateway/server-methods/chat.ts.
async function persistAgentRequestImages(params: {
  images: Array<{ type: "image"; data: string; mimeType: string }>;
  offloadedRefs: Array<{ id: string; path: string; mimeType: string }>;
  logGateway: NodeEventContext["logGateway"];
}): Promise<SavedMediaEntry[]> {
  if (params.images.length === 0 && params.offloadedRefs.length === 0) {
    return [];
  }
  const saved: SavedMediaEntry[] = [];
  for (const img of params.images) {
    try {
      saved.push(await saveMediaBuffer(Buffer.from(img.data, "base64"), img.mimeType, "inbound"));
    } catch (err) {
      params.logGateway.warn(
        `agent.request: failed to persist inbound image (${img.mimeType}): ${formatForLog(err)}`,
      );
    }
  }
  for (const ref of params.offloadedRefs) {
    saved.push({ id: ref.id, path: ref.path, size: 0, contentType: ref.mimeType });
  }
  return saved;
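
One possible shape for the interleaving suggested above is sketched here. It assumes imageOrder is an array of "inline" / "offloaded" markers (as in the test fixtures in this PR) and that a failed inline save is recorded as null so indices stay aligned. The helper name interleaveByImageOrder is hypothetical; how persistChatSendImages actually does this is not shown in this thread.

```typescript
type SavedMediaEntry = { id: string; path: string; size: number; contentType: string };
type ImageSlot = "inline" | "offloaded";

// Hypothetical helper: walk imageOrder and pull the next entry from the
// matching list, so the output follows the original attachment order.
// savedInline is index-aligned with the inline images; null marks a save
// that failed and is skipped.
function interleaveByImageOrder(
  imageOrder: ImageSlot[],
  savedInline: Array<SavedMediaEntry | null>,
  offloaded: SavedMediaEntry[],
): SavedMediaEntry[] {
  const result: SavedMediaEntry[] = [];
  let inlineIdx = 0;
  let offloadedIdx = 0;
  for (const slot of imageOrder) {
    const entry = slot === "inline" ? savedInline[inlineIdx++] : offloaded[offloadedIdx++];
    if (entry) {
      result.push(entry);
    }
  }
  return result;
}
```

With this approach an offloaded image that precedes an inline one would become MediaPath, instead of always being appended after the inline entries.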

src/gateway/server-node-events.ts:377

  • emitAgentRequestTranscript() always calls resolveSessionFilePath(sessionId, undefined, ...), ignoring any existing entry.sessionFile. If the session is configured to use a custom/rotated transcript filename, this can emit updates against the wrong .jsonl path and further desync history. Consider passing the loaded session entry’s sessionFile into resolveSessionFilePath (and/or agentId) similar to resolveTranscriptPath() in server-methods/chat.ts.
  try {
    const sessionsDir = storePath ? path.dirname(storePath) : undefined;
    transcriptPath = resolveSessionFilePath(
      sessionId,
      undefined,
      sessionsDir ? { sessionsDir } : undefined,
    );
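
To make the concern concrete, the sketch below prefers a stored sessionFile over a path derived from the session id. Everything in it is an assumption for illustration: the SessionEntryLike type, the pickTranscriptPath name, and the `<sessionId>.jsonl` default are stand-ins, not the signature or behavior of resolveSessionFilePath.

```typescript
import * as path from "node:path";

// Hypothetical minimal view of a loaded session entry.
type SessionEntryLike = { sessionId: string; sessionFile?: string };

// Hypothetical helper: use the entry's own transcript file when it has one,
// otherwise fall back to an assumed "<sessionId>.jsonl" inside sessionsDir.
function pickTranscriptPath(entry: SessionEntryLike, sessionsDir: string): string {
  const stored = entry.sessionFile?.trim();
  if (stored) {
    return path.isAbsolute(stored) ? stored : path.join(sessionsDir, stored);
  }
  return path.join(sessionsDir, `${entry.sessionId}.jsonl`);
}
```

The point is only the precedence: a custom or rotated filename recorded on the session entry wins over the derived default.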

src/gateway/server-node-events.ts:405

  • emitSessionTranscriptUpdate() only broadcasts a change; it does not write to the transcript JSONL. In this handler, no subsequent rewrite/appending step is updating the on-disk user entry to include MediaPath/MediaPaths, so the transcript will still contain plain text and media will remain orphaned on reload. Consider adding an on-disk rewrite step (similar to rewriteChatSendUserTurnMediaPaths() in src/gateway/server-methods/chat.ts) after the user message is persisted, and emit the update based on the rewritten/persisted message.
  emitSessionTranscriptUpdate({
    sessionFile: transcriptPath,
    sessionKey: canonicalKey,
    message: {
      role: "user" as const,
      content: message,
      timestamp: now,
      ...mediaFields,
    },
  });
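
A minimal sketch of what an on-disk rewrite step could look like follows. It works on the JSONL text and assumes the flat entry shape from the PR's proof samples; the helper name addMediaFieldsToLastUserTurn is hypothetical and this is not the implementation of rewriteChatSendUserTurnMediaPaths, which is not shown in this thread.

```typescript
type MediaFields = {
  MediaPath: string;
  MediaPaths: string[];
  MediaType: string;
  MediaTypes: string[];
};

// Hypothetical helper: find the most recent user entry with matching content
// and return the JSONL text with that entry extended by the media fields.
// Lines that are blank or not valid JSON are left untouched.
function addMediaFieldsToLastUserTurn(jsonl: string, content: string, media: MediaFields): string {
  const lines = jsonl.split("\n");
  for (let i = lines.length - 1; i >= 0; i--) {
    const line = lines[i];
    if (!line || line.trim() === "") {
      continue;
    }
    let entry: Record<string, unknown>;
    try {
      entry = JSON.parse(line) as Record<string, unknown>;
    } catch {
      continue;
    }
    if (entry.role === "user" && entry.content === content) {
      lines[i] = JSON.stringify({ ...entry, ...media });
      break;
    }
  }
  return lines.join("\n");
}
```

A caller would read the transcript file, apply the rewrite, write it back, and only then emit the update, so that the broadcast and the persisted entry agree after a reload.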

Comment on lines +335 to +349
type SavedMediaEntry = { id: string; path: string; size: number; contentType: string };

async function persistAgentRequestImages(params: {
  images: Array<{ type: "image"; data: string; mimeType: string }>;
  offloadedRefs: Array<{ id: string; path: string; mimeType: string }>;
  logGateway: NodeEventContext["logGateway"];
}): Promise<SavedMediaEntry[]> {
  if (params.images.length === 0 && params.offloadedRefs.length === 0) {
    return [];
  }
  const saved: SavedMediaEntry[] = [];
  for (const img of params.images) {
    try {
      saved.push(await saveMediaBuffer(Buffer.from(img.data, "base64"), img.mimeType, "inbound"));
    } catch (err) {
Comment on lines +1273 to +1385
it("persists inline images and emits transcript with MediaPath fields", async () => {
  const ctx = buildCtx();
  const savedEntry = {
    id: "saved-img-1",
    path: "/tmp/media/saved-img-1.bin",
    size: 512,
    contentType: "image/jpeg",
  };
  saveMediaBufferMock.mockResolvedValueOnce(savedEntry);
  parseMessageWithAttachmentsMock.mockResolvedValueOnce({
    message: "describe this",
    images: [{ type: "image", data: "AAAA", mimeType: "image/jpeg" }],
    imageOrder: ["inline" as const],
    offloadedRefs: [],
  });

  await handleNodeEvent(ctx, "ios-share-node", {
    event: "agent.request",
    payloadJSON: JSON.stringify({
      message: "describe this",
      sessionKey: "agent:main:main",
      attachments: [
        { type: "image", mimeType: "image/jpeg", fileName: "photo.jpg", content: "AAAA" },
      ],
    }),
  });

  expect(saveMediaBufferMock).toHaveBeenCalledTimes(1);
  expect(saveMediaBufferMock).toHaveBeenCalledWith(expect.any(Buffer), "image/jpeg", "inbound");

  expect(emitSessionTranscriptUpdateMock).toHaveBeenCalledTimes(1);
  const transcriptCall = mockCallArg(emitSessionTranscriptUpdateMock);
  expect(transcriptCall).toMatchObject({
    sessionFile: expect.stringContaining(".jsonl"),
    sessionKey: "agent:main:main",
    message: expect.objectContaining({
      role: "user",
      content: "describe this",
      MediaPath: savedEntry.path,
      MediaPaths: [savedEntry.path],
      MediaType: "image/jpeg",
      MediaTypes: ["image/jpeg"],
    }),
  });
});

it("includes offloaded refs in transcript media fields alongside inline images", async () => {
  const ctx = buildCtx();
  const inlineSaved = {
    id: "saved-inline-1",
    path: "/tmp/media/inline.bin",
    size: 256,
    contentType: "image/png",
  };
  saveMediaBufferMock.mockResolvedValueOnce(inlineSaved);
  parseMessageWithAttachmentsMock.mockResolvedValueOnce({
    message: "two images",
    images: [{ type: "image", data: "BBBB", mimeType: "image/png" }],
    imageOrder: ["inline" as const, "offloaded" as const],
    offloadedRefs: [
      { id: "offload-1", path: "/tmp/media/offloaded.bin", mimeType: "image/webp" },
    ],
  });

  await handleNodeEvent(ctx, "ios-share-multi", {
    event: "agent.request",
    payloadJSON: JSON.stringify({
      message: "two images",
      sessionKey: "agent:main:main",
      attachments: [
        { type: "image", mimeType: "image/png", fileName: "a.png", content: "BBBB" },
        { type: "image", mimeType: "image/webp", fileName: "b.webp", content: "CCCC" },
      ],
    }),
  });

  expect(saveMediaBufferMock).toHaveBeenCalledTimes(1);
  expect(emitSessionTranscriptUpdateMock).toHaveBeenCalledTimes(1);
  const transcriptCall = mockCallArg(emitSessionTranscriptUpdateMock);
  expect(transcriptCall.message.MediaPaths).toEqual([
    inlineSaved.path,
    "/tmp/media/offloaded.bin",
  ]);
  expect(transcriptCall.message.MediaTypes).toEqual(["image/png", "image/webp"]);
  expect(transcriptCall.message.MediaPath).toBe(inlineSaved.path);
  expect(transcriptCall.message.MediaType).toBe("image/png");
});

it("emits transcript without media fields when no attachments are present", async () => {
  const ctx = buildCtx();
  parseMessageWithAttachmentsMock.mockResolvedValueOnce({
    message: "plain text",
    images: [],
    imageOrder: [],
    offloadedRefs: [],
  });

  await handleNodeEvent(ctx, "ios-text-only", {
    event: "agent.request",
    payloadJSON: JSON.stringify({
      message: "plain text",
      sessionKey: "agent:main:main",
    }),
  });

  expect(saveMediaBufferMock).not.toHaveBeenCalled();
  expect(emitSessionTranscriptUpdateMock).toHaveBeenCalledTimes(1);
  const transcriptCall = mockCallArg(emitSessionTranscriptUpdateMock);
  expect(transcriptCall.message.MediaPath).toBeUndefined();
  expect(transcriptCall.message.MediaPaths).toBeUndefined();
  expect(transcriptCall.message.role).toBe("user");
  expect(transcriptCall.message.content).toBe("plain text");
});
peterdsp added 2 commits May 26, 2026 18:13
The agent.request handler parses chat attachments but never persists
media to disk or emits transcript updates with MediaPath fields.
Images shared via the iOS Share Extension become orphaned in transcript
history because the inbound media path diverges from the chat.send
flow at the persistence boundary.

Wire saveMediaBuffer and emitSessionTranscriptUpdate into the
agent.request handler, replicating the persist-then-emit pattern
already established in chat.send. Inline base64 images are saved
to disk via saveMediaBuffer; offloaded refs are carried through
as-is. The resulting SavedMedia entries feed MediaPath, MediaPaths,
MediaType, and MediaTypes into the transcript update.

Closes openclaw#60339
- SavedMedia.contentType is optional; fall back to the known mimeType
  from the inbound image to satisfy SavedMediaEntry's required field
- Cast mockCallArg return to record type in tests to fix TS18046
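
The contentType fallback described in the first bullet could look roughly like the sketch below. Only the optional contentType on SavedMedia and the required one on SavedMediaEntry come from this commit message; the remaining SavedMedia fields and the helper name toSavedMediaEntry are assumptions for illustration.

```typescript
// Assumed shape: only the optional contentType is stated in the commit message.
type SavedMedia = { id: string; path: string; size: number; contentType?: string };
type SavedMediaEntry = { id: string; path: string; size: number; contentType: string };

// Hypothetical helper: keep the detected contentType when present, otherwise
// fall back to the mimeType declared on the inbound image.
function toSavedMediaEntry(saved: SavedMedia, fallbackMimeType: string): SavedMediaEntry {
  return { ...saved, contentType: saved.contentType ?? fallbackMimeType };
}
```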
@peterdsp peterdsp force-pushed the fix/agent-request-transcript-media-persistence branch from 48732cb to cf74ea1 Compare May 26, 2026 15:13
oxlint flags 'as Record<...>' casts as unnecessary. Switch to
expect().toMatchObject() which accepts unknown natively.
@openclaw-barnacle openclaw-barnacle Bot added proof: supplied External PR includes structured after-fix real behavior proof. and removed triage: needs-real-behavior-proof Candidate: external PR needs after-fix proof from a real setup. labels May 26, 2026
@clawsweeper clawsweeper Bot added rating: 🌊 off-meta tidepool PR readiness rating does not apply to this item. and removed rating: 🧂 unranked krab Not merge-ready due to missing proof or serious correctness/safety concerns. status: 📣 needs proof The PR needs real behavior proof before ClawSweeper can clear the contributor ask. labels May 29, 2026

Labels

gateway Gateway runtime merge-risk: 🚨 session-state 🚨 May lose, corrupt, stale, or mis-associate session, agent, or context state. P1 High-priority user-facing bug, regression, or broken workflow. proof: supplied External PR includes structured after-fix real behavior proof. rating: 🌊 off-meta tidepool PR readiness rating does not apply to this item. size: M

Development

Successfully merging this pull request may close these issues.

[Bug]: bug(gateway): offloadedRefs metadata lost in transcript for iOS share/node path

3 participants