Commit 9657b8e

fix(image-generate): allow distinct active image requests (#83614)
Summary:
- This PR prompt-scopes `image_generate` duplicate detection, adds same-prompt and distinct-prompt regression tests, and updates task guardrail docs and changelog.
- Reproducibility: yes. Current-main source shows the duplicate guard runs before prompt parsing and active lookup ignores prompt identity, matching the linked distinct-second-image failure mode.

Automerge notes:
- PR branch already contained follow-up commit before automerge: docs(tasks): clarify image generation guardrail
- PR branch already contained follow-up commit before automerge: fix(image-generate): allow distinct active image requests

Validation:
- ClawSweeper review passed for head 9f19a96.
- Required merge gates passed before the squash merge.

Prepared head SHA: 9f19a96
Review: #83614 (comment)

Co-authored-by: Elarwei <elarweis@gmail.com>
Co-authored-by: clawsweeper <274271284+clawsweeper[bot]@users.noreply.github.com>
Co-authored-by: clawsweeper[bot] <274271284+clawsweeper[bot]@users.noreply.github.com>
1 parent fffb8c9 commit 9657b8e

9 files changed

Lines changed: 212 additions & 3 deletions

CHANGELOG.md

Lines changed: 1 addition & 0 deletions
@@ -42,6 +42,7 @@ Docs: https://docs.openclaw.ai
 
 ### Fixes
 
+- Agents/image generation: allow distinct `image_generate` prompts to start separate session-backed background tasks while same-prompt retries still return the active task status. (#83614) Thanks @Elarwei001.
 - Sessions: skip trailing custom transcript entries when checking tail assistant replies so embedded CLI gap-fill does not duplicate canonical assistant output. (#83635) Thanks @yaoyi1222.
 - Telegram: keep verbose tool progress visible without mirroring non-final progress into active session transcripts, preventing embedded provider replies from aborting mid-run. (#83631) Thanks @kurplunkin.
 - Cron: link isolated scheduled task runs to their stable cron session so task status and cleanup can follow the backing agent run. (#83606) Thanks @jai.

docs/automation/tasks.md

Lines changed: 1 addition & 1 deletion
@@ -106,7 +106,7 @@ Not every agent run creates a task. Heartbeat turns and normal interactive chat
 
 </Accordion>
 <Accordion title="Concurrent media-generation guardrail">
-While a session-backed media-generation task is still active, the tool also acts as a guardrail: repeated `image_generate`, `music_generate`, or `video_generate` calls in that same session return the active task status instead of starting a second concurrent generation. Use `action: "status"` when you want an explicit progress/status lookup from the agent side.
+While a session-backed media-generation task is still active, media tools also act as guardrails for accidental retries. Repeated `image_generate` calls for the same prompt return the matching active task status, while a distinct image prompt can start its own task. `music_generate` and `video_generate` calls still return the active task status for that session instead of starting a second concurrent generation. Use `action: "status"` when you want an explicit progress/status lookup from the agent side.
 </Accordion>
 <Accordion title="What does not create tasks">
 - Heartbeat turns - main-session; see [Heartbeat](/gateway/heartbeat)
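
The guardrail wording in the docs change above can be illustrated with a small in-memory sketch. This is not the project's implementation: `SketchTask`, `GuardOutcome`, and `ImageGuardSketch` are hypothetical names, and the trimming of the prompt is an assumption about how labels are normalized.

```typescript
// Hypothetical, simplified task record; the real TaskRecord has more fields.
type SketchTask = { taskId: string; task: string; status: "running" | "done" };

// Either the status of an already-active task or a freshly started one.
type GuardOutcome =
  | { kind: "status"; taskId: string }
  | { kind: "started"; taskId: string };

class ImageGuardSketch {
  private tasks: SketchTask[] = [];
  private nextId = 1;

  // Same-prompt calls return the active task; distinct prompts start a new one.
  request(prompt: string): GuardOutcome {
    const label = prompt.trim(); // assumed normalization
    const active = this.tasks.find((t) => t.status === "running" && t.task === label);
    if (active) {
      return { kind: "status", taskId: active.taskId };
    }
    const created: SketchTask = {
      taskId: `task-${this.nextId++}`,
      task: label,
      status: "running",
    };
    this.tasks.push(created);
    return { kind: "started", taskId: created.taskId };
  }
}

const guard = new ImageGuardSketch();
const first = guard.request("First diagram prompt");
const retry = guard.request("First diagram prompt");
const second = guard.request("Second diagram prompt");
```

Here `retry` reports the task created by `first`, while `second` gets its own task because its prompt differs.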

src/agents/image-generation-task-status.test.ts

Lines changed: 44 additions & 0 deletions
@@ -115,6 +115,50 @@ describe("image generation task status", () => {
     expect(details.progressSummary).toBe("Generating image");
   });
 
+  it("can restrict active lookup to the matching image prompt", () => {
+    taskRuntimeInternalMocks.listTasksForOwnerKey.mockReturnValue([
+      {
+        taskId: "task-first",
+        runtime: "cli",
+        taskKind: IMAGE_GENERATION_TASK_KIND,
+        sourceId: "image_generate:openai",
+        requesterSessionKey: "agent:main",
+        ownerKey: "agent:main",
+        scopeKind: "session",
+        task: "First diagram prompt",
+        status: "running",
+        deliveryStatus: "not_applicable",
+        notifyPolicy: "silent",
+        createdAt: Date.now(),
+      },
+      {
+        taskId: "task-second",
+        runtime: "cli",
+        taskKind: IMAGE_GENERATION_TASK_KIND,
+        sourceId: "image_generate:openai",
+        requesterSessionKey: "agent:main",
+        ownerKey: "agent:main",
+        scopeKind: "session",
+        task: "Second diagram prompt",
+        status: "running",
+        deliveryStatus: "not_applicable",
+        notifyPolicy: "silent",
+        createdAt: Date.now(),
+      },
+    ]);
+
+    expect(
+      findActiveImageGenerationTaskForSession("agent:main", {
+        prompt: "Second diagram prompt",
+      })?.taskId,
+    ).toBe("task-second");
+    expect(
+      findActiveImageGenerationTaskForSession("agent:main", {
+        prompt: "Third diagram prompt",
+      }),
+    ).toBeUndefined();
+  });
+
   it("builds prompt context for active session work", () => {
     taskRuntimeInternalMocks.listTasksForOwnerKey.mockReturnValue([
       {

src/agents/image-generation-task-status.ts

Lines changed: 2 additions & 0 deletions
@@ -24,11 +24,13 @@ export function getImageGenerationTaskProviderId(task: TaskRecord): string | und
 
 export function findActiveImageGenerationTaskForSession(
   sessionKey?: string,
+  params?: { prompt?: string },
 ): TaskRecord | undefined {
   return findActiveMediaGenerationTaskForSession({
     sessionKey,
     taskKind: IMAGE_GENERATION_TASK_KIND,
     sourcePrefix: IMAGE_GENERATION_SOURCE_PREFIX,
+    taskLabel: params?.prompt,
   });
 }

src/agents/media-generation-task-status-shared.ts

Lines changed: 2 additions & 0 deletions
@@ -33,11 +33,13 @@ export function findActiveMediaGenerationTaskForSession(params: {
   sessionKey?: string;
   taskKind: string;
   sourcePrefix: string;
+  taskLabel?: string;
 }): TaskRecord | undefined {
   return findActiveSessionTask({
     sessionKey: params.sessionKey,
     runtime: "cli",
     taskKind: params.taskKind,
+    task: params.taskLabel,
     sourceIdPrefix: params.sourcePrefix,
   });
 }

src/agents/session-async-task-status.ts

Lines changed: 8 additions & 0 deletions
@@ -8,6 +8,7 @@ export function findActiveSessionTask(params: {
   sessionKey?: string;
   runtime?: TaskRuntime;
   taskKind?: string;
+  task?: string;
   statuses?: ReadonlySet<TaskStatus>;
   sourceIdPrefix?: string;
 }): TaskRecord | undefined {
@@ -17,6 +18,7 @@ export function findActiveSessionTask(params: {
   }
   const statuses = params.statuses ?? DEFAULT_ACTIVE_STATUSES;
   const taskKind = normalizeOptionalString(params.taskKind);
+  const taskLabel = normalizeOptionalString(params.task);
   const sourceIdPrefix = normalizeOptionalString(params.sourceIdPrefix);
   const matches = listTasksForOwnerKey(normalizedSessionKey).filter((task) => {
     if (task.scopeKind !== "session") {
@@ -31,6 +33,12 @@ export function findActiveSessionTask(params: {
     if (taskKind && task.taskKind !== taskKind) {
       return false;
     }
+    if (taskLabel) {
+      const currentTaskLabel = normalizeOptionalString(task.task);
+      if (currentTaskLabel !== taskLabel) {
+        return false;
+      }
+    }
     if (sourceIdPrefix) {
       const sourceId = normalizeOptionalString(task.sourceId) ?? "";
       if (sourceId !== sourceIdPrefix && !sourceId.startsWith(`${sourceIdPrefix}:`)) {
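
The label and source-prefix checks in this hunk can be read as a single predicate. The sketch below is a standalone approximation, not the file's code: `normalizeOptionalStringSketch` assumes the real `normalizeOptionalString` trims and maps empty strings to `undefined`, and it assumes the truncated prefix branch ends in `return false`. `SketchRecord` and `matchesLabelAndSource` are hypothetical names.

```typescript
// Assumed behavior of the helper used in the diff: trim, and treat empty as undefined.
function normalizeOptionalStringSketch(value?: string): string | undefined {
  const trimmed = value?.trim();
  return trimmed ? trimmed : undefined;
}

// Hypothetical subset of the task record fields the filter reads.
type SketchRecord = { task?: string; sourceId?: string };

function matchesLabelAndSource(
  record: SketchRecord,
  filter: { task?: string; sourceIdPrefix?: string },
): boolean {
  // The label check only applies when the caller supplies a label.
  const taskLabel = normalizeOptionalStringSketch(filter.task);
  if (taskLabel && normalizeOptionalStringSketch(record.task) !== taskLabel) {
    return false;
  }
  // The source id must equal the prefix or continue it with ":".
  const prefix = normalizeOptionalStringSketch(filter.sourceIdPrefix);
  if (prefix) {
    const sourceId = normalizeOptionalStringSketch(record.sourceId) ?? "";
    if (sourceId !== prefix && !sourceId.startsWith(`${prefix}:`)) {
      return false;
    }
  }
  return true;
}

const running: SketchRecord = { task: "Second diagram prompt", sourceId: "image_generate:openai" };
const samePrompt = matchesLabelAndSource(running, {
  task: " Second diagram prompt ",
  sourceIdPrefix: "image_generate",
});
const otherPrompt = matchesLabelAndSource(running, {
  task: "Third diagram prompt",
  sourceIdPrefix: "image_generate",
});
const noPrompt = matchesLabelAndSource(running, { sourceIdPrefix: "image_generate" });
const otherTool = matchesLabelAndSource(running, { sourceIdPrefix: "video_generate" });
```

Because the label check is skipped when no label is passed, callers that omit `task` keep the earlier session-wide matching, which is consistent with the docs statement that `music_generate` and `video_generate` are unchanged.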

src/agents/tools/image-generate-tool.actions.ts

Lines changed: 20 additions & 1 deletion
@@ -92,6 +92,25 @@ export function createImageGenerateStatusActionResult(
 
 export function createImageGenerateDuplicateGuardResult(
   sessionKey?: string,
+  params?: { prompt?: string },
 ): ImageGenerateActionResult | undefined {
-  return imageGenerateTaskStatusActions.createDuplicateGuardResult(sessionKey);
+  const activeTask = findActiveImageGenerationTaskForSession(sessionKey, {
+    prompt: params?.prompt,
+  });
+  if (!activeTask) {
+    return undefined;
+  }
+  return {
+    content: [
+      {
+        type: "text",
+        text: buildImageGenerationTaskStatusText(activeTask, { duplicateGuard: true }),
+      },
+    ],
+    details: {
+      action: "status",
+      duplicateGuard: true,
+      ...buildImageGenerationTaskStatusDetails(activeTask),
+    },
+  };
 }

src/agents/tools/image-generate-tool.test.ts

Lines changed: 132 additions & 0 deletions
@@ -1,12 +1,17 @@
 import { afterEach, beforeAll, beforeEach, describe, expect, it, vi } from "vitest";
 
+const taskRuntimeInternalMocks = vi.hoisted(() => ({
+  listTasksForOwnerKey: vi.fn(),
+}));
+
 const taskRuntimeMocks = vi.hoisted(() => ({
   createRunningTaskRun: vi.fn(),
   recordTaskRunProgressByRunId: vi.fn(),
   completeTaskRunByRunId: vi.fn(),
   failTaskRunByRunId: vi.fn(),
 }));
 
+vi.mock("../../tasks/runtime-internal.js", () => taskRuntimeInternalMocks);
 vi.mock("../../tasks/detached-task-runtime.js", () => taskRuntimeMocks);
 
 let imageGenerationRuntime: typeof import("../../image-generation/runtime.js");
@@ -304,6 +309,8 @@ describe("createImageGenerateTool", () => {
     taskRuntimeMocks.recordTaskRunProgressByRunId.mockReset();
     taskRuntimeMocks.completeTaskRunByRunId.mockReset();
     taskRuntimeMocks.failTaskRunByRunId.mockReset();
+    taskRuntimeInternalMocks.listTasksForOwnerKey.mockReset();
+    taskRuntimeInternalMocks.listTasksForOwnerKey.mockReturnValue([]);
   });
 
   afterEach(() => {
@@ -736,6 +743,131 @@ describe("createImageGenerateTool", () => {
     );
   });
 
+  it("allows a distinct image request while another image generation task is active", async () => {
+    stubImageGenerationProviders();
+    vi.stubEnv("OPENAI_API_KEY", "openai-test");
+    vi.spyOn(imageGenerationRuntime, "generateImage").mockResolvedValue({
+      provider: "openai",
+      model: "gpt-image-1",
+      attempts: [],
+      ignoredOverrides: [],
+      images: [
+        {
+          buffer: Buffer.from("png-out"),
+          mimeType: "image/png",
+          fileName: "second.png",
+        },
+      ],
+    });
+    taskRuntimeInternalMocks.listTasksForOwnerKey.mockReturnValue([
+      {
+        taskId: "task-first-image",
+        runtime: "cli",
+        taskKind: "image_generation",
+        sourceId: "image_generate:openai",
+        requesterSessionKey: "agent:main:discord:direct:123",
+        ownerKey: "agent:main:discord:direct:123",
+        scopeKind: "session",
+        task: "First diagram prompt",
+        status: "running",
+        deliveryStatus: "not_applicable",
+        notifyPolicy: "silent",
+        createdAt: Date.now(),
+      },
+    ]);
+    taskRuntimeMocks.createRunningTaskRun.mockReturnValue({
+      taskId: "task-second-image",
+    });
+    const scheduled: Array<() => Promise<void>> = [];
+    const tool = requireImageGenerateTool(
+      createImageGenerateTool({
+        config: {
+          agents: {
+            defaults: {
+              imageGenerationModel: {
+                primary: "openai/gpt-image-1",
+              },
+            },
+          },
+        },
+        agentDir: "/tmp/agent",
+        agentSessionKey: "agent:main:discord:direct:123",
+        requesterOrigin: {
+          channel: "discord",
+          to: "dm:123",
+        },
+        scheduleBackgroundWork: (work) => {
+          scheduled.push(work);
+        },
+      }),
+    );
+
+    const result = await tool.execute("call-second", {
+      prompt: "Second diagram prompt",
+      filename: "second.png",
+      model: "openai/gpt-image-1",
+    });
+
+    expect(scheduled).toHaveLength(1);
+    expect(resultDetails(result).taskId).toBe("task-second-image");
+    expect(taskRuntimeMocks.createRunningTaskRun).toHaveBeenCalledWith(
+      expect.objectContaining({
+        task: "Second diagram prompt",
+      }),
+    );
+  });
+
+  it("returns active status for a duplicate image request with the same prompt", async () => {
+    stubImageGenerationProviders();
+    vi.stubEnv("OPENAI_API_KEY", "openai-test");
+    taskRuntimeInternalMocks.listTasksForOwnerKey.mockReturnValue([
+      {
+        taskId: "task-existing-image",
+        runtime: "cli",
+        taskKind: "image_generation",
+        sourceId: "image_generate:openai",
+        requesterSessionKey: "agent:main:discord:direct:123",
+        ownerKey: "agent:main:discord:direct:123",
+        scopeKind: "session",
+        task: "Same diagram prompt",
+        status: "running",
+        deliveryStatus: "not_applicable",
+        notifyPolicy: "silent",
+        createdAt: Date.now(),
+        progressSummary: "Generating image",
+      },
+    ]);
+    const tool = requireImageGenerateTool(
+      createImageGenerateTool({
+        config: {
+          agents: {
+            defaults: {
+              imageGenerationModel: {
+                primary: "openai/gpt-image-1",
+              },
+            },
+          },
+        },
+        agentDir: "/tmp/agent",
+        agentSessionKey: "agent:main:discord:direct:123",
+      }),
+    );
+
+    const result = await tool.execute("call-duplicate", {
+      prompt: "Same diagram prompt",
+      filename: "same.png",
+      model: "openai/gpt-image-1",
+    });
+
+    expect(taskRuntimeMocks.createRunningTaskRun).not.toHaveBeenCalled();
+    expect(resultText(result)).toContain(
+      "Image generation task task-existing-image is already running",
+    );
+    const details = resultDetails(result);
+    expect(details.duplicateGuard).toBe(true);
+    expect(details.task).toEqual({ taskId: "task-existing-image" });
+  });
+
   it("uses configured timeoutMs for image generation and lets calls override it", async () => {
     stubImageGenerationProviders();
     const generateImage = vi.spyOn(imageGenerationRuntime, "generateImage").mockResolvedValue({

src/agents/tools/image-generate-tool.ts

Lines changed: 2 additions & 1 deletion
@@ -819,15 +819,16 @@ export function createImageGenerateTool(options?: {
       const effectiveCfg =
         applyImageGenerationModelConfigDefaults(cfg, imageGenerationModelConfig) ?? cfg;
       const remoteMediaSsrfPolicy = resolveRemoteMediaSsrfPolicy(effectiveCfg);
+      const prompt = readStringParam(params, "prompt", { required: true });
 
       const duplicateGuardResult = createImageGenerateDuplicateGuardResult(
         options?.agentSessionKey,
+        { prompt },
       );
       if (duplicateGuardResult) {
         return duplicateGuardResult;
       }
 
-      const prompt = readStringParam(params, "prompt", { required: true });
       const imageInputs = normalizeReferenceImages(params);
       const model = readStringParam(params, "model");
       const filename = readStringParam(params, "filename");
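
The reordering in this hunk means the prompt is read before the duplicate guard runs, so the guard can compare against it. A rough sketch of that ordering follows, under loud assumptions: `readRequiredString` is a hypothetical stand-in for `readStringParam(..., { required: true })` and is assumed to throw when the value is missing, and `executeSketch` with its `activePrompts` set is invented purely for illustration.

```typescript
// Hypothetical stand-in for readStringParam with { required: true };
// assumed to throw when the parameter is missing or blank.
function readRequiredString(params: Record<string, unknown>, key: string): string {
  const value = params[key];
  if (typeof value !== "string" || value.trim() === "") {
    throw new Error(`${key} required`);
  }
  return value.trim();
}

// Simplified execute path: parse the prompt first, then consult the guard.
function executeSketch(
  params: Record<string, unknown>,
  activePrompts: ReadonlySet<string>,
): "duplicate" | "started" {
  const prompt = readRequiredString(params, "prompt");
  if (activePrompts.has(prompt)) {
    return "duplicate"; // same prompt already running: report status instead
  }
  return "started"; // distinct prompt: proceed to schedule a new task
}

const active = new Set<string>(["First diagram prompt"]);
const duplicateOutcome = executeSketch({ prompt: "First diagram prompt" }, active);
const distinctOutcome = executeSketch({ prompt: "Second diagram prompt" }, active);
```

If the real helper does throw on a missing value, one apparent consequence is that a call without a prompt now fails validation before the guard is consulted, rather than receiving the active task status.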
