This document lists the local HTTP API exposed by ima2 serve.
Base URL:
http://localhost:3333
Image generation supports OAuth, API-key, Grok, and Gemini (agy and gemini-api) providers.
provider: "oauth"uses the local Codex OAuth proxy.provider: "api"uses the OpenAI Responses API with the hostedimage_generationtool.provider: "grok"uses the bundled progrok xAI proxy. Classic, Node, and Agent generation run mandatory xAI Web Search through/v1/responses, then run agrok-4.3planner call with a forced localgenerate_imagefunction, then ima2 executes xAI/v1/images/generations. If reference images, a Node parent image, or an Agent current image are attached, the final step switches to xAI/v1/images/editsso image-to-image context is preserved.provider: "agy"spawns the Antigravity CLI (agy -p) to generate images via Google Gemini'sdefault_api:generate_imagetool. Model isnano-banana-2. Output is fixed at 1024×1024 JPEG. Max 3 reference images (i2i). No web search, quality, size, or mask controls. Multimode returns a single image. Video is unsupported (AGY_VIDEO_UNSUPPORTED).provider: "grok-api"uses a direct xAI API key instead of the bundled progrok OAuth proxy. Same pipeline asgrok(Web Search → planner →/v1/images/generations), same aspect ratio and resolution options. Requires an xAI API key configured via the web UI key management orXAI_API_KEYenv var. Also supports video generation.provider: "gemini-api"calls the Google Generative Language API directly (or Vertex AI with a service account JSON). Supports modelsnano-banana-2(Gemini 3.1 Flash Image) andnano-banana-pro(Gemini 3 Pro Image). Supports variable aspect ratios (1:1 through 21:9) and four resolution tiers (512px, 1K, 2K, 4K) on both auth paths — the direct API path sendsgeneration_config.response_format.image(snake_case) while the Vertex AI endpoint (aiplatform.googleapis.com) sendsgenerationConfig.imageConfig(camelCase). Withsize: "auto"the image config is omitted entirely and the model decides ratio/size. Auth:GEMINI_API_KEYenv var, web UI key management (/api/keys/gemini), or a Vertex AI service account JSON (VERTEX_SERVICE_ACCOUNT_JSONor/api/keys/vertex). When both Vertex credentials and an API key are configured, Vertex takes priority. The chosen auth mode (apikeyorvertex) persists to~/.ima2/config.jsonasgeminiAuthModeand is restored on server startup. Per-model cost:nano-banana-2(Flash): 512=$0.001, 1K=$0.003, 2K=$0.004, 4K=$0.006;nano-banana-pro: 1K=$0.007, 2K=$0.007, 4K=$0.013. No web search or mask controls.- API-key generation covers classic generate, edit, mask-guided edit, multimode, and node generation.
- If
provider: "api"is requested without an API key, routes fail before upstream with401andAPI_KEY_REQUIRED. - Grok generation maps
sizeto xAIaspect_ratioandresolution; it does not send an OpenAI-stylesizefield upstream. Grok edit uses xAI/v1/images/edits; Grok mask edit remains unsupported and returnsGROK_MASK_UNSUPPORTED. - Mask edits are mask/selection guided edits, not pixel-perfect inpaint guarantees.
Grok video generation uses POST /api/video/generate (SSE). See the Video
Generation section below for the full endpoint specification.
| Method | Path | Notes |
|---|---|---|
GET |
/api/health |
Server health, version, paths, provider policy |
GET |
/api/providers |
Provider availability and runtime ports |
GET |
/api/oauth/status |
OAuth proxy status and visible models |
GET |
/api/grok/status |
Bundled progrok status and visible xAI image models |
GET |
/api/billing |
Billing/status probe, including API key source when configured |
GET |
/api/quota |
Provider quota: returns { codex, grok }. Grok result includes billing: { usedUsd, limitUsd } and a monthly percent window drawn from the xAI billing API. |
| Method | Path | Notes |
|---|---|---|
POST |
/api/auth/switch |
Start a device-code OAuth flow. Body: { "provider": "grok" | "codex" }. Returns { sessionId, userCode, verificationUrl }. |
GET |
/api/auth/switch/:sessionId |
Poll switch-account session status. Returns { status } where status is pending, complete, error, or expired. |
The Switch Account flow opens a browser verification URL. Once the user completes the device-code step, the server saves the new credentials (Grok: ~/.progrok/auth.json; Codex: via codex login --device-auth) and the session transitions to complete. This endpoint is surfaced as a Switch Account button in the Settings QuotaCard for Grok and Codex providers.
| Method | Path | Notes |
|---|---|---|
GET |
/api/storage/status |
Summarized gallery storage status for support UI |
POST |
/api/storage/open-generated-dir |
Ask the server process to open the generated image folder |
GET /api/storage/status returns a support-safe summary, not raw legacy path arrays by default.
{
"ok": true,
"data": {
"generatedDirLabel": "~/.ima2/generated",
"generatedCount": 0,
"legacyCandidatesScanned": 18,
"legacySourcesFound": 0,
"legacyFilesFound": 0,
"state": "not_found",
"messageKind": "apology",
"recoveryDocsPath": "docs/RECOVER_OLD_IMAGES.md",
"doctorCommand": "ima2 doctor",
"overrides": {
"generatedDir": false,
"configDir": false
}
}
}Storage state values:
| State | Meaning |
|---|---|
ok |
Current gallery has files or no recovery notice is needed |
recoverable |
Legacy folders/files are still present and may be recoverable |
not_found |
Current gallery is empty and no legacy folder was found |
unknown |
Storage status inspection failed or was incomplete |
POST /api/storage/open-generated-dir opens the generated image folder on the machine running ima2 serve. If the browser is connected to a remote server, VM, container, WSL instance, or another computer on the network, this action targets that server machine, not necessarily the browser device.
| Method | Path | Notes |
|---|---|---|
GET |
/api/inflight |
Active jobs only by default |
GET |
/api/inflight?includeTerminal=1 |
Includes recent terminal jobs for debugging |
DELETE |
/api/inflight/:requestId |
Cancel or forget an active job |
GET |
/api/events |
Persistent SSE multiplex channel for all async generation progress (see below) |
In-flight logs and responses use requestId for correlation. Logs should not include raw prompts, reference data URLs, generated base64, tokens, cookies, auth headers, or raw upstream bodies.
Single persistent Server-Sent Events channel that carries progress for all async generation jobs. The browser UI opens one EventSource here instead of holding a per-request SSE connection for each job, avoiding browser per-origin connection limits.
| Query | Notes |
|---|---|
lastEventId |
Optional. Reconnect cursor; also accepted via the Last-Event-ID request header |
Response: text/event-stream (persistent). Each frame uses standard SSE fields id, event, and data (JSON).
Connection limits: When active listeners reach 512, the server returns 503 with SSE_CAPACITY before opening the stream.
Heartbeat: Every 15 seconds the server writes a comment frame:
: ping
Replay: On reconnect, the server replays events from an in-memory ring buffer (size 2000) for IDs newer than lastEventId. Large image payloads (>1000 characters) are omitted from replay with _imageOmitted: true in the data payload. If the requested ID is older than the oldest buffered event, the server emits a replay-gap event before live fan-out:
| Event | Data | Description |
|---|---|---|
replay-gap |
{ lastEventId, oldestAvailableId } |
Client should reconcile inflight state (for example via GET /api/inflight) |
Job routing: Every data payload includes jobId (same value as the job's requestId). Event bodies also carry requestId where applicable. Clients filter events by matching data.jobId or data.requestId to the job they started.
Event types (fan-out to all connected clients):
| Event | Emitted by | Description |
|---|---|---|
phase |
node, multimode, video | Lifecycle phase change |
partial |
node, multimode | Progressive preview image (base64 data URL) |
image |
multimode | Final saved GenerateItem for one sequence image |
done |
node, multimode, video | Terminal success payload (route-specific shape) |
error |
all generation routes | Terminal failure |
submitted |
video | Job submitted to xAI |
progress |
video | Progress fraction 0.0–1.0 |
planning |
video | Video planner running |
Example SSE frame:
id: 42
event: phase
data: {"requestId":"req_abc","jobId":"req_abc","phase":"streaming"}
POST /api/node/generate, POST /api/generate/multimode, and POST /api/video/generate support an async POST mode for clients that already hold GET /api/events:
{
"async": true,
"requestId": "req_xxx",
"...": "other route fields"
}| Outcome | HTTP | Body |
|---|---|---|
| Accepted | 202 |
{ "requestId": "req_xxx" } |
Duplicate active requestId |
409 |
REQUEST_ID_IN_USE |
| More than the configured concurrent active job limit | 429 |
TOO_MANY_JOBS with Retry-After: 5; default limit is 24 via IMA2_MAX_PARALLEL |
Progress events are published on GET /api/events. The POST response returns immediately; clients must not expect SSE on the POST connection when async: true.
CLI and legacy clients omit async and keep the original behavior: per-request SSE on the same POST response (Accept: text/event-stream where applicable). The server dual-emits in that mode — it writes SSE to the POST response and also publishes the same events on GET /api/events.
Text-to-image and reference-guided root generation.
{
"prompt": "a shiba in space",
"quality": "medium",
"size": "1024x1024",
"format": "png",
"moderation": "low",
"provider": "oauth",
"model": "gpt-5.4",
"references": [],
"requestId": "optional-client-id",
"storyboard": false
}Supported quality values: low, medium, high.
Supported moderation values: auto, low.
When storyboard is true, the server prepends storyboard keyframe instructions so image
generations maintain character and scene continuity for multi-shot video production.
Recommended model: gpt-5.4. Current app default: gpt-5.4-mini. gpt-5.5 is the strongest quality option when supported, but callers should expect higher quota pressure and possible Codex CLI/backend capability requirements.
When provider is "grok", supported models are grok-imagine-image and
grok-imagine-image-quality. The server uses grok-4.3 as the search/planner
model by default (IMA2_GROK_PLANNER_MODEL) and times the mandatory search and
planner steps separately from the image call (IMA2_GROK_PLANNER_TIMEOUT_MS).
For n > 1, search and planning run once and the planned prompt is reused for
the image requests. Successful Grok classic generations report one mandatory
web-search call in metadata.
If references are present on a Grok classic request, ima2 still performs the
mandatory search and grok-4.3 planning phases. The planner receives the
reference images as multimodal image_url inputs, and its forced
generate_image.prompt argument is instructed to be English-only except for
exact visible text requested by the user. The final image call then uses xAI
/v1/images/edits with the same reference images instead of
/v1/images/generations. This keeps image-to-image/reference context alive
through the three-phase pipeline. xAI currently documents up to three source
images for image editing, so Grok classic requests with more than three
references return GROK_REF_TOO_MANY.
Grok size mapping:
| Requested size | xAI aspect_ratio |
xAI resolution |
|---|---|---|
1024x1024 |
1:1 |
1k |
1536x1024 |
3:2 |
1k |
1024x1536 |
2:3 |
1k |
1360x1024 |
4:3 |
1k |
1024x1360 |
3:4 |
1k |
1824x1024 |
16:9 |
1k |
1024x1824 |
9:16 |
1k |
2048x2048 |
1:1 |
2k |
2048x1152 |
16:9 |
2k |
1152x2048 |
9:16 |
2k |
3840x2160 |
16:9 |
2k |
2160x3840 |
9:16 |
2k |
auto |
auto |
omitted |
Custom sizes are reduced to the closest xAI-supported aspect ratio and use
2k when the requested longest edge or pixel budget is closer to a 2K image.
Image edit / image-to-image generation.
The request includes a prompt and image payload. provider: "api" sends the prompt and image through the shared Responses image adapter. Optional masks are forwarded as mask guidance, not a pixel-perfect edit guarantee.
With provider: "grok", edit requests are sent to xAI /v1/images/edits
through the bundled progrok proxy. Masked Grok edits are rejected before
upstream with GROK_MASK_UNSUPPORTED.
Grok multimode currently sends each image request directly to xAI Images API
with the mapped aspect_ratio/resolution; the mandatory search + planner
pipeline is limited to classic /api/generate.
Node-mode generation and child edits.
Body fields:
{
"parentNodeId": "optional-server-node-id",
"prompt": "continue this image",
"quality": "medium",
"size": "1024x1024",
"format": "png",
"moderation": "low",
"model": "grok-imagine-image",
"references": [],
"externalSrc": "optional-history-url",
"sessionId": "session-id",
"clientNodeId": "client-node-id",
"requestId": "request-id",
"provider": "grok"
}When parentNodeId is present, the server loads the stored parent node image and uses the edit path. Node-local references are allowed on both root and child/edit nodes; for child/edit nodes the parent image is sent first, then references, then the text prompt.
With provider: "grok", Node Mode uses the same xAI search + grok-4.3 planner + Images API pipeline as classic generation. A parent node image, externalSrc, or extra references are passed to the planner and then to xAI /v1/images/edits; otherwise the final call uses /v1/images/generations. Grok Node requests are capped at three total input images, counting the parent/current image plus references, and return GROK_REF_TOO_MANY before upstream when that limit is exceeded. quality: "high" promotes the final image model to grok-imagine-image-quality.
The route can stream Server-Sent Events when the client sends Accept: text/event-stream. Possible events include phase, partial, done, and error. Alternatively, send { "async": true, "requestId": "req_xxx" } in the body to receive 202 { requestId } immediately and follow progress on GET /api/events (see Events section).
Grok Node SSE responses do not include Responses API partial image events because the xAI Images API call is synchronous JSON. They still emit phase and done/error events so the Node UI can use the same in-flight lifecycle.
Multi-image sequence generation. SSE-only on the POST response unless async mode is used.
{
"prompt": "a story in four panels",
"maxImages": 4,
"quality": "medium",
"size": "1024x1024",
"format": "png",
"moderation": "low",
"model": "gpt-5.4",
"provider": "oauth",
"references": [],
"requestId": "optional-client-id",
"async": false
}Send Accept: text/event-stream for per-request SSE on the POST connection. Or set "async": true with a client requestId to get 202 { requestId } and receive events on GET /api/events.
SSE events:
| Event | Data | Description |
|---|---|---|
phase |
{ requestId, phase, sequenceId?, maxImages? } |
Lifecycle phase |
partial |
{ requestId, image, index } |
Progressive preview |
image |
full GenerateItem |
One saved sequence image |
done |
route-specific summary; may include status: "partial" after timeout if at least one image was saved |
Sequence complete |
error |
{ requestId, error, code?, status? } |
Generation failed |
Fetch stored node metadata and asset URL.
Reference uploads are capped at 5 items. The frontend compresses large JPEG/PNG files before sending them. HEIC/HEIF files are rejected with a user-facing conversion hint.
Server-side validation may return these reference codes:
| Code | Meaning |
|---|---|
REF_NOT_ARRAY |
references was not an array |
REF_TOO_MANY |
More than the configured reference count |
REF_NOT_STRING |
A reference item was not a string |
REF_EMPTY |
A reference item was empty |
REF_TOO_LARGE |
A reference exceeded the configured base64 size |
REF_NOT_BASE64 |
A reference was not valid base64 |
GROK_REF_TOO_MANY |
Grok classic generation received more than three reference images |
GROK_MASK_UNSUPPORTED |
Grok edit was requested with a mask; xAI mask edit is not wired in this release |
Generate a video via the Grok video provider. Returns Server-Sent Events on the POST connection, or accepts async mode ({ "async": true, "requestId": "req_xxx" }) for 202 { requestId } with progress on GET /api/events (see Events section).
{
"prompt": "a cat playing piano",
"provider": "grok",
"model": "grok-imagine-video",
"duration": 5,
"resolution": "480p",
"aspectRatio": "auto",
"sourceImage": "<base64>",
"referenceImages": ["<base64>", "<base64>"],
"referenceFilenames": ["existing-file.png"],
"continueFromVideo": "1780226256355_50252101.mp4",
"continuityLineage": { "lineageId": "optional-client-hint", "entries": [] },
"sessionId": "optional",
"requestId": "optional-client-id"
}Models: grok-imagine-video (default), grok-imagine-video-1.5-preview.
Mode is auto-detected from reference inputs:
| Inputs | Mode | Duration cap |
|---|---|---|
| No images | text-to-video | 1–15s |
1 image (sourceImage or sourceFilename) |
image-to-video | 1–15s |
2–7 images (referenceImages / referenceFilenames) |
reference-to-video | 1–10s |
Parameters:
| Field | Type | Default | Notes |
|---|---|---|---|
prompt |
string | — | Required |
provider |
string | "grok" |
"grok" or "grok-api" |
model |
string | grok-imagine-video |
Video model |
duration |
integer | 5 |
1–15 seconds (clamped to 10 for reference-to-video) |
resolution |
string | "480p" |
480p or 720p |
aspectRatio |
string | "auto" |
1:1, 16:9, 9:16, 4:3, 3:4, 3:2, 2:3, auto |
sourceImage |
string | — | Base64 image for image-to-video |
sourceFilename |
string | — | Existing generated file for image-to-video |
referenceImages |
string[] | — | Base64 images for reference-to-video |
referenceFilenames |
string[] | — | Existing generated files for reference-to-video |
continueFromVideo |
string | — | Generated .mp4 parent; server extracts its last frame and rebuilds lineage from sidecar |
continuityLineage |
object | — | Optional client hint; used only when continueFromVideo is absent |
plannerModel |
string | grok-4.3 |
Grok video planner model override (also via settings UI or IMA2_GROK_PLANNER_MODEL) |
storyboard |
boolean | false |
Enable storyboard mode — maintains character/scene continuity across sequential clips |
Blank prompts return PROMPT_REQUIRED with a guidance string. The active
prompt should describe visual flow, motion flow, sound/music/no-music,
dialogue/no-dialogue, ending frame, and duration pacing. The video planner uses
the selected duration as the full clip runtime and expands short requests into a
production-level sequence with opening composition, connected motion/emotion
change, and a stable ending frame suitable for continuation. For multi-character
scenes, the planner identifies speakers by visual appearance (clothing, physique,
position, props) rather than names, and attributes each dialogue line accordingly.
When continueFromVideo is present, the server treats the generated .mp4
sidecar as authoritative. Client continuityLineage cannot override it. The
saved child sidecar includes videoContinuity, a branch-local max-4 stack using
keep-start-plus-latest-3 retention.
videoContinuity shape:
{
"lineageId": "lineage:parent",
"parentFilename": "parent.mp4",
"sourceFrame": "last",
"maxEntries": 4,
"retention": "keep-start-plus-latest-3",
"entries": [
{
"id": "clip:parent.mp4",
"ordinal": 1,
"role": "start",
"filename": "parent.mp4",
"userPrompt": "original user prompt",
"revisedPrompt": "planner prompt actually sent to Grok video",
"createdAt": 1780300000000
}
]
}Entry role is start, ancestor, parent, or current. The first clip is
kept as the start anchor; later generations keep only the latest three entries.
lineageId uses the generated video basename without the .mp4 extension.
This metadata is stored in the generated .mp4.json sidecar and returned in
history rows and video done events; /generated/*.json remains private.
Grok prompt surfaces used by video APIs:
| Surface | Model | Responsibility |
|---|---|---|
| Video planner | grok-4.3 (override via plannerModel) |
Converts user prompt, search context, refs, and optional continuity lineage into the final English video prompt. It must structure core subject, action/motion, camera/composition, environment/style, dialogue/audio, ending-frame handoff, and constraints. Multi-character dialogue uses appearance-based speaker identification. |
| Video generation | xAI video model | Receives the planner prompt plus sourceImage or referenceImages when present. |
| Video analysis | grok-4.3 |
Reads first/last frame images from /api/video/analyze and returns recreation/continuation guidance. |
SSE events:
| Event | Data | Description |
|---|---|---|
planning |
{ requestId } |
Preparing video generation |
submitted |
{ requestId, xaiVideoRequestId, requestedModel, effectiveModel, modelFallback } |
Submitted to xAI |
progress |
{ requestId, progress, stalled } |
Progress 0.0–1.0 |
done |
{ requestId, filename, url, mediaType, revisedPrompt, elapsed, usage, requestedModel, effectiveModel, modelFallback, video, videoContinuity } |
Video ready |
error |
{ error, code, status, requestId, guidance? } |
Generation failed |
Video error codes:
| Code | Meaning |
|---|---|
VIDEO_PROVIDER_UNSUPPORTED |
Provider is not "grok" |
PROMPT_REQUIRED |
Empty or missing prompt |
INVALID_GROK_VIDEO_MODEL |
Model not in valid set |
INVALID_VIDEO_RESOLUTION |
Resolution not 480p or 720p |
INVALID_VIDEO_ASPECT_RATIO |
Aspect ratio not in valid set |
INVALID_VIDEO_DURATION |
Duration not 1–15 integer |
GROK_VIDEO_REF_TOO_MANY |
More than 7 reference images |
GROK_VIDEO_FAILED |
Upstream xAI video generation failed |
GROK_VIDEO_FRAME_FAILED |
Server could not extract the parent video's last frame |
Edit an existing video via Grok V2V. This is a blocking JSON endpoint that starts the xAI edit job, polls it, downloads the final MP4, and saves it as a generated video artifact.
{
"prompt": "make it sunset",
"videoUrl": "https://vidgen.x.ai/.../clip.mp4",
"model": "grok-imagine-video"
}videoUrl may be an HTTPS video URL, xAI file_id, data:video/* URL, or generated .mp4 filename. Generated-file inputs are restricted to real .mp4 files under the generated directory.
Extend a video from its last frame. This is a blocking JSON endpoint that starts the xAI extension job, polls it, downloads the combined output MP4, and saves it as a generated video artifact.
{
"prompt": "camera pulls back",
"videoUrl": "1780226256355_50252101.mp4",
"duration": 6,
"model": "grok-imagine-video"
}duration must be an integer from 2 to 10 seconds. Edit and extension support grok-imagine-video only; grok-imagine-video-1.5-preview is not accepted for these endpoints.
Extract a PNG frame from a generated .mp4 file.
| Query | Notes |
|---|---|
file |
Required generated .mp4 filename or generated-dir absolute path |
position |
last (default) or non-negative seconds |
Analyze first and last frames from a generated .mp4 using Grok 4.3 image understanding. This does not upload the video as temporal video; it extracts two PNG frames and asks the vision model to infer likely motion.
{
"videoUrl": "1780226256355_50252101.mp4"
}Remote URLs and data: inputs are intentionally rejected to avoid server-side URL fetching through ffmpeg.
| Method | Path | Notes |
|---|---|---|
GET |
/api/history |
List generated assets |
GET |
/api/history?groupBy=session |
Group assets by session title |
DELETE |
/api/history/:filename |
Tombstone a generated asset |
POST |
/api/history/:filename/restore |
Restore a recently deleted asset |
History rows can include node metadata such as sessionId, nodeId, clientNodeId, requestId, and refsCount.
| Method | Path | Notes |
|---|---|---|
GET |
/api/sessions |
List graph sessions |
POST |
/api/sessions |
Create a session |
GET |
/api/sessions/:id |
Load a session and graph |
PATCH |
/api/sessions/:id |
Rename a session |
DELETE |
/api/sessions/:id |
Delete a session |
PUT |
/api/sessions/:id/graph |
Save graph snapshot |
PUT /api/sessions/:id/graph requires an If-Match header containing the current graph version.
Version mismatch returns GRAPH_VERSION_CONFLICT and the current version. This only means the client saved against a stale graph version; it is not proof that another browser tab changed the graph.
Graph save requests may include observability headers:
X-Ima2-Graph-Save-Id
X-Ima2-Graph-Save-Reason
X-Ima2-Tab-Id
| Method | Path | Notes |
|---|---|---|
GET |
/api/sessions/:id/style-sheet |
Load session style sheet |
PUT |
/api/sessions/:id/style-sheet |
Save style sheet |
PATCH |
/api/sessions/:id/style-sheet/enabled |
Toggle style sheet usage |
POST |
/api/sessions/:id/style-sheet/extract |
Extract style fields from prompt/reference |
Style-sheet extraction can require an API key/openai client. Image generation also supports provider: "api" through the shared Responses API image adapter when an API key is configured.
| Code | Meaning |
|---|---|
API_KEY_REQUIRED |
provider: "api" was requested without a configured API key |
APIKEY_DISABLED |
Legacy/deprecated hard-block code from older builds |
INVALID_IMAGE_MODEL |
Model name is unknown or unsupported |
IMAGE_MODEL_UNSUPPORTED |
Model exists but cannot use image generation |
INVALID_REQUEST |
Upstream request parameters are invalid; raw provider details may be included as upstreamCode, upstreamType, and upstreamParam |
INVALID_MODERATION |
Moderation value is not auto or low |
SAFETY_REFUSAL |
Upstream safety refusal |
MODERATION_REFUSED |
Content generation refused by moderation |
AUTH_CHATGPT_EXPIRED |
Codex/ChatGPT OAuth session expired |
AUTH_API_KEY_INVALID |
API key is invalid, revoked, out of quota, or wrong org |
NETWORK_FAILED |
Network, proxy, VPN, or firewall failure |
OAUTH_UNAVAILABLE |
Local OAuth proxy is not available |
OPEN_GENERATED_DIR_FAILED |
The server could not open the generated image folder |
GRAPH_VERSION_REQUIRED |
Missing graph If-Match header |
GRAPH_VERSION_CONFLICT |
Stale graph version |
GRAPH_TOO_LARGE |
Graph exceeds node/edge limits |
NODE_NOT_FOUND |
Node metadata was not found |
INVALID_GROK_IMAGE_MODEL |
A Grok request used a model outside grok-imagine-image or grok-imagine-image-quality |
GROK_RATE_LIMITED |
xAI returned a rate-limit response through progrok |
GROK_AUTH_FAILED |
progrok could not authenticate the xAI request |
GROK_SEARCH_TIMEOUT / GROK_PLANNER_TIMEOUT / GROK_IMAGE_TIMEOUT |
The Grok search, planner, or image API step exceeded its timeout budget |
AGY_GENERATION_FAILED |
Gemini (agy) image generation failed |
AGY_TIMEOUT |
Agy CLI process exceeded its 360-second timeout |
AGY_PROCESS_ERROR |
Agy CLI binary failed to start or crashed |
AGY_QUOTA_EXHAUSTED |
Gemini API quota exhausted (rate limit) |
AGY_PARSE_FAILED |
Could not parse artifact path from agy output |
AGY_ARTIFACT_NOT_FOUND |
Agy reported an artifact path that does not exist |
AGY_PATH_REJECTED |
Agy artifact path was outside allowed directories |
AGY_VIDEO_UNSUPPORTED |
Video generation is not supported by the Gemini (agy) provider |
AGY_MASK_UNSUPPORTED |
Mask-based editing is not supported by the Gemini (agy) provider |
AGY_REF_TOO_MANY |
Too many reference images for agy (max 3) |
GEMINI_API_KEY_MISSING |
Gemini API key or Vertex AI credentials not configured |
GEMINI_API_RATE_LIMITED |
Gemini API rate limited (429) |
GEMINI_API_BAD_REQUEST |
Gemini API bad request (400/403) |
GEMINI_API_SAFETY_BLOCKED |
Gemini API generation blocked by safety filter |
GEMINI_API_NO_IMAGE |
Gemini API returned no image in response |
VIDEO_PROVIDER_UNSUPPORTED |
Video generation requires provider "grok" or "grok-api" |
SSE_CAPACITY |
More than 512 concurrent GET /api/events listeners |
REQUEST_ID_IN_USE |
Async POST used a requestId that already has an active job |
TOO_MANY_JOBS |
More than the configured concurrent active generation job limit (Retry-After: 5; default 24) |
API key management endpoints for configuring provider credentials at runtime through the web UI or HTTP API.
| Endpoint | Method | Description |
|---|---|---|
/api/keys/status |
GET | Returns configured/valid/maskedKey status for all providers (openai, xai, gemini, vertex) plus geminiAuthMode ("apikey" or "vertex") |
/api/keys/:provider |
PUT | Save an API key. Body: { "apiKey": "..." }. Validates key format and upstream before saving to config.json. Provider: openai, xai, or gemini. |
/api/keys/:provider |
DELETE | Remove a config-sourced API key. Env-sourced keys cannot be removed (ENV_KEY_IMMUTABLE). |
/api/keys/vertex |
PUT | Save a Vertex AI service account JSON. Body: { "serviceAccountJson": "..." }. Validates JSON structure (type: "service_account", project_id required). |
/api/keys/vertex |
DELETE | Remove a config-sourced Vertex AI service account. |
/api/keys/gemini-auth-mode |
PUT | Persist the Gemini auth mode chosen in the settings dropdown. Body: { "mode": "apikey" | "vertex" }. Saved to config.json and hot-updated. |
Keys saved via PUT are stored in config.json and hot-updated in the runtime context (no server restart required). Keys loaded from environment variables (OPENAI_API_KEY, XAI_API_KEY, GEMINI_API_KEY, VERTEX_SERVICE_ACCOUNT_JSON) take precedence and are immutable through the API.
| Endpoint | Method | Description |
|---|---|---|
/api/history/backfill-thumbnails |
POST | Generate missing .thumb.jpg thumbnails for all images and videos in the generated directory. Returns { ok, total, created, skipped, failed }. Also available offline via ima2 backfill-thumbs. |
Thumbnails are also generated automatically on server startup for any media files that lack them.
Most server routes under /api/* have a CLI wrapper. The exception is Agent Mode (/api/agent/*), which is server + web-UI-only and has no ima2 subcommand. The prompt builder HTTP route (POST /api/prompt-builder/chat) is wrapped by ima2 prompt build. Use this table to find the command that calls a given endpoint. (See README.md "Client" section for full flag lists.)
| Endpoint | CLI |
|---|---|
POST /api/generate |
ima2 gen |
POST /api/edit |
ima2 edit |
POST /api/generate/multimode (SSE) |
ima2 multimode |
POST /api/video/generate (SSE) |
ima2 video |
POST /api/video/generate with continueFromVideo |
ima2 video continue |
POST /api/video/edit |
ima2 video edit |
POST /api/video/extend |
ima2 video extend |
GET /api/video/frame |
ima2 video frame |
POST /api/video/analyze |
ima2 video analyze |
POST /api/node/generate (SSE) / GET /api/node/:id |
ima2 node generate / ima2 node show |
GET /api/history |
ima2 ls |
DELETE /api/history/:name / …/permanent |
ima2 history rm [--permanent] |
POST /api/history/:filename/restore |
ima2 history restore --trash-id |
POST /api/history/favorite |
ima2 history favorite |
POST /api/history/import-local |
ima2 history import |
POST /api/metadata/read |
ima2 metadata / ima2 show --metadata |
GET/POST/PUT/DELETE /api/sessions[/…] |
ima2 session ls/show/create/rm/rename |
GET/PUT /api/sessions/:id/graph |
ima2 session graph load/save |
GET/PUT /api/sessions/:id/style-sheet[/…] |
ima2 session style-sheet … |
GET/PUT/DELETE /api/annotations/:name |
ima2 annotate get/set/rm |
POST /api/canvas-versions / PUT /api/canvas-versions/:name |
ima2 canvas-versions save/update |
GET/POST/PUT/DELETE /api/prompts[/…] |
ima2 prompt … |
GET/POST/PATCH/DELETE /api/prompts/folders[/…] |
ima2 prompt folder … |
…/api/prompts/import/… |
ima2 prompt import sources/refresh/curated/discovery/folder |
…/api/cardnews/… (gated on features.cardNews) |
ima2 cardnews … |
POST /api/comfy/export-image |
ima2 comfy export |
GET /api/inflight / DELETE /api/inflight/:id |
ima2 inflight ls (alias ps) / ima2 inflight rm (alias cancel) |
GET /api/events (SSE multiplex) |
Web UI only (persistent EventSource; no CLI wrapper) |
GET /api/storage/status / POST /api/storage/open-generated-dir |
ima2 storage status / ima2 storage open |
GET /api/billing / GET /api/providers / GET /api/oauth/status / GET /api/grok/status |
ima2 billing / ima2 providers / ima2 oauth status / ima2 grok status |
GET /api/quota |
ima2 billing (includes Grok usedUsd/limitUsd) |
POST /api/auth/switch / GET /api/auth/switch/:sessionId |
Web UI only (Settings > QuotaCard > Switch Account) |
GET /api/health |
ima2 ping |
GET /api/capabilities |
ima2 capabilities |
GET /api/config/grok-planner |
— (Grok planner model query) |
PATCH /api/config/grok-planner |
— (Grok planner model update) |
GET /api/agy/status |
— (Antigravity CLI install status) |
POST /api/history/backfill-thumbnails |
ima2 backfill-thumbs |
GET /api/keys/status, PUT/DELETE /api/keys/:provider, PUT/DELETE /api/keys/vertex |
Web UI only (Settings > API Keys) |
GET/POST/PATCH/DELETE /api/agent/* (sessions, turns, queue) |
— (Agent Mode; web UI only, no CLI) |
POST /api/prompt-builder/chat |
ima2 prompt build |
Notes:
ima2 history favoriteandima2 annotate …sendX-Ima2-Browser-Id: cli-<sha1prefix>derived from the config dir, so CLI activity does not collide with browser sessions.ima2 session graph saveperforms a GET-then-PUT withIf-Match: "<version>"to guard againstGRAPH_VERSION_CONFLICT.ima2 history importandima2 canvas-versions save/updatesend raw bytes withContent-Type: image/<png|jpeg|webp>; the SSE endpoints (multimode,node generate,video) useAccept: text/event-stream. The web UI instead usesGET /api/eventsplusasync: trueon POST routes.ima2 cardnews …checksruntimeConfig.features.cardNewsbefore calling the gated endpoints; when disabled the CLI exits 2 with a clear message instead of producing a 404.
The server writes an advertisement file at:
~/.ima2/server.json
CLI commands such as ima2 ping, ima2 gen, and ima2 ls use this file unless --server or IMA2_SERVER is provided.
Current shape:
{
"port": 3334,
"url": "http://localhost:3334",
"pid": 12345,
"startedAt": 1777180000000,
"version": "1.0.0",
"backend": {
"configuredPort": 3333,
"actualPort": 3334,
"url": "http://localhost:3334"
},
"oauth": {
"configuredPort": 10531,
"actualPort": 10532,
"url": "http://127.0.0.1:10532",
"status": "ready"
}
}Top-level port and url are kept for older CLI clients. New code should prefer backend.url.