
API Reference

This document lists the local HTTP API exposed by ima2 serve.

Base URL:

http://localhost:3333

Provider Policy

Image generation supports OAuth, API-key, Grok (grok and grok-api), and Gemini (agy and gemini-api) providers.

provider: "oauth" uses the local Codex OAuth proxy.
provider: "api" uses the OpenAI Responses API with the hosted image_generation tool.
provider: "grok" uses the bundled progrok xAI proxy. Classic, Node, and Agent generation run mandatory xAI Web Search through /v1/responses, then run a grok-4.3 planner call with a forced local generate_image function, then ima2 executes xAI /v1/images/generations. If reference images, a Node parent image, or an Agent current image are attached, the final step switches to xAI /v1/images/edits so image-to-image context is preserved.
provider: "agy" spawns the Antigravity CLI (agy -p) to generate images via Google Gemini's default_api:generate_image tool. Model is nano-banana-2. Output is fixed at 1024×1024 JPEG. Max 3 reference images (i2i). No web search, quality, size, or mask controls. Multimode returns a single image. Video is unsupported (AGY_VIDEO_UNSUPPORTED).
provider: "grok-api" uses a direct xAI API key instead of the bundled progrok OAuth proxy. Same pipeline as grok (Web Search → planner → /v1/images/generations), same aspect ratio and resolution options. Requires an xAI API key configured via the web UI key management or XAI_API_KEY env var. Also supports video generation.
provider: "gemini-api" calls the Google Generative Language API directly (or Vertex AI with a service account JSON). Supports models nano-banana-2 (Gemini 3.1 Flash Image) and nano-banana-pro (Gemini 3 Pro Image). Supports variable aspect ratios (1:1 through 21:9) and four resolution tiers (512px, 1K, 2K, 4K) on both auth paths — the direct API path sends generation_config.response_format.image (snake_case) while the Vertex AI endpoint (aiplatform.googleapis.com) sends generationConfig.imageConfig (camelCase). With size: "auto" the image config is omitted entirely and the model decides ratio/size. Auth: GEMINI_API_KEY env var, web UI key management (/api/keys/gemini), or a Vertex AI service account JSON (VERTEX_SERVICE_ACCOUNT_JSON or /api/keys/vertex). When both Vertex credentials and an API key are configured, Vertex takes priority. The chosen auth mode (apikey or vertex) persists to ~/.ima2/config.json as geminiAuthMode and is restored on server startup. Per-model cost: nano-banana-2 (Flash): 512=$0.001, 1K=$0.003, 2K=$0.004, 4K=$0.006; nano-banana-pro: 1K=$0.007, 2K=$0.007, 4K=$0.013. No web search or mask controls.
API-key generation covers classic generate, edit, mask-guided edit, multimode, and node generation.
If provider: "api" is requested without an API key, routes fail before upstream with 401 and API_KEY_REQUIRED.
Grok generation maps size to xAI aspect_ratio and resolution; it does not send an OpenAI-style size field upstream. Grok edit uses xAI /v1/images/edits; Grok mask edit remains unsupported and returns GROK_MASK_UNSUPPORTED.
Mask edits are mask/selection guided edits, not pixel-perfect inpaint guarantees.
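
The per-model prices listed for gemini-api above can be kept in a small lookup table for client-side estimates. The sketch below is illustrative only: the function name, the tier keys, and the choice to return None for unlisted model/tier pairs are assumptions, not ima2 behavior.

```python
from typing import Optional

# Per-image prices (USD) copied from the gemini-api provider notes above.
GEMINI_IMAGE_COST_USD = {
    "nano-banana-2": {"512": 0.001, "1K": 0.003, "2K": 0.004, "4K": 0.006},
    "nano-banana-pro": {"1K": 0.007, "2K": 0.007, "4K": 0.013},
}


def estimate_gemini_cost(model: str, tier: str, count: int = 1) -> Optional[float]:
    """Return the estimated cost for `count` images, or None if no price is listed."""
    price = GEMINI_IMAGE_COST_USD.get(model, {}).get(tier)
    if price is None:
        return None  # Assumption: unlisted model/tier pairs have no known price.
    return round(price * count, 6)
```

For example, estimate_gemini_cost("nano-banana-pro", "512") returns None because no 512 price is listed for that model.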

Grok video generation uses POST /api/video/generate (SSE). See the Video Generation section below for the full endpoint specification.

Health And Status

Method	Path	Notes
`GET`	`/api/health`	Server health, version, paths, provider policy
`GET`	`/api/providers`	Provider availability and runtime ports
`GET`	`/api/oauth/status`	OAuth proxy status and visible models
`GET`	`/api/grok/status`	Bundled progrok status and visible xAI image models
`GET`	`/api/billing`	Billing/status probe, including API key source when configured
`GET`	`/api/quota`	Provider quota: returns `{ codex, grok }`. Grok result includes `billing: { usedUsd, limitUsd }` and a `monthly` percent window drawn from the xAI billing API.

Account Switching

Method	Path	Notes
`POST`	`/api/auth/switch`	Start a device-code OAuth flow. Body: `{ "provider": "grok" \| "codex" }`. Returns `{ sessionId, userCode, verificationUrl }`.
`GET`	`/api/auth/switch/:sessionId`	Poll switch-account session status. Returns `{ status }` where status is `pending`, `complete`, `error`, or `expired`.

The Switch Account flow opens a browser verification URL. Once the user completes the device-code step, the server saves the new credentials (Grok: ~/.progrok/auth.json; Codex: via codex login --device-auth) and the session transitions to complete. These endpoints are surfaced as a Switch Account button in the Settings QuotaCard for the Grok and Codex providers.
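
A client that drives this flow itself would poll the session until it leaves pending. The following is a minimal sketch of that loop; the polling interval, the attempt budget, and the injected fetch_status callable are assumptions chosen for illustration, not documented behavior.

```python
import time
from typing import Callable

# Terminal values of `status` as listed for GET /api/auth/switch/:sessionId.
TERMINAL_STATUSES = {"complete", "error", "expired"}


def poll_switch_session(
    fetch_status: Callable[[], str],
    interval_s: float = 2.0,      # Assumed interval; not specified by the API.
    max_attempts: int = 150,      # Assumed budget; not specified by the API.
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll until the session leaves `pending` or the attempts run out.

    `fetch_status` is a caller-supplied function that performs
    GET /api/auth/switch/:sessionId and returns the `status` field.
    """
    for _ in range(max_attempts):
        status = fetch_status()
        if status in TERMINAL_STATUSES:
            return status
        sleep(interval_s)
    return "pending"  # Still pending; the caller decides whether to keep waiting.
```

Injecting fetch_status and sleep keeps the loop testable without a running server.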

Storage

Method	Path	Notes
`GET`	`/api/storage/status`	Summarized gallery storage status for support UI
`POST`	`/api/storage/open-generated-dir`	Ask the server process to open the generated image folder

GET /api/storage/status returns a support-safe summary, not raw legacy path arrays by default.

{
  "ok": true,
  "data": {
    "generatedDirLabel": "~/.ima2/generated",
    "generatedCount": 0,
    "legacyCandidatesScanned": 18,
    "legacySourcesFound": 0,
    "legacyFilesFound": 0,
    "state": "not_found",
    "messageKind": "apology",
    "recoveryDocsPath": "docs/RECOVER_OLD_IMAGES.md",
    "doctorCommand": "ima2 doctor",
    "overrides": {
      "generatedDir": false,
      "configDir": false
    }
  }
}

Storage state values:

State	Meaning
`ok`	Current gallery has files or no recovery notice is needed
`recoverable`	Legacy folders/files are still present and may be recoverable
`not_found`	Current gallery is empty and no legacy folder was found
`unknown`	Storage status inspection failed or was incomplete

POST /api/storage/open-generated-dir opens the generated image folder on the machine running ima2 serve. If the browser is connected to a remote server, VM, container, WSL instance, or another computer on the network, this action targets that server machine, not necessarily the browser device.

In-Flight Jobs

Method	Path	Notes
`GET`	`/api/inflight`	Active jobs only by default
`GET`	`/api/inflight?includeTerminal=1`	Includes recent terminal jobs for debugging
`DELETE`	`/api/inflight/:requestId`	Cancel or forget an active job
`GET`	`/api/events`	Persistent SSE multiplex channel for all async generation progress (see below)

In-flight logs and responses use requestId for correlation. Logs should not include raw prompts, reference data URLs, generated base64, tokens, cookies, auth headers, or raw upstream bodies.

Events (SSE Multiplexing)

`GET /api/events` (SSE Multiplexing)

Single persistent Server-Sent Events channel that carries progress for all async generation jobs. The browser UI opens one EventSource here instead of holding a per-request SSE connection for each job, avoiding browser per-origin connection limits.

Query	Notes
`lastEventId`	Optional. Reconnect cursor; also accepted via the `Last-Event-ID` request header

Response: text/event-stream (persistent). Each frame uses standard SSE fields id, event, and data (JSON).

Connection limits: When active listeners reach 512, the server returns 503 with SSE_CAPACITY before opening the stream.

Heartbeat: Every 15 seconds the server writes a comment frame:

: ping

Replay: On reconnect, the server replays events from an in-memory ring buffer (size 2000) for IDs newer than lastEventId. Large image payloads (>1000 characters) are omitted from replay with _imageOmitted: true in the data payload. If the requested ID is older than the oldest buffered event, the server emits a replay-gap event before live fan-out:

Event	Data	Description
`replay-gap`	`{ lastEventId, oldestAvailableId }`	Client should reconcile inflight state (for example via `GET /api/inflight`)

Job routing: Every data payload includes jobId (same value as the job's requestId). Event bodies also carry requestId where applicable. Clients filter events by matching data.jobId or data.requestId to the job they started.

Event types (fan-out to all connected clients):

Event	Emitted by	Description
`phase`	node, multimode, video	Lifecycle phase change
`partial`	node, multimode	Progressive preview image (base64 data URL)
`image`	multimode	Final saved `GenerateItem` for one sequence image
`done`	node, multimode, video	Terminal success payload (route-specific shape)
`error`	all generation routes	Terminal failure
`submitted`	video	Job submitted to xAI
`progress`	video	Progress fraction 0.0–1.0
`planning`	video	Video planner running

Example SSE frame:

id: 42
event: phase
data: {"requestId":"req_abc","jobId":"req_abc","phase":"streaming"}
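
To illustrate the framing and job routing described above, here is a minimal parsing sketch. It works on an already-received text buffer rather than a live connection, and the function names are hypothetical; it skips comment frames such as the heartbeat and filters frames by data.jobId or data.requestId.

```python
import json
from typing import Dict, Iterator, List


def parse_sse(stream_text: str) -> Iterator[Dict[str, object]]:
    """Yield one dict per SSE frame with `id`, `event`, and decoded JSON `data`."""
    for frame in stream_text.replace("\r\n", "\n").split("\n\n"):
        fields: Dict[str, str] = {}
        data_lines: List[str] = []
        for line in frame.split("\n"):
            if not line or line.startswith(":"):
                continue  # Blank line or comment frame such as the ": ping" heartbeat.
            name, _, value = line.partition(":")
            value = value[1:] if value.startswith(" ") else value
            if name == "data":
                data_lines.append(value)
            else:
                fields[name] = value
        if not data_lines:
            continue  # Heartbeat-only or empty frame.
        yield {
            "id": fields.get("id"),
            "event": fields.get("event", "message"),
            "data": json.loads("\n".join(data_lines)),
        }


def events_for_job(stream_text: str, job_id: str) -> List[Dict[str, object]]:
    """Keep only frames whose data.jobId or data.requestId matches `job_id`."""
    matched = []
    for frame in parse_sse(stream_text):
        data = frame["data"]
        if isinstance(data, dict) and job_id in (data.get("jobId"), data.get("requestId")):
            matched.append(frame)
    return matched
```

A real client would read the stream incrementally and remember the last id it saw so it can reconnect with lastEventId; this sketch leaves that out for brevity.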

Async generation mode

POST /api/node/generate, POST /api/generate/multimode, and POST /api/video/generate support an async POST mode for clients that already hold GET /api/events:

{
  "async": true,
  "requestId": "req_xxx",
  "...": "other route fields"
}

Outcome	HTTP	Body
Accepted	`202`	`{ "requestId": "req_xxx" }`
Duplicate active `requestId`	`409`	`REQUEST_ID_IN_USE`
More than the configured concurrent active job limit	`429`	`TOO_MANY_JOBS` with `Retry-After: 5`; default limit is `24` via `IMA2_MAX_PARALLEL`

Progress events are published on GET /api/events. The POST response returns immediately; clients must not expect SSE on the POST connection when async: true.

CLI and legacy clients omit async and keep the original behavior: per-request SSE on the same POST response (Accept: text/event-stream where applicable). The server dual-emits in that mode — it writes SSE to the POST response and also publishes the same events on GET /api/events.
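
The async outcomes above map naturally onto a small client-side decision function. The sketch below is one way to express that; the action names (listen, new_request_id, retry, fail) are hypothetical labels, and only the status codes, error codes, and Retry-After value come from this document.

```python
from typing import Dict, Mapping, Optional


def build_async_body(request_id: str, route_fields: Mapping[str, object]) -> Dict[str, object]:
    """Merge route-specific fields with the async flag and a client-chosen requestId."""
    body = dict(route_fields)
    body["async"] = True
    body["requestId"] = request_id
    return body


def interpret_async_response(
    status: int, headers: Optional[Mapping[str, str]] = None
) -> Dict[str, object]:
    """Map the documented HTTP outcomes to a client action (action names are illustrative)."""
    lowered = {k.lower(): v for k, v in (headers or {}).items()}
    if status == 202:
        return {"action": "listen"}  # Follow progress on GET /api/events.
    if status == 409:
        return {"action": "new_request_id"}  # REQUEST_ID_IN_USE
    if status == 429:
        # TOO_MANY_JOBS; the documented Retry-After is 5 seconds.
        return {"action": "retry", "after_s": int(lowered.get("retry-after", "5"))}
    return {"action": "fail"}
```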

Generation

`POST /api/generate`

Text-to-image and reference-guided root generation.

{
  "prompt": "a shiba in space",
  "quality": "medium",
  "size": "1024x1024",
  "format": "png",
  "moderation": "low",
  "provider": "oauth",
  "model": "gpt-5.4",
  "references": [],
  "requestId": "optional-client-id",
  "storyboard": false
}

Supported quality values: low, medium, high.

Supported moderation values: auto, low.

When storyboard is true, the server prepends storyboard keyframe instructions so image generations maintain character and scene continuity for multi-shot video production.
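
As a rough illustration of the request shape, the helper below assembles a body for POST /api/generate and rejects unsupported quality and moderation values before sending. The helper name and its defaults are assumptions; the defaults mirror the example request above rather than any server-side default.

```python
from typing import Dict, List, Optional

SUPPORTED_QUALITY = {"low", "medium", "high"}
SUPPORTED_MODERATION = {"auto", "low"}


def build_generate_payload(
    prompt: str,
    provider: str = "oauth",
    model: str = "gpt-5.4",
    quality: str = "medium",
    size: str = "1024x1024",
    image_format: str = "png",
    moderation: str = "low",
    references: Optional[List[str]] = None,
    storyboard: bool = False,
) -> Dict[str, object]:
    """Assemble a JSON body for POST /api/generate.

    Defaults mirror the example request, not necessarily the server defaults.
    """
    if quality not in SUPPORTED_QUALITY:
        raise ValueError(f"unsupported quality: {quality!r}")
    if moderation not in SUPPORTED_MODERATION:
        raise ValueError(f"unsupported moderation: {moderation!r}")
    return {
        "prompt": prompt,
        "quality": quality,
        "size": size,
        "format": image_format,
        "moderation": moderation,
        "provider": provider,
        "model": model,
        "references": list(references or []),
        "storyboard": storyboard,
    }
```

The optional requestId field is omitted here; a caller that wants client-side correlation can add it to the returned dict.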

Recommended model: gpt-5.4. Current app default: gpt-5.4-mini. gpt-5.5 is the strongest quality option when supported, but callers should expect higher quota pressure and possible Codex CLI/backend capability requirements.

When provider is "grok", supported models are grok-imagine-image and grok-imagine-image-quality. The server uses grok-4.3 as the search/planner model by default (IMA2_GROK_PLANNER_MODEL) and times the mandatory search and planner steps separately from the image call (IMA2_GROK_PLANNER_TIMEOUT_MS). For n > 1, search and planning run once and the planned prompt is reused for the image requests. Successful Grok classic generations report one mandatory web-search call in metadata.

If references are present on a Grok classic request, ima2 still performs the mandatory search and grok-4.3 planning phases. The planner receives the reference images as multimodal image_url inputs, and its forced generate_image.prompt argument is instructed to be English-only except for exact visible text requested by the user. The final image call then uses xAI /v1/images/edits with the same reference images instead of /v1/images/generations. This keeps image-to-image/reference context alive through the three-phase pipeline. xAI currently documents up to three source images for image editing, so Grok classic requests with more than three references return GROK_REF_TOO_MANY.

Grok size mapping:

Requested size	xAI `aspect_ratio`	xAI `resolution`
`1024x1024`	`1:1`	`1k`
`1536x1024`	`3:2`	`1k`
`1024x1536`	`2:3`	`1k`
`1360x1024`	`4:3`	`1k`
`1024x1360`	`3:4`	`1k`
`1824x1024`	`16:9`	`1k`
`1024x1824`	`9:16`	`1k`
`2048x2048`	`1:1`	`2k`
`2048x1152`	`16:9`	`2k`
`1152x2048`	`9:16`	`2k`
`3840x2160`	`16:9`	`2k`
`2160x3840`	`9:16`	`2k`
`auto`	`auto`	omitted

Custom sizes are reduced to the closest xAI-supported aspect ratio and use 2k when the requested longest edge or pixel budget is closer to a 2K image.
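
The table lookup and the closest-ratio reduction can be sketched as follows. This is not the server's implementation: the candidate ratio list is taken from the table above, the log-ratio distance is an assumed metric, and the 1k/2k decision for custom sizes is deliberately left out because its exact threshold is not specified here.

```python
import math
from typing import Optional, Tuple

# Known sizes copied from the mapping table: size -> (aspect_ratio, resolution).
GROK_SIZE_TABLE = {
    "1024x1024": ("1:1", "1k"),
    "1536x1024": ("3:2", "1k"),
    "1024x1536": ("2:3", "1k"),
    "1360x1024": ("4:3", "1k"),
    "1024x1360": ("3:4", "1k"),
    "1824x1024": ("16:9", "1k"),
    "1024x1824": ("9:16", "1k"),
    "2048x2048": ("1:1", "2k"),
    "2048x1152": ("16:9", "2k"),
    "1152x2048": ("9:16", "2k"),
    "3840x2160": ("16:9", "2k"),
    "2160x3840": ("9:16", "2k"),
}

# Assumption: the candidate ratios are exactly those that appear in the table.
CANDIDATE_RATIOS = ["1:1", "3:2", "2:3", "4:3", "3:4", "16:9", "9:16"]


def map_grok_size(size: str) -> Tuple[str, Optional[str]]:
    """Look up a listed size; `auto` maps to ("auto", None) because resolution is omitted."""
    if size == "auto":
        return ("auto", None)
    return GROK_SIZE_TABLE[size]  # Raises KeyError for sizes not in the table.


def closest_grok_aspect_ratio(width: int, height: int) -> str:
    """Pick the candidate ratio nearest to width/height on a log scale (assumed metric)."""
    target = math.log(width / height)

    def distance(label: str) -> float:
        w, h = label.split(":")
        return abs(math.log(int(w) / int(h)) - target)

    return min(CANDIDATE_RATIOS, key=distance)
```

Comparing ratios on a log scale treats landscape and portrait symmetrically, so 1920x1080 and 1080x1920 resolve to 16:9 and 9:16 respectively.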

`POST /api/edit`

Image edit / image-to-image generation.

The request includes a prompt and image payload. provider: "api" sends the prompt and image through the shared Responses image adapter. Optional masks are forwarded as mask guidance, not a pixel-perfect edit guarantee.

With provider: "grok", edit requests are sent to xAI /v1/images/edits through the bundled progrok proxy. Masked Grok edits are rejected before upstream with GROK_MASK_UNSUPPORTED.

Grok multimode currently sends each image request directly to the xAI Images API with the mapped aspect_ratio/resolution; it does not run the mandatory search + planner pipeline.

`POST /api/node/generate`

Node-mode generation and child edits.

Body fields:

{
  "parentNodeId": "optional-server-node-id",
  "prompt": "continue this image",
  "quality": "medium",
  "size": "1024x1024",
  "format": "png",
  "moderation": "low",
  "model": "grok-imagine-image",
  "references": [],
  "externalSrc": "optional-history-url",
  "sessionId": "session-id",
  "clientNodeId": "client-node-id",
  "requestId": "request-id",
  "provider": "grok"
}

When parentNodeId is present, the server loads the stored parent node image and uses the edit path. Node-local references are allowed on both root and child/edit nodes; for child/edit nodes the parent image is sent first, then references, then the text prompt.

With provider: "grok", Node Mode uses the same xAI search + grok-4.3 planner + Images API pipeline as classic generation. A parent node image, externalSrc, or extra references are passed to the planner and then to xAI /v1/images/edits; otherwise the final call uses /v1/images/generations. Grok Node requests are capped at three total input images, counting the parent/current image plus references, and return GROK_REF_TOO_MANY before upstream when that limit is exceeded. quality: "high" promotes the final image model to grok-imagine-image-quality.
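
The endpoint choice, the three-image cap, and the quality promotion described above can be summarized in a few lines. This is a sketch under stated assumptions: the function name is hypothetical, and treating a parent image or externalSrc as a single "current" image is an interpretation of the paragraph above, not confirmed behavior.

```python
from typing import Dict, List, Optional

GROK_MAX_INPUT_IMAGES = 3


def plan_grok_node_call(
    has_parent_image: bool,
    references: List[str],
    quality: str = "medium",
    model: str = "grok-imagine-image",
    external_src: Optional[str] = None,
) -> Dict[str, str]:
    """Pick the final xAI endpoint and image model for a Grok Node request."""
    # Assumption: a parent image or externalSrc counts as one "current" image.
    current_images = 1 if (has_parent_image or external_src) else 0
    total_inputs = current_images + len(references)
    if total_inputs > GROK_MAX_INPUT_IMAGES:
        return {"error": "GROK_REF_TOO_MANY"}
    # Any input image switches the final call to the edit endpoint.
    endpoint = "/v1/images/edits" if total_inputs > 0 else "/v1/images/generations"
    if quality == "high":
        model = "grok-imagine-image-quality"
    return {"endpoint": endpoint, "model": model}
```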

The route can stream Server-Sent Events when the client sends Accept: text/event-stream. Possible events include phase, partial, done, and error. Alternatively, send { "async": true, "requestId": "req_xxx" } in the body to receive 202 { requestId } immediately and follow progress on GET /api/events (see Events section).

Grok Node SSE responses do not include Responses API partial image events because the xAI Images API call is synchronous JSON. They still emit phase and done/error events so the Node UI can use the same in-flight lifecycle.

`POST /api/generate/multimode` (SSE)

Multi-image sequence generation. SSE-only on the POST response unless async mode is used.

{
  "prompt": "a story in four panels",
  "maxImages": 4,
  "quality": "medium",
  "size": "1024x1024",
  "format": "png",
  "moderation": "low",
  "model": "gpt-5.4",
  "provider": "oauth",
  "references": [],
  "requestId": "optional-client-id",
  "async": false
}

Send Accept: text/event-stream for per-request SSE on the POST connection. Or set "async": true with a client requestId to get 202 { requestId } and receive events on GET /api/events.

SSE events:

Event	Data	Description
`phase`	`{ requestId, phase, sequenceId?, maxImages? }`	Lifecycle phase
`partial`	`{ requestId, image, index }`	Progressive preview
`image`	full `GenerateItem`	One saved sequence image
`done`	route-specific summary; may include `status: "partial"` after timeout if at least one image was saved	Sequence complete
`error`	`{ requestId, error, code?, status? }`	Generation failed

`GET /api/node/:nodeId`

Fetch stored node metadata and asset URL.

Reference Images

Reference uploads are capped at 5 items. The frontend compresses large JPEG/PNG files before sending them. HEIC/HEIF files are rejected with a user-facing conversion hint.

Server-side validation may return these reference codes:

Code	Meaning
`REF_NOT_ARRAY`	`references` was not an array
`REF_TOO_MANY`	More than the configured reference count
`REF_NOT_STRING`	A reference item was not a string
`REF_EMPTY`	A reference item was empty
`REF_TOO_LARGE`	A reference exceeded the configured base64 size
`REF_NOT_BASE64`	A reference was not valid base64
`GROK_REF_TOO_MANY`	A Grok classic request had more than three reference images, or a Grok Node request had more than three total input images
`GROK_MASK_UNSUPPORTED`	Grok edit was requested with a mask; xAI mask edit is not wired in this release
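
A client can run similar checks locally before uploading. The validator below is a hedged sketch that returns the first matching code from the table: the check order, the max_b64_chars placeholder, and the assumption that each reference is a raw base64 string (not a data URL) are all illustrative choices.

```python
import base64
import binascii
from typing import Optional


def validate_references(
    references: object,
    max_count: int = 5,                 # Upload cap stated above.
    max_b64_chars: int = 10_000_000,    # Placeholder; the real limit is configured server-side.
) -> Optional[str]:
    """Return the first applicable reference error code, or None if all checks pass."""
    if not isinstance(references, list):
        return "REF_NOT_ARRAY"
    if len(references) > max_count:
        return "REF_TOO_MANY"
    for item in references:
        if not isinstance(item, str):
            return "REF_NOT_STRING"
        if not item.strip():
            return "REF_EMPTY"
        if len(item) > max_b64_chars:
            return "REF_TOO_LARGE"
        try:
            base64.b64decode(item, validate=True)
        except (binascii.Error, ValueError):
            return "REF_NOT_BASE64"
    return None
```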

Video Generation

`POST /api/video/generate` (SSE)

Generate a video via the Grok video provider. Returns Server-Sent Events on the POST connection, or accepts async mode ({ "async": true, "requestId": "req_xxx" }) for 202 { requestId } with progress on GET /api/events (see Events section).

{
  "prompt": "a cat playing piano",
  "provider": "grok",
  "model": "grok-imagine-video",
  "duration": 5,
  "resolution": "480p",
  "aspectRatio": "auto",
  "sourceImage": "<base64>",
  "referenceImages": ["<base64>", "<base64>"],
  "referenceFilenames": ["existing-file.png"],
  "continueFromVideo": "1780226256355_50252101.mp4",
  "continuityLineage": { "lineageId": "optional-client-hint", "entries": [] },
  "sessionId": "optional",
  "requestId": "optional-client-id"
}

Models: grok-imagine-video (default), grok-imagine-video-1.5-preview.

Mode is auto-detected from reference inputs:

Inputs	Mode	Duration cap
No images	text-to-video	1–15s
1 image (`sourceImage` or `sourceFilename`)	image-to-video	1–15s
2–7 images (`referenceImages` / `referenceFilenames`)	reference-to-video	1–10s

Parameters:

Field	Type	Default	Notes
`prompt`	string	—	Required
`provider`	string	`"grok"`	`"grok"` or `"grok-api"`
`model`	string	`grok-imagine-video`	Video model
`duration`	integer	`5`	1–15 seconds (clamped to 10 for reference-to-video)
`resolution`	string	`"480p"`	`480p` or `720p`
`aspectRatio`	string	`"auto"`	1:1, 16:9, 9:16, 4:3, 3:4, 3:2, 2:3, auto
`sourceImage`	string	—	Base64 image for image-to-video
`sourceFilename`	string	—	Existing generated file for image-to-video
`referenceImages`	string[]	—	Base64 images for reference-to-video
`referenceFilenames`	string[]	—	Existing generated files for reference-to-video
`continueFromVideo`	string	—	Generated `.mp4` parent; server extracts its last frame and rebuilds lineage from sidecar
`continuityLineage`	object	—	Optional client hint; used only when `continueFromVideo` is absent
`plannerModel`	string	`grok-4.3`	Grok video planner model override (also via settings UI or `IMA2_GROK_PLANNER_MODEL`)
`storyboard`	boolean	`false`	Enable storyboard mode — maintains character/scene continuity across sequential clips
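
The mode detection and duration clamp can be sketched as a single function over input counts. The following is illustrative only: the validation order is assumed, and a lone reference image (a case the mode table does not cover) is treated here like a source image.

```python
from typing import Dict

MAX_REFERENCE_IMAGES = 7
REFERENCE_MODE_MAX_DURATION = 10


def resolve_video_mode(
    source_count: int, reference_count: int, duration: int = 5
) -> Dict[str, object]:
    """Return the detected mode and effective duration, or a documented error code."""
    if isinstance(duration, bool) or not isinstance(duration, int) or not 1 <= duration <= 15:
        return {"error": "INVALID_VIDEO_DURATION"}
    if reference_count > MAX_REFERENCE_IMAGES:
        return {"error": "GROK_VIDEO_REF_TOO_MANY"}
    if reference_count >= 2:
        # Reference-to-video clamps the requested duration to 10 seconds.
        clamped = min(duration, REFERENCE_MODE_MAX_DURATION)
        return {"mode": "reference-to-video", "duration": clamped}
    if source_count + reference_count >= 1:
        # Assumption: a single reference image behaves like a source image.
        return {"mode": "image-to-video", "duration": duration}
    return {"mode": "text-to-video", "duration": duration}
```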

Blank prompts return PROMPT_REQUIRED with a guidance string. The active prompt should describe visual flow, motion flow, sound/music/no-music, dialogue/no-dialogue, ending frame, and duration pacing. The video planner uses the selected duration as the full clip runtime and expands short requests into a production-level sequence with opening composition, connected motion/emotion change, and a stable ending frame suitable for continuation. For multi-character scenes, the planner identifies speakers by visual appearance (clothing, physique, position, props) rather than names, and attributes each dialogue line accordingly.

When continueFromVideo is present, the server treats the generated .mp4 sidecar as authoritative. Client continuityLineage cannot override it. The saved child sidecar includes videoContinuity, a branch-local max-4 stack using keep-start-plus-latest-3 retention.

videoContinuity shape:

{
  "lineageId": "lineage:parent",
  "parentFilename": "parent.mp4",
  "sourceFrame": "last",
  "maxEntries": 4,
  "retention": "keep-start-plus-latest-3",
  "entries": [
    {
      "id": "clip:parent.mp4",
      "ordinal": 1,
      "role": "start",
      "filename": "parent.mp4",
      "userPrompt": "original user prompt",
      "revisedPrompt": "planner prompt actually sent to Grok video",
      "createdAt": 1780300000000
    }
  ]
}

Entry role is start, ancestor, parent, or current. The first clip is kept as the start anchor; later generations keep only the latest three entries. lineageId uses the generated video basename without the .mp4 extension. This metadata is stored in the generated .mp4.json sidecar and returned in history rows and video done events; /generated/*.json remains private.
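
The keep-start-plus-latest-3 retention can be pictured as a bounded list that always preserves its first element. The sketch below shows that trimming step; the role assignment (first entry start, last entry current, the one before it parent, anything else ancestor) is an assumption, and ordinals are left untouched.

```python
from typing import Dict, List


def push_continuity_entry(
    entries: List[Dict[str, object]],
    new_entry: Dict[str, object],
    max_entries: int = 4,
) -> List[Dict[str, object]]:
    """Return a new stack that keeps the start entry plus the latest max_entries - 1."""
    combined = [dict(e) for e in entries] + [dict(new_entry)]
    if len(combined) > max_entries:
        combined = [combined[0]] + combined[-(max_entries - 1):]
    # Assumed role assignment; the document only lists the possible role values.
    last = len(combined) - 1
    for index, entry in enumerate(combined):
        if index == 0:
            entry["role"] = "start"
        elif index == last:
            entry["role"] = "current"
        elif index == last - 1:
            entry["role"] = "parent"
        else:
            entry["role"] = "ancestor"
    return combined
```

After six pushes the stack holds the first clip and the three most recent ones, which matches the max-4 behavior described above.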

Grok prompt surfaces used by video APIs:

Surface	Model	Responsibility
Video planner	`grok-4.3` (override via `plannerModel`)	Converts user prompt, search context, refs, and optional continuity lineage into the final English video prompt. It must structure core subject, action/motion, camera/composition, environment/style, dialogue/audio, ending-frame handoff, and constraints. Multi-character dialogue uses appearance-based speaker identification.
Video generation	xAI video model	Receives the planner prompt plus `sourceImage` or `referenceImages` when present.
Video analysis	`grok-4.3`	Reads first/last frame images from `/api/video/analyze` and returns recreation/continuation guidance.

SSE events:

Event	Data	Description
`planning`	`{ requestId }`	Preparing video generation
`submitted`	`{ requestId, xaiVideoRequestId, requestedModel, effectiveModel, modelFallback }`	Submitted to xAI
`progress`	`{ requestId, progress, stalled }`	Progress 0.0–1.0
`done`	`{ requestId, filename, url, mediaType, revisedPrompt, elapsed, usage, requestedModel, effectiveModel, modelFallback, video, videoContinuity }`	Video ready
`error`	`{ error, code, status, requestId, guidance? }`	Generation failed

Video error codes:

Code	Meaning
`VIDEO_PROVIDER_UNSUPPORTED`	Provider is not `"grok"` or `"grok-api"`
`PROMPT_REQUIRED`	Empty or missing prompt
`INVALID_GROK_VIDEO_MODEL`	Model not in valid set
`INVALID_VIDEO_RESOLUTION`	Resolution not 480p or 720p
`INVALID_VIDEO_ASPECT_RATIO`	Aspect ratio not in valid set
`INVALID_VIDEO_DURATION`	Duration is not an integer from 1 to 15
`GROK_VIDEO_REF_TOO_MANY`	More than 7 reference images
`GROK_VIDEO_FAILED`	Upstream xAI video generation failed
`GROK_VIDEO_FRAME_FAILED`	Server could not extract the parent video's last frame

`POST /api/video/edit`

Edit an existing video via Grok V2V. This is a blocking JSON endpoint that starts the xAI edit job, polls it, downloads the final MP4, and saves it as a generated video artifact.

{
  "prompt": "make it sunset",
  "videoUrl": "https://vidgen.x.ai/.../clip.mp4",
  "model": "grok-imagine-video"
}

videoUrl may be an HTTPS video URL, xAI file_id, data:video/* URL, or generated .mp4 filename. Generated-file inputs are restricted to real .mp4 files under the generated directory.

`POST /api/video/extend`

Extend a video from its last frame. This is a blocking JSON endpoint that starts the xAI extension job, polls it, downloads the combined output MP4, and saves it as a generated video artifact.

{
  "prompt": "camera pulls back",
  "videoUrl": "1780226256355_50252101.mp4",
  "duration": 6,
  "model": "grok-imagine-video"
}

duration must be an integer from 2 to 10 seconds. Edit and extension support grok-imagine-video only; grok-imagine-video-1.5-preview is not accepted for these endpoints.
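
A client-side pre-check for the extend constraints might look like the sketch below. The function name and the use of ValueError are assumptions; this document does not list specific error codes for these cases.

```python
EDIT_EXTEND_MODEL = "grok-imagine-video"


def validate_extend_request(duration: object, model: str = EDIT_EXTEND_MODEL) -> None:
    """Raise ValueError if the extend request violates the documented constraints."""
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(duration, bool) or not isinstance(duration, int) or not 2 <= duration <= 10:
        raise ValueError("duration must be an integer from 2 to 10 seconds")
    if model != EDIT_EXTEND_MODEL:
        raise ValueError(f"edit and extension support {EDIT_EXTEND_MODEL} only")
```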

`GET /api/video/frame`

Extract a PNG frame from a generated .mp4 file.

Query	Notes
`file`	Required generated `.mp4` filename or generated-dir absolute path
`position`	`last` (default) or non-negative seconds

`POST /api/video/analyze`

Analyze first and last frames from a generated .mp4 using Grok 4.3 image understanding. The video itself is not uploaded as temporal input; the server extracts two PNG frames and asks the vision model to infer likely motion.

{
  "videoUrl": "1780226256355_50252101.mp4"
}

Remote URLs and data: inputs are intentionally rejected to avoid server-side URL fetching through ffmpeg.

History

Method	Path	Notes
`GET`	`/api/history`	List generated assets
`GET`	`/api/history?groupBy=session`	Group assets by session title
`DELETE`	`/api/history/:filename`	Tombstone a generated asset
`POST`	`/api/history/:filename/restore`	Restore a recently deleted asset

History rows can include node metadata such as sessionId, nodeId, clientNodeId, requestId, and refsCount.

Sessions And Graphs

Method	Path	Notes
`GET`	`/api/sessions`	List graph sessions
`POST`	`/api/sessions`	Create a session
`GET`	`/api/sessions/:id`	Load a session and graph
`PATCH`	`/api/sessions/:id`	Rename a session
`DELETE`	`/api/sessions/:id`	Delete a session
`PUT`	`/api/sessions/:id/graph`	Save graph snapshot

PUT /api/sessions/:id/graph requires an If-Match header containing the current graph version.

Version mismatch returns GRAPH_VERSION_CONFLICT and the current version. This only means the client saved against a stale graph version; it is not proof that another browser tab changed the graph.
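
The If-Match requirement is a form of optimistic concurrency: the client states which version it edited, and the server rejects stale saves. Below is a minimal sketch of building such a request with the standard library. The graph body shape is a placeholder, and the quoted header form follows the If-Match: "<version>" notation used elsewhere in this document, which may not be the only accepted form.

```python
import json
import urllib.request
from typing import Mapping


def build_graph_save_request(
    base_url: str,
    session_id: str,
    graph: Mapping[str, object],
    version: int,
) -> urllib.request.Request:
    """Build (but do not send) a PUT /api/sessions/:id/graph request with If-Match."""
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/sessions/{session_id}/graph",
        data=json.dumps(graph).encode("utf-8"),  # Body shape is an assumption.
        method="PUT",
        headers={
            "Content-Type": "application/json",
            "If-Match": f'"{version}"',
        },
    )
```

On GRAPH_VERSION_CONFLICT a caller would typically reload the session, reapply its changes against the returned version, and retry.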

Graph save requests may include observability headers:

X-Ima2-Graph-Save-Id
X-Ima2-Graph-Save-Reason
X-Ima2-Tab-Id

Style Sheets

Method	Path	Notes
`GET`	`/api/sessions/:id/style-sheet`	Load session style sheet
`PUT`	`/api/sessions/:id/style-sheet`	Save style sheet
`PATCH`	`/api/sessions/:id/style-sheet/enabled`	Toggle style sheet usage
`POST`	`/api/sessions/:id/style-sheet/extract`	Extract style fields from prompt/reference

Style-sheet extraction may require a configured API key (OpenAI client). Image generation also supports provider: "api" through the shared Responses API image adapter when an API key is configured.

Common Error Codes

Code	Meaning
`API_KEY_REQUIRED`	`provider: "api"` was requested without a configured API key
`APIKEY_DISABLED`	Legacy/deprecated hard-block code from older builds
`INVALID_IMAGE_MODEL`	Model name is unknown or unsupported
`IMAGE_MODEL_UNSUPPORTED`	Model exists but cannot use image generation
`INVALID_REQUEST`	Upstream request parameters are invalid; raw provider details may be included as `upstreamCode`, `upstreamType`, and `upstreamParam`
`INVALID_MODERATION`	Moderation value is not `auto` or `low`
`SAFETY_REFUSAL`	Upstream safety refusal
`MODERATION_REFUSED`	Content generation refused by moderation
`AUTH_CHATGPT_EXPIRED`	Codex/ChatGPT OAuth session expired
`AUTH_API_KEY_INVALID`	API key is invalid, revoked, out of quota, or wrong org
`NETWORK_FAILED`	Network, proxy, VPN, or firewall failure
`OAUTH_UNAVAILABLE`	Local OAuth proxy is not available
`OPEN_GENERATED_DIR_FAILED`	The server could not open the generated image folder
`GRAPH_VERSION_REQUIRED`	Missing graph `If-Match` header
`GRAPH_VERSION_CONFLICT`	Stale graph version
`GRAPH_TOO_LARGE`	Graph exceeds node/edge limits
`NODE_NOT_FOUND`	Node metadata was not found
`INVALID_GROK_IMAGE_MODEL`	A Grok request used a model outside `grok-imagine-image` or `grok-imagine-image-quality`
`GROK_RATE_LIMITED`	xAI returned a rate-limit response through progrok
`GROK_AUTH_FAILED`	progrok could not authenticate the xAI request
`GROK_SEARCH_TIMEOUT` / `GROK_PLANNER_TIMEOUT` / `GROK_IMAGE_TIMEOUT`	The Grok search, planner, or image API step exceeded its timeout budget
`AGY_GENERATION_FAILED`	Gemini (agy) image generation failed
`AGY_TIMEOUT`	Agy CLI process exceeded its 360-second timeout
`AGY_PROCESS_ERROR`	Agy CLI binary failed to start or crashed
`AGY_QUOTA_EXHAUSTED`	Gemini API quota exhausted (rate limit)
`AGY_PARSE_FAILED`	Could not parse artifact path from agy output
`AGY_ARTIFACT_NOT_FOUND`	Agy reported an artifact path that does not exist
`AGY_PATH_REJECTED`	Agy artifact path was outside allowed directories
`AGY_VIDEO_UNSUPPORTED`	Video generation is not supported by the Gemini (agy) provider
`AGY_MASK_UNSUPPORTED`	Mask-based editing is not supported by the Gemini (agy) provider
`AGY_REF_TOO_MANY`	Too many reference images for agy (max 3)
`GEMINI_API_KEY_MISSING`	Gemini API key or Vertex AI credentials not configured
`GEMINI_API_RATE_LIMITED`	Gemini API rate limited (429)
`GEMINI_API_BAD_REQUEST`	Gemini API bad request (400/403)
`GEMINI_API_SAFETY_BLOCKED`	Gemini API generation blocked by safety filter
`GEMINI_API_NO_IMAGE`	Gemini API returned no image in response
`VIDEO_PROVIDER_UNSUPPORTED`	Video generation requires provider `"grok"` or `"grok-api"`
`SSE_CAPACITY`	The limit of 512 concurrent `GET /api/events` listeners has been reached (HTTP 503)
`REQUEST_ID_IN_USE`	Async POST used a `requestId` that already has an active job
`TOO_MANY_JOBS`	More than the configured concurrent active generation job limit (`Retry-After: 5`; default `24`)

Key Management

API key management endpoints for configuring provider credentials at runtime through the web UI or HTTP API.

Endpoint	Method	Description
`/api/keys/status`	GET	Returns configured/valid/maskedKey status for all providers (openai, xai, gemini, vertex) plus `geminiAuthMode` (`"apikey"` or `"vertex"`)
`/api/keys/:provider`	PUT	Save an API key. Body: `{ "apiKey": "..." }`. Validates key format and upstream before saving to config.json. Provider: `openai`, `xai`, or `gemini`.
`/api/keys/:provider`	DELETE	Remove a config-sourced API key. Env-sourced keys cannot be removed (`ENV_KEY_IMMUTABLE`).
`/api/keys/vertex`	PUT	Save a Vertex AI service account JSON. Body: `{ "serviceAccountJson": "..." }`. Validates JSON structure (`type: "service_account"`, `project_id` required).
`/api/keys/vertex`	DELETE	Remove a config-sourced Vertex AI service account.
`/api/keys/gemini-auth-mode`	PUT	Persist the Gemini auth mode chosen in the settings dropdown. Body: `{ "mode": "apikey" \| "vertex" }`. Saved to `config.json` and hot-updated.

Keys saved via PUT are stored in config.json and hot-updated in the runtime context (no server restart required). Keys loaded from environment variables (OPENAI_API_KEY, XAI_API_KEY, GEMINI_API_KEY, VERTEX_SERVICE_ACCOUNT_JSON) take precedence and are immutable through the API.
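
The precedence rule can be illustrated with a short resolver. This is a sketch, not the server's code: the flat config mapping keyed by provider name is a hypothetical layout, while the environment variable names and the ENV_KEY_IMMUTABLE code come from this section.

```python
from typing import Mapping, MutableMapping, Optional, Tuple

ENV_VAR_BY_PROVIDER = {
    "openai": "OPENAI_API_KEY",
    "xai": "XAI_API_KEY",
    "gemini": "GEMINI_API_KEY",
    "vertex": "VERTEX_SERVICE_ACCOUNT_JSON",
}


def resolve_key(
    provider: str, env: Mapping[str, str], config: Mapping[str, str]
) -> Tuple[Optional[str], Optional[str]]:
    """Return (value, source) where source is "env", "config", or None."""
    env_value = env.get(ENV_VAR_BY_PROVIDER[provider])
    if env_value:
        return env_value, "env"  # Environment keys take precedence.
    config_value = config.get(provider)  # Hypothetical config layout.
    if config_value:
        return config_value, "config"
    return None, None


def delete_key(
    provider: str, env: Mapping[str, str], config: MutableMapping[str, str]
) -> str:
    """Remove a config-sourced key; env-sourced keys are immutable."""
    _, source = resolve_key(provider, env, config)
    if source == "env":
        return "ENV_KEY_IMMUTABLE"
    config.pop(provider, None)
    return "ok"
```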

Thumbnail Backfill

Endpoint	Method	Description
`/api/history/backfill-thumbnails`	POST	Generate missing `.thumb.jpg` thumbnails for all images and videos in the generated directory. Returns `{ ok, total, created, skipped, failed }`. Also available offline via `ima2 backfill-thumbs`.

Thumbnails are also generated automatically on server startup for any media files that lack them.

Endpoint → CLI Mapping

Most server routes under /api/* have a CLI wrapper. The exception is Agent Mode (/api/agent/*), which is server + web-UI-only and has no ima2 subcommand. The prompt builder HTTP route (POST /api/prompt-builder/chat) is wrapped by ima2 prompt build. Use this table to find the command that calls a given endpoint. (See README.md "Client" section for full flag lists.)

Endpoint	CLI
`POST /api/generate`	`ima2 gen`
`POST /api/edit`	`ima2 edit`
`POST /api/generate/multimode` (SSE)	`ima2 multimode`
`POST /api/video/generate` (SSE)	`ima2 video`
`POST /api/video/generate` with `continueFromVideo`	`ima2 video continue`
`POST /api/video/edit`	`ima2 video edit`
`POST /api/video/extend`	`ima2 video extend`
`GET /api/video/frame`	`ima2 video frame`
`POST /api/video/analyze`	`ima2 video analyze`
`POST /api/node/generate` (SSE) / `GET /api/node/:id`	`ima2 node generate` / `ima2 node show`
`GET /api/history`	`ima2 ls`
`DELETE /api/history/:name` / `…/permanent`	`ima2 history rm [--permanent]`
`POST /api/history/:filename/restore`	`ima2 history restore --trash-id`
`POST /api/history/favorite`	`ima2 history favorite`
`POST /api/history/import-local`	`ima2 history import`
`POST /api/metadata/read`	`ima2 metadata` / `ima2 show --metadata`
`GET/POST/PATCH/DELETE /api/sessions[/…]`	`ima2 session ls/show/create/rm/rename`
`GET/PUT /api/sessions/:id/graph`	`ima2 session graph load/save`
`GET/PUT /api/sessions/:id/style-sheet[/…]`	`ima2 session style-sheet …`
`GET/PUT/DELETE /api/annotations/:name`	`ima2 annotate get/set/rm`
`POST /api/canvas-versions` / `PUT /api/canvas-versions/:name`	`ima2 canvas-versions save/update`
`GET/POST/PUT/DELETE /api/prompts[/…]`	`ima2 prompt …`
`GET/POST/PATCH/DELETE /api/prompts/folders[/…]`	`ima2 prompt folder …`
`…/api/prompts/import/…`	`ima2 prompt import sources/refresh/curated/discovery/folder`
`…/api/cardnews/…` (gated on `features.cardNews`)	`ima2 cardnews …`
`POST /api/comfy/export-image`	`ima2 comfy export`
`GET /api/inflight` / `DELETE /api/inflight/:id`	`ima2 inflight ls` (alias `ps`) / `ima2 inflight rm` (alias `cancel`)
`GET /api/events` (SSE multiplex)	Web UI only (persistent `EventSource`; no CLI wrapper)
`GET /api/storage/status` / `POST /api/storage/open-generated-dir`	`ima2 storage status` / `ima2 storage open`
`GET /api/billing` / `GET /api/providers` / `GET /api/oauth/status` / `GET /api/grok/status`	`ima2 billing` / `ima2 providers` / `ima2 oauth status` / `ima2 grok status`
`GET /api/quota`	`ima2 billing` (includes Grok `usedUsd`/`limitUsd`)
`POST /api/auth/switch` / `GET /api/auth/switch/:sessionId`	Web UI only (Settings > QuotaCard > Switch Account)
`GET /api/health`	`ima2 ping`
`GET /api/capabilities`	`ima2 capabilities`
`GET /api/config/grok-planner`	— (Grok planner model query)
`PATCH /api/config/grok-planner`	— (Grok planner model update)
`GET /api/agy/status`	— (Antigravity CLI install status)
`POST /api/history/backfill-thumbnails`	`ima2 backfill-thumbs`
`GET /api/keys/status`, `PUT/DELETE /api/keys/:provider`, `PUT/DELETE /api/keys/vertex`	Web UI only (Settings > API Keys)
`GET/POST/PATCH/DELETE /api/agent/*` (sessions, turns, queue)	— (Agent Mode; web UI only, no CLI)
`POST /api/prompt-builder/chat`	`ima2 prompt build`

Notes:

ima2 history favorite and ima2 annotate … send X-Ima2-Browser-Id: cli-<sha1prefix> derived from the config dir, so CLI activity does not collide with browser sessions.
ima2 session graph save performs a GET-then-PUT with If-Match: "<version>" to guard against GRAPH_VERSION_CONFLICT.
ima2 history import and ima2 canvas-versions save/update send raw bytes with Content-Type: image/<png|jpeg|webp>; the SSE endpoints (multimode, node generate, video) use Accept: text/event-stream. The web UI instead uses GET /api/events plus async: true on POST routes.
ima2 cardnews … checks runtimeConfig.features.cardNews before calling the gated endpoints; when the feature is disabled, the CLI exits with code 2 and a clear message instead of surfacing a 404.
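
The GET-then-PUT pattern noted above for ima2 session graph save can be sketched roughly as follows. This is a minimal illustration, not the CLI's actual implementation: the function names are hypothetical, and the `version` field name in the GET response and the shape of the PUT body are assumptions. Only the route and the quoted If-Match value come from this document.

```python
import json
import urllib.request


def build_graph_put(base_url: str, session_id: str, graph: dict, version) -> urllib.request.Request:
    """Build the conditional PUT; the version is sent quoted, as in If-Match: "<version>"."""
    return urllib.request.Request(
        f"{base_url}/api/sessions/{session_id}/graph",
        data=json.dumps(graph).encode("utf-8"),  # assumed: the graph is sent as a JSON body
        method="PUT",
        headers={
            "Content-Type": "application/json",
            "If-Match": f'"{version}"',
        },
    )


def save_graph(base_url: str, session_id: str, mutate):
    """Fetch the current graph, apply `mutate`, and write it back guarded by If-Match."""
    with urllib.request.urlopen(f"{base_url}/api/sessions/{session_id}/graph") as resp:
        current = json.load(resp)
    version = current["version"]  # assumed field name in the GET response
    request = build_graph_put(base_url, session_id, mutate(current), version)
    # urlopen raises urllib.error.HTTPError on a non-2xx response, which is
    # where a stale version (GRAPH_VERSION_CONFLICT) would be expected to surface.
    with urllib.request.urlopen(request) as resp:
        return json.load(resp)
```

Reading the version immediately before writing keeps the window for a conflicting write small. On a conflict the caller would typically re-fetch and retry rather than overwrite.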

CLI Discovery

The server writes an advertisement file at:

~/.ima2/server.json

CLI commands such as ima2 ping, ima2 gen, and ima2 ls use this file unless --server or IMA2_SERVER is provided.

Current shape:

{
  "port": 3334,
  "url": "http://localhost:3334",
  "pid": 12345,
  "startedAt": 1777180000000,
  "version": "1.0.0",
  "backend": {
    "configuredPort": 3333,
    "actualPort": 3334,
    "url": "http://localhost:3334"
  },
  "oauth": {
    "configuredPort": 10531,
    "actualPort": 10532,
    "url": "http://127.0.0.1:10532",
    "status": "ready"
  }
}

Top-level port and url are kept for older CLI clients. New code should prefer backend.url.
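
A client that wants to follow the same discovery rules might resolve the server URL along these lines. This is a sketch under stated assumptions: the function names are hypothetical, the precedence of --server over IMA2_SERVER is assumed (the document only says either one bypasses the file), and falling back to http://localhost:3333 when the file is missing or unreadable is also an assumption.

```python
import json
import os
from pathlib import Path
from typing import Optional

DEFAULT_URL = "http://localhost:3333"  # documented base URL; using it as a fallback is an assumption


def url_from_advert(advert: dict) -> Optional[str]:
    """Prefer backend.url, then fall back to the legacy top-level url."""
    backend = advert.get("backend")
    if isinstance(backend, dict) and backend.get("url"):
        return backend["url"]
    return advert.get("url") or None


def resolve_server(flag: Optional[str] = None,
                   env: Optional[dict] = None,
                   advert_path: Optional[Path] = None) -> str:
    """Resolve the server URL: explicit flag, then IMA2_SERVER, then server.json."""
    env = os.environ if env is None else env
    if flag:  # value of --server (assumed to win over the env var)
        return flag
    if env.get("IMA2_SERVER"):
        return env["IMA2_SERVER"]
    path = advert_path or Path.home() / ".ima2" / "server.json"
    try:
        advert = json.loads(path.read_text(encoding="utf-8"))
    except (OSError, ValueError):  # missing, unreadable, or malformed file
        return DEFAULT_URL
    if not isinstance(advert, dict):
        return DEFAULT_URL
    return url_from_advert(advert) or DEFAULT_URL
```

For example, `resolve_server()` with no flag and no IMA2_SERVER set reads ~/.ima2/server.json and returns backend.url when it is present.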
