Support Kitty graphics protocol mvp #5619

Merged
Tyriar merged 79 commits into master from anthonykim1/scaffoldKittyAddon
Feb 16, 2026

@anthonykim1
Contributor

@anthonykim1 anthonykim1 commented Jan 22, 2026

Part of #5592

WIP

See discussion for components in spec, their progress/discussion: #5683

Plan:

  1. (Done with Add register apc handler  #5617) First PR: Add registerAPCHandler to parser and expose in API, add tests
  2. Scaffold addon:
    • Create addon-kitty-graphics, copy structure from other addons
    • Hook up in demo client.ts
      • Search addon-progress and ProgressAddon for how to do this
    • Get it working and activating
      • (Checked, remember to remove comments) Add a console.log to the addon's activate function
  3. Come up with a simple test command to verify in kitty or ghostty (see send-png)
  4. Set up APC handler, get it to trigger with the test
  5. Create a playwright test (test folder) in the addon that automates it for a simple image
    • Use single 1x1 black pixel png image, write it to the terminal in your test, verify the image we see is black
    • Use 3x1 png image, 255 red first, 255 green second, 255 blue third
  6. Understand what APC sequences are being sent by the program
  7. Implement the handlers
  8. Draw to a canvas layer
  9. Make sure we get the playwright tests to pass.
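
For steps 3 to 5 of the plan, a test command could be assembled along these lines. This is a minimal sketch, not code from the PR: `buildKittyCommand` is a hypothetical helper that is not part of xterm.js or the addon, and it sends raw 24-bit RGB (f=24) instead of the PNG file named in the plan so that the pixel bytes stay readable.

```typescript
// Kitty graphics commands are APC sequences:
//   ESC _ G <control data> ; <base64 payload> ESC \
const ESC = '\x1b';

// Hypothetical helper (not an xterm.js API): serialize control keys and an
// already base64-encoded payload into a single graphics command.
function buildKittyCommand(
  control: Record<string, string | number>,
  payloadBase64: string = ''
): string {
  const pairs: string[] = [];
  for (const key of Object.keys(control)) {
    pairs.push(key + '=' + control[key]);
  }
  const controlText = pairs.join(',');
  const body = payloadBase64 ? controlText + ';' + payloadBase64 : controlText;
  return ESC + '_G' + body + ESC + '\\';
}

// 3x1 image as raw RGB: one red, one green, one blue pixel.
// Bytes FF 00 00 | 00 FF 00 | 00 00 FF encode to this base64 string.
const RGB_3X1_BASE64 = '/wAAAP8AAAD/';

// a=T transmits and displays, f=24 is raw RGB, s/v are width/height in pixels.
const testCommand = buildKittyCommand(
  { a: 'T', f: 24, s: 3, v: 1, i: 1 },
  RGB_3X1_BASE64
);
```

Writing `testCommand` to the terminal in a Playwright test and then reading back the three pixel colors would cover the red/green/blue check described in step 5.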

(Jan 22, 4:00 PM PST) There's currently a problem where kitty +kitten icat won't return an image, but using send-image file will.

Copilot's research on why normal kitty +kitten icat doesn't work on the xterm demo rn:

kitty icat uses TIOCGWINSZ ioctl to get pixel dimensions from the PTY before sending any graphics. node-pty doesn't set pixel values in that ioctl, so it returns 0. kitty icat sees 0 pixels and errors out before even trying the graphics protocol.

(Jan 22, 6:30 PM PST)
Things like: kitty +kitten icat --use-window-size=80,24,1280,768 --transfer-mode=stream ~/Desktop/xterm.js/demo/logo.png work atm

TODO since image goes on top of text rn 🥶:

  • Calculate how many rows/cols the image spans
  • Move the cursor accordingly (or reserve blank space)

--> Handled by image storage and c=1, c=0 logic. See discussion page.
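
The row/column calculation from the TODO above can be sketched as follows. This is only an illustration under stated assumptions: `cellsSpanned` is a hypothetical helper, and the cell size in pixels is assumed to be known to the caller; the PR itself handles cursor placement through its image storage logic.

```typescript
interface CellSpan {
  cols: number;
  rows: number;
}

// Hypothetical helper: number of terminal cells an image covers. Rounds up so
// that a partially covered cell still counts as occupied.
function cellsSpanned(
  imageWidthPx: number,
  imageHeightPx: number,
  cellWidthPx: number,
  cellHeightPx: number
): CellSpan {
  if (cellWidthPx <= 0 || cellHeightPx <= 0) {
    throw new Error('cell dimensions must be positive');
  }
  return {
    cols: Math.ceil(imageWidthPx / cellWidthPx),
    rows: Math.ceil(imageHeightPx / cellHeightPx)
  };
}

// Example: a 100x50 px image on a grid of 9x18 px cells spans 12 cols, 3 rows.
const span = cellsSpanned(100, 50, 9, 18);
```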

TODOs

@anthonykim1 anthonykim1 self-assigned this Jan 22, 2026
@anthonykim1 anthonykim1 marked this pull request as draft January 22, 2026 19:57
@anthonykim1 anthonykim1 changed the title Support Kitty graphics protocol Support Kitty graphics protocol mvp Jan 22, 2026
@jerch
Member

jerch commented Jan 23, 2026

...node-pty doesn't set pixel values in that ioctl...

Imho this should be fixed in node-pty to populate the pixel dimensions where possible (I think there was an issue with conpty not providing it, but for other platforms it can be populated)

Another workaround for console apps is to use WinOps sequences like CSI 14t (see here for an example: https://github.com/jerch/sixel-puzzle/blob/56129538af70fa4ec9441d1d3553398dc8e66f1f/termlib.py#L95). But of course this is beyond xterm.js and has to be implemented by the console app itself.
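
On the application side, the CSI 14t workaround amounts to sending the request and parsing the report. The sketch below assumes the xterm-style reply `CSI 4 ; height ; width t`; `parsePixelSizeReport` is a hypothetical name, and reading the reply from stdin is left out.

```typescript
interface PixelSize {
  width: number;
  height: number;
}

// Request: CSI 14 t asks for the text area size in pixels.
const REQUEST_TEXT_AREA_PIXELS = '\x1b[14t';

// An xterm-compatible terminal replies with CSI 4 ; <height> ; <width> t.
// Note that height comes first. Returns undefined for anything else.
function parsePixelSizeReport(reply: string): PixelSize | undefined {
  const match = /^\x1b\[4;(\d+);(\d+)t$/.exec(reply);
  if (!match) {
    return undefined;
  }
  return {
    height: parseInt(match[1], 10),
    width: parseInt(match[2], 10)
  };
}
```

A console app would write `REQUEST_TEXT_AREA_PIXELS` to the terminal, read the reply, and fall back to another method when `parsePixelSizeReport` returns `undefined`.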

@jerch
Member

jerch commented Jan 23, 2026

@anthonykim1 I'm starting to wonder whether the kitty graphics handler should be integrated into the image addon. I don't know the requirements of this protocol for the render layering, but maybe things could be reused and unified? It would make the lifecycling easier and would not introduce another image layer.

But as I said, I don't know the kitty protocol requirements, so fusing it might be a futile attempt. Thoughts?

@anthonykim1
Contributor Author

anthonykim1 commented Feb 13, 2026

In an effort to not falsely inform/respond back to clients, this is the current behavior:

1) Query action (a=q)

| Scenario | Example control data | Current behavior | Spec quote |
| --- | --- | --- | --- |
| Capability probe (no payload) | `a=q,i=1,t=d` | OK | "To check if a terminal emulator supports the graphics protocol the best way is to send the above query action …" |
| PNG query, valid base64 payload | `a=q,i=1,f=100,t=d;<base64>` | OK | "The terminal emulator must understand pixel data in three formats, 24-bit RGB, 32-bit RGBA and PNG." |
| PNG query, invalid base64 | `a=q,i=1,f=100,t=d;!!!` | EINVAL:invalid base64 data | "The payload is arbitrary binary data, base64 encoded …" |
| PNG query, valid base64 but malformed PNG bytes | `a=q,i=1,f=100,t=d;<malformed png bytes>` | OK at query-time; malformed data is rejected during render (a=T) | "set a=q. Then the terminal emulator will try to load the image and respond with either OK or an error …" + PNG support requirement above |
| Raw RGB/RGBA with valid dimensions and enough bytes | `a=q,i=1,f=24,s=1,v=1,t=d;AAAA` | OK | "the width and height are specified using the s and v keys respectively … pixel data must be … bytes." |
| Raw RGB/RGBA missing s/v | `a=q,i=1,f=24,t=d;AAAA` | EINVAL:width and height required for raw pixel data | Same s/v requirement quote above |
| Raw RGB/RGBA insufficient bytes | `a=q,i=1,f=24,s=10,v=10,t=d;AAAA` | EINVAL:insufficient pixel data | Same byte-count requirement quote above |
| Unsupported medium t=f | `a=q,i=1,t=f;<base64 path>` | EINVAL:unsupported transmission medium | "The t key defaults to d and can take the values: d, f, t, s." |
| Unsupported medium t=t | `a=q,i=1,t=t;<base64 path>` | EINVAL:unsupported transmission medium | Same quote above |
| Unsupported medium t=s | `a=q,i=1,t=s;<base64 shm>` | EINVAL:unsupported transmission medium | Same quote above |

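
The raw RGB/RGBA rows above come down to a byte-count check. The following is a sketch of that check, not the addon's actual code: `validateRawPixels` and `base64DecodedLength` are hypothetical names, and only the error strings are taken from the table.

```typescript
// Hypothetical validator mirroring the raw-pixel rows of the query table.
// f=24 is 3 bytes per pixel (RGB), f=32 is 4 bytes per pixel (RGBA).
function validateRawPixels(
  format: 24 | 32,
  width: number | undefined,
  height: number | undefined,
  byteLength: number
): string {
  if (!width || !height) {
    return 'EINVAL:width and height required for raw pixel data';
  }
  const bytesPerPixel = format === 24 ? 3 : 4;
  if (byteLength < width * height * bytesPerPixel) {
    return 'EINVAL:insufficient pixel data';
  }
  return 'OK';
}

// Decoded size of a padded base64 string (length assumed to be a multiple
// of 4), accounting for trailing '=' padding.
function base64DecodedLength(base64: string): number {
  let padding = 0;
  if (base64.slice(-2) === '==') {
    padding = 2;
  } else if (base64.slice(-1) === '=') {
    padding = 1;
  }
  return Math.floor(base64.length * 3 / 4) - padding;
}
```

With the payload `AAAA` from the table (3 decoded bytes), a 1x1 RGB query passes, while the 10x10 query needs 300 bytes and is rejected.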
2) Non-query actions (with i present)

| Action/Scenario | Example | Current behavior | Spec quote |
| --- | --- | --- | --- |
| Transmit (a=t) success | `a=t,i=1,f=100;<png base64>` | OK | "terminal emulator will reply after trying to load the image … The string will be OK if reading the pixel data succeeded or an error message." |
| Transmit (a=t) decode error | `a=t,i=1;!!!` | EINVAL:invalid base64 data | Same response quote above |
| Transmit (a=t) unsupported medium with i | `a=t,i=1,t=f;<base64 path>` | EINVAL:unsupported transmission medium | "The t key defaults to d and can take the values: d, f, t, s." |
| Transmit (a=t) unsupported medium without i | `a=t,t=f;<base64 path>` | No response (silent rejection) | Response contract is tied to sending i |
| Transmit+display (a=T) success | `a=T,i=1,f=100;<png base64>` | OK | "simultaneously transmit and display an image using the action a=T" |
| Transmit+display (a=T) decode error | `a=T,i=1;!!!` | EINVAL:invalid base64 data | Error response on failed load/processing |
| Transmit+display (a=T) render failure | `a=T,i=1,f=100;<corrupt image>` | EINVAL:image rendering failed | Error response on failed load/processing |
| Transmit+display (a=T) unsupported medium with i | `a=T,i=1,t=f;<base64 path>` | EINVAL:unsupported transmission medium | "The t key defaults to d and can take the values: d, f, t, s." |
| Transmit+display (a=T) unsupported medium without i | `a=T,t=f;<base64 path>` | No response (silent rejection) | Response contract is tied to sending i |
| Delete (a=d) | `a=d,d=a` | No response | "Images can be deleted by using the delete action a=d." |
| Placement (a=p) not implemented | `a=p,i=1` | EINVAL:unsupported action | Control reference includes a=p (put/display previous transmitted image) |
| Frame upload (a=f) not implemented | `a=f,i=1` | EINVAL:unsupported action | Control reference includes a=f (transmit animation frame data) |
| Animation control (a=a) not implemented | `a=a,i=1` | EINVAL:unsupported action | Control reference includes a=a (control animation) |
| Frame compose (a=c) not implemented | `a=c,i=1` | EINVAL:unsupported action | Control reference includes a=c (compose animation frames) |

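
To make the examples in this table concrete, the sketch below splits a command body into control keys and payload and rejects the unimplemented actions. `parseCommand` and `rejectUnimplemented` are hypothetical names used for illustration; the real handler in the addon may be structured differently.

```typescript
interface ParsedCommand {
  control: Record<string, string>;
  payload: string;
}

// Hypothetical parser: takes the text between "ESC _ G" and "ESC \".
// Control data is comma-separated key=value pairs; everything after the
// first ';' is the (still base64-encoded) payload.
function parseCommand(body: string): ParsedCommand {
  const sep = body.indexOf(';');
  const controlText = sep === -1 ? body : body.slice(0, sep);
  const payload = sep === -1 ? '' : body.slice(sep + 1);
  const control: Record<string, string> = {};
  for (const pair of controlText.split(',')) {
    const eq = pair.indexOf('=');
    if (eq > 0) {
      control[pair.slice(0, eq)] = pair.slice(eq + 1);
    }
  }
  return { control, payload };
}

// Actions the table lists as not implemented.
const UNIMPLEMENTED_ACTIONS = ['p', 'f', 'a', 'c'];

// Returns the error message for an unimplemented action, else undefined.
function rejectUnimplemented(control: Record<string, string>): string | undefined {
  const action = control['a'];
  if (action !== undefined && UNIMPLEMENTED_ACTIONS.indexOf(action) !== -1) {
    return 'EINVAL:unsupported action';
  }
  return undefined;
}
```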
3) Validation & key rules

| Rule | Example | Current behavior | Spec quote |
| --- | --- | --- | --- |
| Both i and I specified | `a=t,i=1,I=2` | EINVAL:cannot specify both i and I keys | "Specifying both i and I keys in any command is an error. The terminal must reply with an EINVAL error message, unless silenced." |
| Query with no explicit t | `a=q,i=1` | Treated as t=d | "The t key defaults to d …" |
| Query with no explicit f | `a=q,i=1,s=1,v=1;...` | Treated as f=32 (RGBA) | Control data reference: f default is 32 |

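
These three rules could be applied in one normalization step over already-parsed control keys. This is a sketch under that assumption; `normalizeKeys` and `NormalizedKeys` are hypothetical names, and only the defaults and the error string come from the table.

```typescript
interface NormalizedKeys {
  medium: string; // value of t, defaulting to 'd' (direct)
  format: number; // value of f, defaulting to 32 (RGBA)
}

// Hypothetical helper: returns an error string when both i and I are set,
// otherwise the effective medium and format with defaults filled in.
function normalizeKeys(control: Record<string, string>): NormalizedKeys | string {
  if (control['i'] !== undefined && control['I'] !== undefined) {
    return 'EINVAL:cannot specify both i and I keys';
  }
  const t = control['t'];
  const f = control['f'];
  return {
    medium: t !== undefined ? t : 'd',
    format: f !== undefined ? parseInt(f, 10) : 32
  };
}
```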
4) Quiet mode behavior (q)

| q value | OK responses | Error responses | Spec quote |
| --- | --- | --- | --- |
| q=0 (default) | Sent | Sent | Control data reference default behavior |
| q=1 | Suppressed | Sent | "Set it to 1 to suppress OK responses …" |
| q=2 | Sent | Suppressed | "… and to 2 to suppress failure responses." |

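
The quiet-mode table translates directly into a small predicate. The sketch below mirrors the table exactly as written in this comment; `shouldSendResponse` is a hypothetical name.

```typescript
// Mirrors the table above: q=1 hides OK replies, q=2 hides error replies,
// and any other value (including the default 0) sends everything.
function shouldSendResponse(quiet: number, isOk: boolean): boolean {
  if (quiet === 1) {
    return !isOk;
  }
  if (quiet === 2) {
    return isOk;
  }
  return true;
}
```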
5) Response-gating behavior by id

| Scenario | Current behavior | Spec quote |
| --- | --- | --- |
| a=q with i present | Sends `i=<id>;OK` or `i=<id>;EINVAL:...` (subject to q) | Query section + response format examples |
| a=q with no i | Sends with i=0 | Query behavior is still expected to reply; control data ref gives i default 0 |
| a=t / a=T with i present | Sends OK/error response (subject to q) | "If it does so [uses i], the terminal emulator will reply … OK or an error message." |
| a=t / a=T with no i | No response on failure/success | Spec response contract is tied to sending i |
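
The gating rows above can be sketched as a single function that either produces a reply or stays silent. `formatResponse` and `respond` are hypothetical names; the sketch covers only the a=q, a=t and a=T rows of this table and leaves quiet mode and other actions out.

```typescript
// Replies use the same APC framing as commands:
//   ESC _ G i=<id> ; <message> ESC \
function formatResponse(id: number, message: string): string {
  return '\x1b_Gi=' + id + ';' + message + '\x1b\\';
}

// Hypothetical gate following the table: queries always answer (using i=0
// when no id was sent); a=t / a=T answer only when the client sent an id.
// Returns undefined when no reply should be written.
function respond(
  action: string,
  id: number | undefined,
  message: string
): string | undefined {
  if (action === 'q') {
    return formatResponse(id === undefined ? 0 : id, message);
  }
  if ((action === 't' || action === 'T') && id !== undefined) {
    return formatResponse(id, message);
  }
  return undefined;
}
```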

Notes:

  • Current implementation rejects t=f/t/s across all actions (a=q, a=t, a=T) with EINVAL:unsupported transmission medium in browser context, with TODOs for future filesystem/shared-memory support.
  • a=p/a=f/a=a/a=c are explicitly rejected with EINVAL:unsupported action (when i is present), so unsupported features are not silently accepted.
  • Query for PNG confirms support capability; malformed PNG payload detection is enforced on actual decode/render paths (a=T).

@anthonykim1 anthonykim1 marked this pull request as ready for review February 16, 2026 06:18
@anthonykim1 anthonykim1 requested review from Tyriar and jerch February 16, 2026 06:19
@Tyriar
Member

Tyriar commented Feb 16, 2026

Let's merge and fix any other issues in separate, smaller PRs.

@Tyriar Tyriar merged commit 3a9bfa9 into master Feb 16, 2026
12 checks passed
@Tyriar Tyriar deleted the anthonykim1/scaffoldKittyAddon branch February 16, 2026 18:53
3 participants