feat(deepagents): support multimodal files for backends #298

Merged
Colin Francis (colifran) merged 74 commits into main from colifran/multimodal-refactor
Mar 18, 2026

Conversation

Contributor

@colifran Colin Francis (colifran) commented Mar 11, 2026

Description

Adds support for binary and multimodal files (images, PDFs, audio, video, etc.) across all backend implementations, with a versioned protocol layer that returns structured results instead of plain values or arrays.

Changes

Protocol

  • Introduced BackendProtocolV2 with structured Result return types:
    • ReadResult — for read() operations
    • ReadRawResult — for readRaw() operations
    • GrepResult — for grepRaw() operations
    • LsResult — for lsInfo() operations
    • GlobResult — for globInfo() operations
  • Introduced SandboxBackendProtocolV2 extending BackendProtocolV2; deprecated SandboxBackendProtocol
  • Deprecated v1 interfaces (BackendProtocol, SandboxBackendProtocol) moved to v1/; current interfaces in v2/ for clear separation
  • Added AnyBackendProtocol union type and adaptBackendProtocol() for backward compatibility — public APIs accept either version and adapt internally
  • FileData split into FileDataV1 (legacy line array) and FileDataV2 (single string + mimeType, supports base64)
  • All Result types follow a consistent pattern, { error?: string, [data]?: T }, for explicit error propagation
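
To make the FileData split and the Result pattern concrete, here is a minimal sketch. The interface fields, the `read` helper, and the error text are assumptions inferred from the bullets above, not the actual deepagents definitions.

```typescript
// Hypothetical shapes inferred from the PR description; the real types may differ.
interface FileDataV1 {
  content: string[]; // legacy: one array entry per line
}

interface FileDataV2 {
  content: string; // single string; base64 when the file is binary
  mimeType: string;
}

type FileData = FileDataV1 | FileDataV2;

// Follows the { error?: string, [data]?: T } pattern.
interface ReadResult {
  error?: string;
  content?: string;
  mimeType?: string;
}

// Illustrative read(): failures are reported through `error` rather than thrown.
function read(files: Map<string, FileData>, path: string): ReadResult {
  const file = files.get(path);
  if (file === undefined) {
    return { error: `File '${path}' not found` };
  }
  if ("mimeType" in file) {
    // v2 data: content is already a single string.
    return { content: file.content, mimeType: file.mimeType };
  }
  // v1 data: join the legacy line array into one string.
  return { content: file.content.join("\n") };
}
```

A caller checks `result.error` first and only then reads the data field, so no try/catch is needed around backend calls.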

Binary File Support

  • All backends (State, Store, Filesystem, Sandbox, Composite) store and retrieve binary files as base64
  • read_file tool returns typed multimodal content blocks (image, audio, video, file) for binary files
  • Binary reads capped at 10MB to stay within provider inline limits
  • MIME type detection via file extension, stored with v2 FileData
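
The binary read path described above (extension-based MIME detection, a size cap, typed content blocks) might look roughly like the following sketch. The extension table, the `MAX_BINARY_BYTES` constant, the block shape, and all function names are illustrative assumptions; the 10MB figure is read here as 10 MiB.

```typescript
// Assumed cap; the PR states 10MB, interpreted here as 10 MiB.
const MAX_BINARY_BYTES = 10 * 1024 * 1024;

// Assumed, deliberately small extension table.
const MIME_BY_EXTENSION: Record<string, string> = {
  png: "image/png",
  jpg: "image/jpeg",
  jpeg: "image/jpeg",
  pdf: "application/pdf",
  mp3: "audio/mpeg",
  mp4: "video/mp4",
};

type BlockType = "image" | "audio" | "video" | "file";

interface ContentBlock {
  type: BlockType;
  mimeType: string;
  data: string; // base64 payload
}

// Look up the MIME type from the file extension (case-insensitive).
function detectMimeType(path: string): string | undefined {
  const dot = path.lastIndexOf(".");
  if (dot < 0) return undefined;
  return MIME_BY_EXTENSION[path.slice(dot + 1).toLowerCase()];
}

// Map a MIME type to one of the four block kinds named in the PR.
function blockTypeFor(mimeType: string): BlockType {
  if (mimeType.startsWith("image/")) return "image";
  if (mimeType.startsWith("audio/")) return "audio";
  if (mimeType.startsWith("video/")) return "video";
  return "file";
}

// Decoded size of a padded base64 string, computed without decoding it.
function base64ByteLength(b64: string): number {
  const padding = b64.endsWith("==") ? 2 : b64.endsWith("=") ? 1 : 0;
  return (b64.length / 4) * 3 - padding;
}

function toContentBlock(
  path: string,
  base64: string,
): { error?: string; block?: ContentBlock } {
  const mimeType = detectMimeType(path);
  if (mimeType === undefined) {
    return { error: `Unknown file type: ${path}` };
  }
  if (base64ByteLength(base64) > MAX_BINARY_BYTES) {
    return { error: `File exceeds ${MAX_BINARY_BYTES} bytes: ${path}` };
  }
  return { block: { type: blockTypeFor(mimeType), mimeType, data: base64 } };
}
```

Checking the size from the base64 length avoids decoding a payload that is going to be rejected anyway.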

Middleware

  • Updated skills.ts middleware to handle LsResult from backend operations
  • Updated fs.ts middleware to handle LsResult and GlobResult with proper error checking
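
As a rough illustration of the error checking mentioned above, a middleware helper could turn an LsResult or GlobResult into tool output like this. The `files` field, the `is_dir` flag, and `formatListing` are hypothetical names, not the ones used in fs.ts.

```typescript
// Hypothetical result shapes; the real field names may differ.
interface FileInfo {
  path: string;
  is_dir?: boolean;
}

interface LsResult {
  error?: string;
  files?: FileInfo[];
}

interface GlobResult {
  error?: string;
  files?: FileInfo[];
}

// Check `error` before touching the data, then render one path per line.
function formatListing(
  result: LsResult | GlobResult,
  emptyMessage: string,
): string {
  if (result.error !== undefined) {
    return `Error: ${result.error}`;
  }
  const files = result.files ?? [];
  if (files.length === 0) {
    return emptyMessage;
  }
  return files.map((f) => (f.is_dir ? `${f.path}/` : f.path)).join("\n");
}
```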

Provider Updates

  • QuickJS REPL: public API accepts AnyBackendProtocol, adapts to v2 internally via adaptBackendProtocol() — fully backward compatible
  • Node VFS: VfsSandbox updated to return LsResult and GlobResult instead of bare arrays

Backward Compatibility

  • v1 backends continue to work everywhere via adaptBackendProtocol() runtime detection
  • v2 backends correctly read and operate on existing v1 FileData (string[] content)
  • No breaking changes to public APIs — all accept AnyBackendProtocol
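
One way to picture the v1 to v2 adaptation is the sketch below, reduced to a single method. It detects the version by inspecting the returned value and converts thrown errors into an error result; both choices are assumptions for illustration, and `adaptBackend` is a stand-in name rather than the real adaptBackendProtocol() implementation.

```typescript
// Hypothetical, heavily reduced interfaces with one method each.
interface FileInfo {
  path: string;
}

interface LsResult {
  error?: string;
  files?: FileInfo[];
}

interface BackendV1 {
  lsInfo(path: string): Promise<FileInfo[]>; // bare array
}

interface BackendV2 {
  lsInfo(path: string): Promise<LsResult>; // structured result
}

type AnyBackend = BackendV1 | BackendV2;

// Wrap any backend so callers always see the v2 shape.
function adaptBackend(backend: AnyBackend): BackendV2 {
  return {
    async lsInfo(path: string): Promise<LsResult> {
      try {
        const raw = await backend.lsInfo(path);
        // A bare array means a v1 backend: wrap it. Otherwise pass through.
        return Array.isArray(raw) ? { files: raw } : raw;
      } catch (err) {
        // Assumed behavior: surface failures as data instead of throwing.
        return { error: err instanceof Error ? err.message : String(err) };
      }
    },
  };
}
```

Because the adapter is idempotent on v2 backends, public APIs can call it unconditionally on whatever backend they receive.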

Tests

  • All backend tests updated to assert on Result types: composite.test.ts, filesystem.test.ts, state.test.ts, store.test.ts, sandbox.test.ts, local-shell.test.ts
  • Added readRaw() tests returning ReadRawResult across State, LocalShell, Composite, and Node VFS backends
  • utils.test.ts updated to test adaptBackendProtocol() with all Result types, including v1→v2 wrapping
  • Middleware tests (fs.test.ts, skills.test.ts) updated to mock backends returning Result types
  • New binary tests for State and Store backends covering upload, download, round-trip, read, grep, and pagination
  • New multimodal content block tests for read_file (image, audio, video, PDF, size limit enforcement)
  • Sandbox test: readRaw() now returns error in Result object instead of throwing
  • Added v1→v2 backward compatibility tests for State and Store backends (mixed v1/v2 data)
  • Manual e2e testing scripts for all 5 backend configurations plus v1→v2 migration simulation

Example: [image: multimodal]


changeset-bot bot commented Mar 11, 2026

🦋 Changeset detected

Latest commit: b6b7359

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 7 packages

| Name | Type |
| --- | --- |
| @langchain/node-vfs | Patch |
| @langchain/quickjs | Patch |
| @langchain/modal | Patch |
| @langchain/sandbox-standard-tests | Patch |
| deepagents | Minor |
| deepagents-acp | Patch |
| @deepagents/evals | Patch |

@colifran Colin Francis (colifran) changed the title from "feat: support multimodal content in backends" to "feat(deepagents): support multimodal files for backends" on Mar 11, 2026
@colifran Colin Francis (colifran) force-pushed the colifran/multimodal-refactor branch from 516cedf to 4d75256 on March 11, 2026 19:50
@colifran Colin Francis (colifran) force-pushed the colifran/multimodal-refactor branch from 0ace2b2 to 6beb46c on March 12, 2026 20:44
Member

One nit, optional.

Great work 👍

@colifran Colin Francis (colifran) force-pushed the colifran/multimodal-refactor branch from 10210cc to c8b6f3e on March 13, 2026 23:06
Member

@hntrl Hunter Lovell (hntrl) left a comment

on the mime type thing:

  • I don't think the filename heuristic to determine what content block gets made is our best approach. E.g. in S3 I know I can attach a content-type header to a file but have an ambiguous file name

Member

LGTM 👍

Member

@hntrl Hunter Lovell (hntrl) left a comment

pending a fix to JsonPlusSerializer, lgtm!


pkg-pr-new bot commented Mar 17, 2026

npm i https://pkg.pr.new/deepagents-acp@298
npm i https://pkg.pr.new/deepagents@298
npm i https://pkg.pr.new/@langchain/sandbox-standard-tests@298

commit: b6b7359

@colifran Colin Francis (colifran) force-pushed the colifran/multimodal-refactor branch from c564a9d to acfced7 on March 17, 2026 23:51
@colifran Colin Francis (colifran) force-pushed the colifran/multimodal-refactor branch from acfced7 to 3e99885 on March 18, 2026 01:58
@colifran Colin Francis (colifran) merged commit aab678a into main Mar 18, 2026
15 checks passed
@colifran Colin Francis (colifran) deleted the colifran/multimodal-refactor branch March 18, 2026 02:06
@github-actions github-actions bot mentioned this pull request Mar 18, 2026
Colin Francis (colifran) added a commit that referenced this pull request Mar 23, 2026
…" (#352)

* Revert "feat(deepagents): support multimodal files for backends (#298)"

This reverts commit aab678a.

* regen lock file

* fix empty string id check for sandbox protocols
Colin Francis (colifran) added a commit that referenced this pull request Mar 23, 2026
Colin Francis (colifran) added a commit that referenced this pull request Mar 23, 2026
Colin Francis (colifran) added a commit that referenced this pull request Mar 23, 2026
Colin Francis (colifran) added a commit that referenced this pull request Mar 24, 2026
* update backend protocol interface, types, and utils

* refactor state backend

* refactor store backend

* clean up

* unit tests

* refactor filesystem

* unit tests

* skip binary files in literal search for filesystem backend

* refactor base sandbox

* base sandbox unit tests

* refactor local shell backend

* refactor composite backend

* composite and local shell backend changes

* refactor fs middleware

* unit tests and fixed issue where createFileData was removed - this would be breaking

* refactor acp filesystem backend

* backend protocol v2

* simplify createFileData

* docstrings

* adapt backend protocol tests

* sandbox protocol v2

* format

* standard tests

* fix tests

* fix tests

* fix tests

* fix node vfs

* lint fix

* empty commit

* fix download files to handle binary

* any backend protocol

* lint

* make backend protocol v2 extend backend protocol

* make backend unknown type for is sandbox backend check

* add changeset

* add max binary file size

* add svg

* separate v1 and v2

* format

* lint

* format

* update glob, ls, read raw return types

* unit test fixes

* fix tests

* restore standard-tests package

* fix integ tests - make standard-tests backward compatible

* remove deleted sandbox adapter

* standard-test refactor for backwards compat

* linting

* fix tests

* fix integ test

* type guards and improved guard checks

* store mime type with v2 file data

* make explicit v1 types

* fix local shell int types

* fix bug

* edge case

* update providers

* lint

* fix repl

* read raw tests

* support string or Uint8Arrays

* clean comments, docstrings, and fix issue where we were throwing a string

* don't re-wrap uint8arrays

* clean up test names

* fix quickjs

* poison pill and uint8array issues

* add back video and audio support

* bump langgraph-checkpoint version for json plus serializer changes

* fix lock after merge with main

* remove explicit cast

* use instance of Uint8Array for FileDataV2 schema

* update changeset

* regen lock
Colin Francis (colifran) added a commit that referenced this pull request Mar 24, 2026
…" (#352)

* Revert "feat(deepagents): support multimodal files for backends (#298)"

This reverts commit aab678a.

* regen lock file

* fix empty string id check for sandbox protocols
Hunter Lovell (hntrl) pushed a commit that referenced this pull request Mar 24, 2026
Hunter Lovell (hntrl) pushed a commit that referenced this pull request Apr 1, 2026
Colin Francis (colifran) added a commit that referenced this pull request Apr 2, 2026
* Revert "revert: "feat(deepagents): support multimodal files for backends (#298)" (#352)" (#353)

This reverts commit 03ea1c9.

* revert: "revert: "feat(sdk): add async subagent middleware for remote LangGraph servers  (#323)" (#351)" (#354)

* Revert "revert: "feat(sdk): add async subagent middleware for remote LangGraph servers  (#323)" (#351)"

This reverts commit 367e43a.

* use any backend protocol

* Reapply "chore(deepagents): refactor backend method names - `lsInfo` -> `ls`, …" (#349) (#356)

This reverts commit 573479d.

* Reapply "chore(sdk): unify sync subagents and async subagents into a single pr…" (#348) (#355)

This reverts commit 96dc34c.

* chore: align alpha with main (#358)

* fix(deepagents): remove orphaned ToolMessages for Gemini compatibility (#335)

* fix(deepagents): remove orphaned ToolMessages for Gemini compatibility

* Fix ToolMessages for Gemini compatibility

---------

Co-authored-by: Christian Bromann <git@bromann.dev>

* fix(deepagents): throw on built-in tool collision (#330)

* add error

* Create big-horses-fail.md

* add config error class

* cr

---------

Co-authored-by: Christian Bromann <git@bromann.dev>

* fix(deepagents): use `crypto.randomUUID()` instead of uuid (#336)

* fix(deepagents): use crypto.randomUUID() instead of uuid

* update pnpm-lock

* Create grumpy-weeks-wave.md

* Update libs/deepagents/src/middleware/fs.int.test.ts

* feat(deepagent): add LangSmithSandbox (#324)

* feat(deepagent): add LangSmithSandbox

* Change deepagents version from patch to minor

* format

* fix tests

* format

* make it a patch

* cr

* cr

* fix

* cr

* chore: version packages (#321)

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* regen lockfile

* fix langsmith tests so that they use backend protocol v2 methods

* format

---------

Co-authored-by: pawel-twardziak <pawel.twardziak.dev@gmail.com>
Co-authored-by: Christian Bromann <git@bromann.dev>
Co-authored-by: Maahir Sachdev <maahir.sachdev@langchain.dev>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* feat(deepagents): add completion notifier middleware for async subagents (#334)

Port of langchain-ai/deepagents#2119 to TypeScript. Adds a
createCompletionNotifierMiddleware that async subagents can use to
proactively notify their supervisor when they complete or error,
closing the gap where the supervisor only learns about completion
when someone calls check_async_task.

- New createCompletionNotifierMiddleware with afterAgent and
  wrapModelCall hooks
- Uses @langchain/langgraph-sdk Client to send runs.create() to the
  supervisor's thread
- Reads parent_thread_id from subagent state (injected by
  start_async_task)
- Derives task_id from runtime.configurable.thread_id
- Silently no-ops if parent context is missing
- Guards against duplicate notifications
- 22 unit tests covering all hooks, edge cases, and error paths

fix(deepagents): make url required in completion notifier (no ASGI in JS)

JS does not have ASGI transport like Python, so the url parameter
must be provided explicitly. Removed all ASGI references from docs
and the localhost fallback default.

fix(deepagents): throw on built-in tool collision (#330)

* add error

* Create big-horses-fail.md

* add config error class

* cr

---------

Co-authored-by: Christian Bromann <git@bromann.dev>

fix(deepagents): use `crypto.randomUUID()` instead of uuid (#336)

* fix(deepagents): use crypto.randomUUID() instead of uuid

* update pnpm-lock

* Create grumpy-weeks-wave.md

* Update libs/deepagents/src/middleware/fs.int.test.ts

feat(deepagent): add LangSmithSandbox (#324)

* feat(deepagent): add LangSmithSandbox

* Change deepagents version from patch to minor

* format

* fix tests

* format

* make it a patch

* cr

* cr

* fix

* cr

regen lockfile

linting

linting

add missing url property

chore: version packages (#321)

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

changeset

regen lockfile

* chore: enter alpha pre-release

* chore: target alpha for releases

* chore: version packages (alpha) (#359)

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* chore(deepagents): extend supported backend file types (#363)

* extend supported file types

* Create strong-tigers-share.md

---------

Co-authored-by: Hunter Lovell <40191806+hntrl@users.noreply.github.com>

* chore(deepagents): implement async subagents + use stream example (#360)

* async subagents + use stream example

* fix lockfile

* format

* linting

* readme and linting

* format

* proactively send responses when subagents complete

* better examples

* feat(deepagents): rename completion notifier to completion callback and align with Python (#361)

* feat(deepagents): rename completion notifier to completion callback and align with Python PR

- Rename completion_notifier.ts -> completion_callback.ts to match Python's
  completion_callback.py naming
- Rename exports: createCompletionNotifierMiddleware -> createCompletionCallbackMiddleware,
  CompletionNotifierOptions -> CompletionCallbackOptions
- Rename state key: parent_thread_id -> callbackThreadId, option: parentGraphId -> callbackGraphId
- Make url optional (Python allows omitting for ASGI transport)
- Match Python's strict error behavior: throw on empty messages, non-AIMessage types,
  and missing callbackThreadId
- Add truncation suffix with task_id hint for long messages
- Use generic error message in wrapModelCall (don't leak error details)
- Remove duplicate notification guard (Python notifies on every error)
- Add extractCallbackContext to async_subagents.ts: injects callbackThreadId
  into subagent input state when launching via start_async_task
- Add tests for extractCallbackContext and callback context injection

* cr

* Rename completion notifier to completion callback

Renamed completion notifier to completion callback for consistency with Python.

* fix(sdk): `AsyncTask` `updatedAt` field doesn't update on task status changes (#400)

* update updatedAt field to change on any task update

* added changeset

* chore: set up self hosted async subagent example (#399)

* self hosted async subagent example

* with postgres

* formatting

* eslint disable no console

* fix dockerfile and readme

* Update examples/async-subagent-server/server.ts

Co-authored-by: Christian Bromann <git@bromann.dev>

---------

Co-authored-by: Christian Bromann <git@bromann.dev>

* chore(sdk): update async subagent middleware for agent protocol (#394)

* update async subagent middleware for agent protocol

* add changeset

* Update libs/deepagents/src/middleware/async_subagents.ts

Co-authored-by: Hunter Lovell <40191806+hntrl@users.noreply.github.com>

* Update libs/deepagents/src/middleware/async_subagents.ts

Co-authored-by: Hunter Lovell <40191806+hntrl@users.noreply.github.com>

* Update libs/deepagents/src/middleware/async_subagents.ts

Co-authored-by: Hunter Lovell <40191806+hntrl@users.noreply.github.com>

* differentiate agent protocol

---------

Co-authored-by: Hunter Lovell <40191806+hntrl@users.noreply.github.com>

* chore(repo): migrate linting and formatting to oxc tooling (#391)

* chore(repo): migrate linting and formatting to oxc tooling

* cr

* cr

* chore(lint): clean up console disables for oxlint

* cr

* Apply suggestions from code review

Co-authored-by: Christian Bromann <git@bromann.dev>

---------

Co-authored-by: Christian Bromann <git@bromann.dev>

* refactor(deepagents): clean up createDeepAgent middleware wiring (#392)

* refactor(deepagents): clean up createDeepAgent middleware wiring

* fix(deepagents): avoid duplicate HITL middleware on subagents

* add comments, remove iife

* Create ten-masks-flow.md

* fix(deepagents): align prompt templates with runtime behavior (#393)

* fix(deepagents): align prompt templates with runtime behavior

* chore: add changeset for prompt alignment fixes

* cr

* cr

* fix store backend and tests

* lint

* fix rests and resolveBackend

* lint

* fix failing tests

* revert adapt resolve backend

* fix resolve backend

* better variable name

* fix backend factory to return a maybe promise

* mark resolve backend as internal

* format

---------

Co-authored-by: Colin Francis <131073567+colifran@users.noreply.github.com>
Co-authored-by: pawel-twardziak <pawel.twardziak.dev@gmail.com>
Co-authored-by: Christian Bromann <git@bromann.dev>
Co-authored-by: Maahir Sachdev <maahir.sachdev@langchain.dev>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Colin Francis <colin.francis@langchain.dev>