
fix(registry): render action params as JSON Schema instead of zod _def dump #33

Merged
unadlib merged 4 commits into webllm:main from caffeinum:fix/prompt-jsonschema
May 8, 2026
@caffeinum caffeinum commented May 6, 2026

Contributor

Disclaimer: this is AI-generated, but I ran into this issue myself and tested the fix in prod. The same applies to #34.

Parity with Python upstream

The TypeScript port of RegisteredAction.promptDescription was JSON-stringifying zod's private _def AST, exposing internals like innerType/defaultValue/def to the LLM. Python upstream uses pydantic.BaseModel.model_json_schema() and avoids this entirely. This PR brings the TS implementation to parity by switching to z.toJSONSchema() (zod v4, already in the dep tree).

Reference: browser_use/tools/registry/views.py:31-56 on browser-use/browser-use main. Specifically views.py:33:

schema = self.param_model.model_json_schema()

Problem

For example, ScrollActionSchema was rendered to the LLM as:

{
  "down": {
    "type": "default",
    "innerType": { "def": { "type": "boolean" } },
    "defaultValue": true
  },
  "num_pages": {
    "type": "default",
    "innerType": { "def": { "type": "number" } },
    "defaultValue": 1
  }
}

This is the wrong level of information for the model: it shows zod internals rather than a parameter schema. Worse, "defaultValue": true (for down) sits right next to num_pages, which plausibly leads the model to copy the wrong default. In real runs we repeatedly observed the model emitting num_pages: <boolean>.

Fix

Switch promptDescription to render the param schema via z.toJSONSchema(...). The same ScrollActionSchema now renders as:

{
  "down": { "type": "boolean", "default": true },
  "num_pages": { "type": "number", "default": 1 }
}

Clean, standard, and unambiguous about which default belongs to which field.

Impact

In a downstream consumer (canary-env), this fix combined with the retry-feedback fix (#34) took an eval from "BA bails at step 5 with done(success=false) after a 3-fail cascade" to "BA completes at step 20 with run_complete".

Standalone effect of this PR: removes the _def cascade entirely. Residual num_pages: <boolean> errors still happen ~5x/run but now self-correct via retry feedback (#34).

Tests

  • Added test/promptDescription.test.ts covering the JSON Schema rendering.
  • Full suite: 968 pass / 0 fail.

Caveats

  • The TS fix produces somewhat richer JSON Schema output than Python's terser name=type (description) per param. Open to matching the Python format if reviewers prefer.
  • Relies on z.toJSONSchema from zod v4, which is already a dep. No new dependency.
  • The prompt text the LLM sees changes shape — anything that was prompt-scraping the old _def dump will need to read JSON Schema instead. (We don't believe anything was; this was an internal accident, not a contract.)

🤖 Generated with Claude Code

…f dump

RegisteredAction.promptDescription previously serialized each property of
the zod object schema by stringifying its private `_def` AST. For schemas
with `.default()` wrappers (e.g. ScrollActionSchema's `down` and
`num_pages`), the LLM would see something like:

  "num_pages": {"type":"default","innerType":{"def":{"type":"number"},...},"defaultValue":1},
  "down":      {"type":"default","innerType":{"def":{"type":"boolean"},...},"defaultValue":true}

The model would plausibly copy the nearby `defaultValue: true` and emit a
boolean for `num_pages`. The schema correctly rejected it, the same prompt
was fed back, and the same mistake recurred until `max_failures=3` tripped.

Replace the `_def` walk with `z.toJSONSchema(schema, {unrepresentable:'any'})`
(zod v4 native), strip the `$schema` dialect URL, and apply the existing
skipKeys filter to both `properties` and `required`. The LLM now sees:

  {"type":"object","properties":{"down":{"default":true,"type":"boolean"},
   "num_pages":{"default":1,"type":"number"},...}, "required":[...],
   "additionalProperties":false}

— a familiar, well-known JSON Schema shape with no zod-internal leakage.

The surrounding `${description}: \n{${name}: ...}` envelope is unchanged, so
the LLM sees the same outer layout.
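
The post-processing described above (strip the `$schema` dialect URL, then apply the skipKeys filter to both `properties` and `required`) can be sketched without zod as a plain function over the generated JSON Schema object. This is a minimal illustration, not the PR's actual code: `JsonSchemaObject`, `stripAndFilter`, and the `internal_id` field are hypothetical names chosen for the example.

```typescript
// Illustrative sketch only. `JsonSchemaObject`, `stripAndFilter`, and the
// `internal_id` field are hypothetical; they do not come from the PR.
type JsonSchemaObject = {
  $schema?: string;
  type?: string;
  properties?: Record<string, unknown>;
  required?: string[];
  [key: string]: unknown;
};

function stripAndFilter(
  raw: JsonSchemaObject,
  skipKeys: Set<string>
): JsonSchemaObject {
  // Shallow copy so the caller's object is not mutated.
  const result: JsonSchemaObject = { ...raw };

  // The dialect URL carries no information the LLM needs.
  delete result.$schema;

  // Drop skipped keys from `properties`...
  if (raw.properties) {
    const kept: Record<string, unknown> = {};
    for (const key of Object.keys(raw.properties)) {
      if (!skipKeys.has(key)) {
        kept[key] = raw.properties[key];
      }
    }
    result.properties = kept;
  }

  // ...and from `required`, so the two lists stay consistent.
  if (raw.required) {
    result.required = raw.required.filter((key) => !skipKeys.has(key));
  }

  return result;
}

// Sample input shaped like a generated object schema.
const sample: JsonSchemaObject = {
  $schema: "https://json-schema.org/draft/2020-12/schema",
  type: "object",
  properties: {
    down: { type: "boolean", default: true },
    num_pages: { type: "number", default: 1 },
    internal_id: { type: "string" }, // hypothetical field to be skipped
  },
  required: ["internal_id"],
};

const rendered = stripAndFilter(sample, new Set(["internal_id"]));
console.log(JSON.stringify(rendered));
```

Filtering `required` together with `properties` matters: a skipped key left behind in `required` would tell the model to supply a parameter it cannot see.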

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@caffeinum caffeinum marked this pull request as ready for review May 6, 2026 01:44
@caffeinum caffeinum requested a review from unadlib as a code owner May 6, 2026 01:44
Comment thread src/controller/registry/views.ts Outdated
schema: ZodTypeAny,
skipKeys: Set<string>
): Record<string, unknown> {
const raw = z.toJSONSchema(schema, { unrepresentable: 'any' }) as Record<

Member


This should use input-mode JSON Schema generation: z.toJSONSchema(schema, { io: 'input', unrepresentable: 'any' }).

Without io: 'input', Zod treats .default() fields as required in the generated schema. That makes prompts say defaulted action params like scroll.down, scroll.num_pages, and done.success are required, even though they are optional inputs at runtime. Since this schema is shown to the LLM as the action input contract, it should describe accepted input rather than parsed output.
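
To make the input/output distinction concrete, here is a zod-free sketch of the contract difference the comment describes: for input, a property that carries a `default` should not appear in `required`. It only emulates the effect on an already generated schema and is not how zod implements `io: 'input'`. `ObjectSchema`, `requiredForInput`, and the `query` field are hypothetical names used for illustration.

```typescript
// Illustrative emulation only: zod computes this internally when given
// `io: 'input'`. `ObjectSchema`, `requiredForInput`, and the `query` field
// are hypothetical names for this example.
type PropertySchema = { type?: string; default?: unknown };

type ObjectSchema = {
  properties: Record<string, PropertySchema>;
  required: string[];
};

// A caller may omit any property that has a default, so from the input
// side only properties without a default remain required.
function requiredForInput(schema: ObjectSchema): string[] {
  return schema.required.filter((key) => {
    const prop = schema.properties[key];
    return prop === undefined || !("default" in prop);
  });
}

// Output-mode shape: defaulted fields are listed as required because the
// parsed result always contains them.
const outputModeScroll: ObjectSchema = {
  properties: {
    down: { type: "boolean", default: true },
    num_pages: { type: "number", default: 1 },
    query: { type: "string" }, // hypothetical field without a default
  },
  required: ["down", "num_pages", "query"],
};

const inputRequired = requiredForInput(outputModeScroll);
console.log(inputRequired);
```

With the output-mode list, the model is told it must always send `down` and `num_pages`; with the input-mode list, it may omit them and rely on the defaults.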

Per @unadlib's review on PR webllm#33: z.toJSONSchema without `io: 'input'`
treats `.default()` fields as required in the generated schema. The
schema is shown to the LLM as the action input contract, so it should
describe accepted input, not parsed output. Without this fix, defaulted
fields like scroll.num_pages and done.success are shown to the model as
required, which misrepresents the input contract.

Also adds scripts/dump-schema.ts so the rendered schema for any
registered action can be inspected from the CLI:

  bun run scripts/dump-schema.ts done
  bun run scripts/dump-schema.ts scroll
  bun run scripts/dump-schema.ts --all
  bun run scripts/dump-schema.ts --list

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Co-Authored-By: unadlib <unadlib@noreply.github.com>
Comment thread package.json Outdated
caffeinum and others added 2 commits May 7, 2026 17:07
Refactors scripts/dump-schema.ts to call RegisteredAction.getPromptJsonSchema()
instead of reimplementing z.toJSONSchema. The new method (and the underlying
renderParamsJsonSchema helper) are exported from src/controller/registry/views.ts
so the script exercises the exact code path RegisteredAction.promptDescription()
uses to render the prompt for the LLM.

Also drops the pnpm dump-schema package.json shortcut — the script stays
runnable via bun/tsx directly.
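
As a rough picture of how a script with this command-line surface (an action name, `--all`, or `--list`) might choose what to dump, the sketch below separates argument handling from rendering. It is an assumption about the shape of such a script, not the contents of scripts/dump-schema.ts: `Plan`, `planDump`, and the hard-coded action names are hypothetical.

```typescript
// Hypothetical sketch of argument handling for a dump-schema style CLI.
// `Plan` and `planDump` are illustrative names, not taken from the PR.
type Plan =
  | { kind: "list" }
  | { kind: "dump"; actions: string[] };

function planDump(args: string[], registered: string[]): Plan {
  // `--list` only prints the registered action names.
  if (args.indexOf("--list") !== -1) {
    return { kind: "list" };
  }
  // `--all` dumps every registered action.
  if (args.indexOf("--all") !== -1) {
    return { kind: "dump", actions: registered.slice() };
  }
  // Otherwise every argument must name a registered action.
  const unknown = args.filter((name) => registered.indexOf(name) === -1);
  if (args.length === 0 || unknown.length > 0) {
    const detail = unknown.length > 0 ? unknown.join(", ") : "(none given)";
    throw new Error(`Unknown or missing action: ${detail}`);
  }
  return { kind: "dump", actions: args.slice() };
}

// In a real script the names would come from the action registry and the
// arguments from process.argv; both are hard-coded here for illustration.
const registeredNames = ["done", "scroll"];
const plan = planDump(["scroll"], registeredNames);
console.log(plan);
```

For each name in a `dump` plan, the script would then print whatever `RegisteredAction.getPromptJsonSchema()` returns for that action, which keeps the CLI on the same code path as the prompt.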

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@unadlib unadlib merged commit 686a3c4 into webllm:main May 8, 2026
unadlib commented May 8, 2026

Member

browser-use v0.6.1 has been released.

@caffeinum caffeinum deleted the fix/prompt-jsonschema branch May 19, 2026 18:22
2 participants