Note: this is AI-generated, but I have verified everything stated here.
Summary
The browser-use cloud LLM bu-2-0 deterministically emits a JSON boolean true in the index slot of input_text actions when the agent encounters multi-action batches involving login pages. This violates the documented schema (index: integer >= 0).
The bug is masked in the Python SDK because pydantic.model_validate() in default lax mode silently coerces True → 1, so the agent types into element-index 1 (often visually close to the intended target on Auth0-style pages) and recovers via subsequent retries. In the TypeScript port (webllm/browser-use), zod rejects strictly, leading to the agent looping until max_failures and force-quitting.
Reproduction
A standalone script that replays a captured request is at:
Output (verbatim from one run, 6/6 attempts reproduced)
The full action batch returned by bu-2-0 for this request is:
[
  {"input_text": {"index": true, "text": "<TEST_USER_EMAIL>"}},
  {"input_text": {"index": 3, "text": "<TEST_USER_PASSWORD>"}},
  {"click": {"index": 6}}
]
Note that only the first input_text.index is the buggy boolean; the second input_text (password field) uses a correct integer 3, and the click uses 6. This pattern is consistent across runs.
Why pydantic hides it
Verified locally:
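A minimal sketch of such a check, assuming pydantic v2. InputTextParams is a hypothetical stand-in for the SDK's real action model, not its actual definition; only the index constraint (integer >= 0) is taken from this report.

```python
import json

from pydantic import BaseModel, Field, ValidationError


class InputTextParams(BaseModel):
    """Hypothetical stand-in for the SDK's input_text parameters."""

    index: int = Field(ge=0)
    text: str


# First action of the batch, parsed the way a JSON response body would be.
payload = json.loads('{"index": true, "text": "<TEST_USER_EMAIL>"}')

# Default (lax) mode: the boolean passes validation as the value 1.
lax = InputTextParams.model_validate(payload)
print(lax.index)

# Strict mode: the same payload is rejected instead of coerced.
try:
    InputTextParams.model_validate(payload, strict=True)
    strict_rejected = False
except ValidationError:
    strict_rejected = True
print(strict_rejected)
```

Passing strict=True per call is only one way to opt out of coercion; whether the SDK could adopt it without breaking other lenient inputs is not something this sketch establishes.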
Pydantic's default lax mode coerces bool → int. The Python agent then types into element 1 (or 0 when the boolean is false), retries on failure, and often recovers. The TypeScript port (webllm/browser-use) validates with zod, which does not coerce types by default and hard-rejects the boolean, leading to retry loops and max_failures force-quits.
Impact
In production, this affects every form-fill flow on Auth0 / Universal Login pages. Observed on daytona.io, zeroentropy.dev, onkernel.com, browserbase.com. The TS port is fully blocked on these flows; the Python port silently uses wrong indices and recovers.
Suggested fixes
Model-side (real fix): investigate why bu-2-0 emits booleans for the first input_text index slot in multi-action batches on login pages. The token-level pattern is consistent enough to suggest a fine-tuning artifact.
TS-side workaround (already implemented): webllm/browser-use#35, "fix(controller): coerce booleans to ints for action index fields (pydantic parity)", adds pydantic-parity boolean→int coercion for index fields in the TS port.
Python-side awareness: the Python SDK silently masks this. Worth surfacing a warning when boolean→int coercion happens on action params, since the agent is then typing into the wrong element.
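The Python-side suggestion could be sketched as a pre-validation scan of the raw parsed batch, using only the standard library. The helper names find_boolean_indices and warn_on_boolean_indices are hypothetical, and the batch shape is assumed to match the one shown in this report; this is not the SDK's actual code path.

```python
import json
import logging

logger = logging.getLogger(__name__)

INDEX_KEY = "index"  # field name taken from the action batch in this report


def find_boolean_indices(actions):
    """Return (position, action_name) for each action whose index is a JSON boolean."""
    hits = []
    for position, action in enumerate(actions):
        for name, params in action.items():
            # bool is a subclass of int in Python, so test for bool explicitly.
            if isinstance(params, dict) and isinstance(params.get(INDEX_KEY), bool):
                hits.append((position, name))
    return hits


def warn_on_boolean_indices(actions):
    """Log one warning per boolean index and return the hits."""
    hits = find_boolean_indices(actions)
    for position, name in hits:
        logger.warning(
            "action %d (%s): index is a boolean and would be coerced to an int",
            position,
            name,
        )
    return hits


batch = json.loads(
    '[{"input_text": {"index": true, "text": "<TEST_USER_EMAIL>"}},'
    ' {"input_text": {"index": 3, "text": "<TEST_USER_PASSWORD>"}},'
    ' {"click": {"index": 6}}]'
)
hits = warn_on_boolean_indices(batch)
```

Scanning before validation keeps the check independent of pydantic's coercion rules, at the cost of duplicating knowledge of which fields are indices.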
Environment
bu-2-0, browser_agent, false