BrowserTrace: local trace viewer for Browser Use failures #4816

aaronlab · 2026-05-10T12:30:24Z

I built BrowserTrace, a small MIT-licensed local trace viewer for failed Browser Use and browser-agent runs.

Why I am sharing it here: BrowserTrace includes a Browser Use run-hook path for apps that call:

await agent.run(on_step_start=..., on_step_end=...)

It also keeps the callback-style attach_tracer(agent, ...) path for agents that expose register_new_step_callback or compatible callback attributes.

A minimal run-hook setup looks like this:

import asyncio

from browser_use import Agent
from browsertrace import Tracer
from browsertrace.integrations.browser_use import create_run_hooks


async def main():
    tracer = Tracer()
    agent = Agent(task="...", llm=...)  # supply your own task and LLM here
    hooks = create_run_hooks(tracer, name="browser-use run")

    # The hooks object wraps the run as a context manager.
    with hooks:
        await agent.run(on_step_start=hooks.on_step_start, on_step_end=hooks.on_step_end)


asyncio.run(main())

BrowserTrace keeps traces local by default and records the Browser Use failure timeline when fields are available: URL, screenshot flag, latest thought, model action, extracted content, status, and errors. It can also export a standalone public-safe HTML trace with prompts/model I/O, screenshots, and URLs omitted.
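
To illustrate the public-safe export idea, here is a minimal sketch that drops sensitive fields from a step record before it is shared. The field names and the `redact_step` helper are hypothetical, chosen only for this example; they are not BrowserTrace's actual schema or API.

```python
from typing import Any

# Hypothetical field names; BrowserTrace's real schema may differ.
SENSITIVE_FIELDS = {"url", "screenshot", "prompt", "model_input", "model_output"}


def redact_step(step: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of a step record with the sensitive fields removed."""
    return {key: value for key, value in step.items() if key not in SENSITIVE_FIELDS}


# Illustrative step record, not real trace data.
step = {
    "index": 3,
    "url": "https://example.com/checkout",
    "action": "click",
    "status": "failed",
    "error": "element not found",
    "prompt": "Buy the item in the cart",
}
public_step = redact_step(step)
print(public_step)
```

An explicit deny-list keeps the rule easy to audit, though an allow-list would be the more conservative choice if new fields are added often.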

Quick no-API trial path from PyPI:

uvx --from "browsertrace[ui]" browsertrace doctor
uvx --from "browsertrace[ui]" browsertrace demo
uvx --from "browsertrace[ui]" browsertrace

Persistent install from PyPI:

pip install "browsertrace[ui]"

Repo: https://github.com/aaronlab/browsertrace
PyPI: https://pypi.org/project/browsertrace/
Browser Use guide: https://aaronlab.github.io/browsertrace/browser-use-debugging.html

Feedback I am looking for from Browser Use users: when a run fails, which fields matter most for debugging? Task text, memory, extracted content, selected element, retry state, screenshots, model actions, final result, or something else?

No star/upvote ask; I am trying to make the Browser Use failure report useful before adding more adapter surface.

armorer-labs · 2026-05-12T20:59:47Z

This is a useful direction. Browser agents are one of those places where the final answer is usually not enough to debug the failure.

For a local trace viewer, I would make a few things first-class:

- one stable run_id for the whole browser task
- step-level records: URL, title, action chosen, target element summary, model/tool output, latency, and error
- screenshot references, but not as the only evidence
- a diffable final result: what was extracted, clicked, submitted, or changed
- retry/repair attempts linked back to the failed step they were trying to fix
- version metadata: browser-use version, model/provider, prompt/template version, and config
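
One way to picture the list above is as a pair of record types. This is only a sketch under assumptions: `StepRecord`, `RunRecord`, `retry_of`, and the other field names are hypothetical and do not describe how BrowserTrace or Browser Use store traces.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class StepRecord:
    index: int
    url: str
    title: str
    action: str
    target_summary: str = ""
    output: str = ""
    latency_ms: Optional[float] = None
    error: Optional[str] = None
    screenshot_ref: Optional[str] = None  # a reference, not the only evidence
    retry_of: Optional[int] = None  # index of the failed step this one tries to repair


@dataclass
class RunRecord:
    run_id: str  # one stable id for the whole browser task
    versions: dict[str, str] = field(default_factory=dict)
    steps: list[StepRecord] = field(default_factory=list)
    final_result: Optional[str] = None

    def retries_for(self, failed_index: int) -> list[StepRecord]:
        """Return the steps that were attempts to repair the given failed step."""
        return [step for step in self.steps if step.retry_of == failed_index]


# Illustrative data only.
run = RunRecord(run_id="run-001", versions={"model": "example-model"})
run.steps.append(StepRecord(index=0, url="https://example.com", title="Home",
                            action="click", error="timeout"))
run.steps.append(StepRecord(index=1, url="https://example.com", title="Home",
                            action="click", retry_of=0))
print([step.index for step in run.retries_for(0)])
```

Storing the link on the retry step (`retry_of`) rather than on the failed step means a failed step never has to be rewritten after the fact.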

The most useful viewer is probably not just “watch the run.” It is “compare this failed run to the last successful run and see where the path diverged.” That comparison view would be valuable for anyone trying to turn browser-use from a demo into a repeatable workflow.

aaronlab · 2026-05-12T21:06:32Z

aaronlab
May 12, 2026
Author

Thanks, this is useful feedback. I agree that for Browser Use debugging the real workflow is often “where did this fail relative to the last passing run?”, not only replaying one trace.

I opened a BrowserTrace issue to track the first local version: aaronlab/browsertrace#369

For v0.1 I would keep it explicit and conservative:

- compare a failed run with a selected successful run, not an automatic global baseline
- use Browser Use / BrowserTrace / model / prompt / config metadata so the UI does not imply two unrelated runs are comparable
- highlight the first divergent step by URL/title/action/target element summary/model output/error
- show retry/repair attempts as linked evidence around the failed step
- include a diffable final result when extracted/clicked/submitted/changed output is available
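
The metadata check in the second point could look roughly like the sketch below. The key names and the `metadata_mismatches` helper are assumptions made for illustration, not BrowserTrace's implementation.

```python
from typing import Any

# Hypothetical metadata keys that two runs should share to be comparable.
COMPARABLE_KEYS = ("browser_use", "browsertrace", "model", "prompt", "config")


def metadata_mismatches(
    failed_meta: dict[str, Any], success_meta: dict[str, Any]
) -> dict[str, tuple[Any, Any]]:
    """Map each differing key to its (failed, success) pair of values."""
    mismatches = {}
    for key in COMPARABLE_KEYS:
        failed_value = failed_meta.get(key)
        success_value = success_meta.get(key)
        if failed_value != success_value:
            mismatches[key] = (failed_value, success_value)
    return mismatches


# Illustrative metadata only.
failed_meta = {"model": "example-model", "prompt": "v2", "config": "default"}
success_meta = {"model": "example-model", "prompt": "v1", "config": "default"}
print(metadata_mismatches(failed_meta, success_meta))
```

A UI could show these mismatches as a warning instead of refusing the comparison, so the user decides whether the two runs are still worth diffing.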

The detail I am still sorting out is the Browser Use final-result shape: should it expose separate fields for extracted content, tool output, and final result, or one normalized summary? If you have a concrete pass/fail pair you would expect to compare, that would help make the UI less vague.

aaronlab · 2026-05-13T00:08:06Z

aaronlab
May 13, 2026
Author

Follow-up: I shipped the first small version of the failed-vs-good comparison path in BrowserTrace v0.1.19.

It is intentionally explicit for now:

browsertrace compare <failed_run_id> <success_run_id>
browsertrace compare <failed_run_id> <success_run_id> --json

The first slice compares existing step fields (action, url, status, and error) and reports the first divergent step between two selected runs. It does not yet auto-pick a baseline, change the SQLite schema, or add a UI diff view.
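
For readers who want the idea in code, here is a minimal sketch of a first-divergence check over those four fields. It is not BrowserTrace's implementation: `first_divergence`, the dict-based step shape, and the returned keys are all assumptions made for illustration.

```python
from itertools import zip_longest
from typing import Any, Optional

COMPARED_FIELDS = ("action", "url", "status", "error")


def first_divergence(
    failed_steps: list[dict[str, Any]], success_steps: list[dict[str, Any]]
) -> Optional[dict[str, Any]]:
    """Return the first step (0-based) where the runs differ, or None if they match."""
    pairs = zip_longest(failed_steps, success_steps)
    for index, (failed, good) in enumerate(pairs):
        # One run ended before the other: that is itself a divergence.
        if failed is None or good is None:
            missing_in = "failed" if failed is None else "success"
            return {"step": index, "fields": list(COMPARED_FIELDS), "missing_in": missing_in}
        differing = [name for name in COMPARED_FIELDS if failed.get(name) != good.get(name)]
        if differing:
            return {"step": index, "fields": differing, "missing_in": None}
    return None


# Illustrative steps only.
shared = {"action": "goto", "url": "https://example.com", "status": "ok", "error": None}
failed_run = [shared, {"action": "click", "url": "https://example.com/login",
                       "status": "failed", "error": "element not found"}]
good_run = [shared, {"action": "click", "url": "https://example.com/login",
                     "status": "ok", "error": None}]
print(first_divergence(failed_run, good_run))
# {'step': 1, 'fields': ['status', 'error'], 'missing_in': None}
```

Comparing by position assumes both runs take the same path up to the failure; runs that reorder steps would need an alignment step first.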

Release notes:
https://github.com/aaronlab/browsertrace/releases/tag/v0.1.19

The next Browser Use-specific question is still the final-result shape: for a pass/fail pair, would you expect comparison to separate extracted content, tool output, final result, and retry/repair attempts, or normalize those into one summary first?

A concrete failed run + known-good run shape would help keep the UI honest. No stars/upvotes requested; I am looking for workflow feedback from people turning Browser Use runs into repeatable workflows.

aaronlab · 2026-05-13T09:43:08Z

aaronlab
May 13, 2026
Author

Small follow-up: BrowserTrace v0.1.20 now exposes the same failed-vs-good comparison payload through the local UI server too:

GET /api/compare/<failed_run_id>/<success_run_id>

So the current Browser Use debugging path is:

browsertrace compare <failed_run_id> <success_run_id>
browsertrace compare <failed_run_id> <success_run_id> --json
curl http://127.0.0.1:3000/api/compare/<failed_run_id>/<success_run_id>

The API is meant for local dashboards, scripts, or automation preflight checks that want the first divergent action, URL, status, or error before opening the full trace UI.
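
A script could call the endpoint with the standard library alone, as in this sketch. The host and port are taken from the curl example above; the helper names are hypothetical, and the response is returned as parsed JSON without assuming its shape, since the payload is not documented in this thread.

```python
import json
from typing import Any
from urllib.parse import quote
from urllib.request import urlopen

BASE_URL = "http://127.0.0.1:3000"  # local UI server address from the curl example


def compare_url(failed_run_id: str, success_run_id: str, base_url: str = BASE_URL) -> str:
    """Build the compare endpoint URL, percent-encoding both run ids."""
    failed = quote(failed_run_id, safe="")
    success = quote(success_run_id, safe="")
    return f"{base_url}/api/compare/{failed}/{success}"


def fetch_comparison(failed_run_id: str, success_run_id: str, timeout: float = 5.0) -> Any:
    """Fetch the comparison payload; requires the local UI server to be running."""
    with urlopen(compare_url(failed_run_id, success_run_id), timeout=timeout) as response:
        return json.load(response)


# Building the URL needs no server; "run-a" and "run-b" are placeholder ids.
print(compare_url("run-a", "run-b"))
```

With the server running, `fetch_comparison("<failed_run_id>", "<success_run_id>")` would return whatever JSON the endpoint serves.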

The most useful feedback is still concrete: for real Browser Use failed-vs-good pairs, should the next comparison field be final result, extracted content, tool output, retry/repair attempts, selected element summaries, or version/config metadata?
