Note
This is an AI-generated research report. All text and code in this report was created by an LLM (Large Language Model). For more information on how these reports are created, see the main research repository.
This investigation explores the Vibium browser automation project and implements a Python client library compatible with the existing Go binary and Node.js client.
Vibium is browser automation infrastructure built for AI agents. It provides:
- A single Go binary ("clicker") that handles browser lifecycle and WebDriver BiDi protocol
- A Node.js client library for JavaScript developers
- An MCP server for LLM agent integration
This investigation adds a Python client library with both sync and async APIs.
    ┌─────────────────────────────────────────────────────────────┐
    │                         LLM / Agent                         │
    │          (Claude Code, Codex, Gemini, Local Models)         │
    └─────────────────────────────────────────────────────────────┘
                          ▲
                          │ MCP Protocol (stdio)
                          ▼
               ┌─────────────────────┐
               │   Vibium Clicker    │
               │     (Go Binary)     │
               │                     │
               │  ┌───────────────┐  │
               │  │  MCP Server   │  │
               │  └───────▲───────┘  │         ┌──────────────────┐
               │          │          │         │                  │
               │  ┌───────▼───────┐  │WebSocket│                  │
               │  │  BiDi Proxy   │  │◄───────►│  Chrome Browser  │
               │  └───────────────┘  │  BiDi   │                  │
               │                     │         │                  │
               └─────────────────────┘         └──────────────────┘
                          ▲
                          │ WebSocket BiDi :9515
                          ▼
    ┌─────────────────────────────────────────────────────────────┐
    │              Client Libraries (JS/TS, Python)               │
    └─────────────────────────────────────────────────────────────┘
The clicker binary (clicker/) is the core automation engine:
| Component | Path | Description |
|---|---|---|
| CLI | cmd/clicker/main.go | Cobra-based CLI with commands: serve, mcp, navigate, click, etc. |
| BiDi Client | internal/bidi/ | WebDriver BiDi protocol implementation |
| Browser Launcher | internal/browser/ | Chrome/chromedriver management |
| Proxy Server | internal/proxy/ | WebSocket server routing messages between client and browser |
| MCP Server | internal/mcp/ | JSON-RPC 2.0 over stdio for LLM integration |
| Actionability | internal/features/ | Auto-wait and visibility checks |
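
The table describes the MCP server as JSON-RPC 2.0 over stdio. As a rough sketch of what that framing looks like from the caller's side, the helpers below build newline-delimited JSON-RPC 2.0 requests and unpack responses. The function names `encode_request` and `decode_response` are hypothetical, `tools/list` is a method name from the MCP specification rather than from this report, and none of this is taken from the clicker source.

```python
import json
from itertools import count
from typing import Any, Dict, Optional

_ids = count(1)  # monotonically increasing request ids


def encode_request(method: str, params: Optional[Dict[str, Any]] = None) -> str:
    """Serialize one JSON-RPC 2.0 request as a single newline-terminated line."""
    message: Dict[str, Any] = {"jsonrpc": "2.0", "id": next(_ids), "method": method}
    if params is not None:
        message["params"] = params
    return json.dumps(message) + "\n"


def decode_response(line: str) -> Any:
    """Return the result of a JSON-RPC 2.0 response line, or raise on error."""
    message = json.loads(line)
    if message.get("jsonrpc") != "2.0":
        raise ValueError("not a JSON-RPC 2.0 message")
    if "error" in message:
        err = message["error"]
        raise RuntimeError(f"JSON-RPC error {err.get('code')}: {err.get('message')}")
    return message["result"]


# Example: the line a caller would write to the server's stdin.
request_line = encode_request("tools/list")
```

A real MCP session begins with an initialization handshake before any other request, which this sketch leaves out.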
The client communicates with the clicker over WebSocket using the WebDriver BiDi protocol:

1. Start the clicker serve process:

    clicker serve --headless
    # Outputs: Server listening on ws://localhost:9515

2. Connect over WebSocket and send commands:

    // Get browsing context
    {"id": 1, "method": "browsingContext.getTree", "params": {}}

    // Navigate to URL
    {"id": 2, "method": "browsingContext.navigate", "params": {
      "context": "<context-id>",
      "url": "https://example.com",
      "wait": "complete"
    }}

3. Custom Vibium extension commands:

    // Find element with auto-wait
    {"id": 3, "method": "vibium:find", "params": {
      "context": "<context-id>",
      "selector": "h1",
      "timeout": 30000
    }}
    // Response: {"id": 3, "type": "success", "result": {"tag": "h1", "text": "Hello", "box": {...}}}

    // Click with actionability checks
    {"id": 4, "method": "vibium:click", "params": {
      "context": "<context-id>",
      "selector": "button",
      "timeout": 30000
    }}

    // Type with actionability checks
    {"id": 5, "method": "vibium:type", "params": {
      "context": "<context-id>",
      "selector": "input",
      "text": "Hello World",
      "timeout": 30000
    }}

The vibium-python library provides both sync and async APIs matching the Node.js client design.
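
Whichever API is used, a client ultimately has to frame commands and unpack responses in the shape shown in the protocol steps. The following is a minimal sketch of that bookkeeping with no network code. The names `CommandEncoder`, `CommandError`, and `decode_result` are hypothetical and are not the vibium-python API. The error fields (`error`, `message`) follow the WebDriver BiDi specification, and it is an assumption that the clicker reports failures the same way.

```python
import json
from itertools import count
from typing import Any, Dict


class CommandEncoder:
    """Builds BiDi-style command messages with unique, increasing ids."""

    def __init__(self) -> None:
        self._ids = count(1)

    def encode(self, method: str, params: Dict[str, Any]) -> str:
        return json.dumps({"id": next(self._ids), "method": method, "params": params})


class CommandError(Exception):
    """Raised when the server answers a command with type "error"."""


def decode_result(raw: str, expected_id: int) -> Dict[str, Any]:
    """Return the "result" payload of the response matching expected_id."""
    message = json.loads(raw)
    if message.get("id") != expected_id:
        raise ValueError(f"response id {message.get('id')} does not match {expected_id}")
    if message.get("type") == "error":
        # Assumed error shape, per the WebDriver BiDi specification.
        raise CommandError(f"{message.get('error')}: {message.get('message')}")
    return message["result"]


encoder = CommandEncoder()
find_cmd = encoder.encode(
    "vibium:find",
    {"context": "<context-id>", "selector": "h1", "timeout": 30000},
)
# A response shaped like the one in the protocol steps (the "box" field is omitted here).
reply = '{"id": 1, "type": "success", "result": {"tag": "h1", "text": "Hello"}}'
element = decode_result(reply, expected_id=1)
print(element["tag"], element["text"])
```

A real client would also need to route responses that arrive out of order and handle messages that carry no `id`, such as events. The sketch skips both.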
Install the dependencies:

    cd vibium-python
    uv sync

Synchronous usage:

    from vibium_python import browser

    vibe = browser.launch(headless=True, executable_path="/path/to/clicker")
    vibe.go("https://example.com")

    element = vibe.find("h1")
    print(f"Found: {element.tag} - {element.text}")

    button = vibe.find("button")
    button.click()

    input_elem = vibe.find("input")
    input_elem.type("Hello World")

    png_data = vibe.screenshot()
    vibe.quit()

Asynchronous usage:

    import asyncio
    from vibium_python import async_browser

    async def main():
        vibe = await async_browser.launch(headless=True, executable_path="/path/to/clicker")
        await vibe.go("https://example.com")

        h1 = await vibe.find("h1")
        print(f"Found: {h1.text}")

        await vibe.quit()

    asyncio.run(main())

Project layout:

vibium-python/
├── src/vibium_python/
│ ├── __init__.py # Exports all APIs
│ ├── browser.py # Sync launch()
│ ├── vibe.py # Sync Vibe class
│ ├── element.py # Sync Element class
│ ├── bidi.py # Sync BiDi WebSocket client
│ ├── clicker.py # Clicker process management
│ ├── async_browser.py # Async launch()
│ ├── async_vibe.py # Async Vibe class
│ ├── async_element.py # Async Element class
│ └── async_bidi.py # Async BiDi WebSocket client
├── tests/
│ ├── conftest.py # Fixtures (HTTP server, clicker path)
│ ├── fixtures/ # Test HTML files
│ └── test_*.py # Test files
└── pyproject.toml

Run the test suite:

    cd vibium-python
    uv run pytest -v

All 21 tests pass:
- 15 sync tests (launch, navigation, find, click, type, screenshot)
- 6 async tests (matching sync functionality)
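
The project layout lists conftest.py as providing an HTTP server fixture for the test HTML files. One way such a fixture could work, using only the standard library, is sketched below. `serve_directory` is a hypothetical helper and not the code in the repository. It serves a directory on a free localhost port from a background thread and shuts the server down on exit.

```python
import contextlib
import functools
import http.server
import threading
from pathlib import Path
from typing import Iterator


class _QuietHandler(http.server.SimpleHTTPRequestHandler):
    def log_message(self, format, *args):  # silence per-request logging
        pass


@contextlib.contextmanager
def serve_directory(root: Path) -> Iterator[str]:
    """Serve root over HTTP on a free localhost port and yield the base URL."""
    handler = functools.partial(_QuietHandler, directory=str(root))
    # Port 0 asks the OS for any free port, so parallel test runs do not collide.
    server = http.server.ThreadingHTTPServer(("127.0.0.1", 0), handler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        yield f"http://127.0.0.1:{server.server_port}"
    finally:
        server.shutdown()
        server.server_close()
        thread.join()
```

In a pytest suite this context manager could be wrapped in a session-scoped fixture that yields the base URL, so each test calls something like `vibe.go(base_url + "/index.html")` against a local page.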

Key findings:

- Protocol Extension: Vibium extends WebDriver BiDi with custom "vibium:" methods for actionability-aware operations
- Auto-wait: The vibium:find, vibium:click, and vibium:type commands poll for elements and check visibility/stability before acting
- Architecture Simplicity: The Go binary handles all browser complexity; clients only need WebSocket JSON messaging
- MCP Integration: The same binary can run as an MCP server for LLM agent integration
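
The auto-wait finding describes commands that poll until an element is ready or a timeout expires. The actual logic lives in the Go binary and is not reproduced here. The sketch below only illustrates the general poll-until-deadline pattern in Python. `wait_for` is a hypothetical helper, the 30000 ms default mirrors the timeout used in the protocol examples, and the 100 ms interval is an assumption.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def wait_for(
    probe: Callable[[], Optional[T]],
    timeout_ms: int = 30000,
    interval_ms: int = 100,
) -> T:
    """Call probe repeatedly until it returns a non-None value or time runs out."""
    deadline = time.monotonic() + timeout_ms / 1000
    while True:
        value = probe()
        if value is not None:
            return value
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout_ms} ms")
        time.sleep(interval_ms / 1000)


# Simulate an element that only becomes available on the third poll.
attempts = iter([None, None, {"tag": "h1", "text": "Hello"}])
element = wait_for(lambda: next(attempts), timeout_ms=1000, interval_ms=10)
```

The sketch covers only the polling half of auto-wait. Visibility and stability checks would sit inside the probe, for example by returning None until an element's bounding box stops changing between polls.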

Files:

- vibium-python/ - Complete Python client library
- notes.md - Investigation notes and protocol documentation
- README.md - This architectural guide