Visual verification CLI for AI agents. Captures browser screencasts and uses vision models to verify UI state changes.
Glimpsy connects to Chrome via DevTools, captures a sequence of screenshots over time, optionally executes JavaScript to trigger state changes, and uses Gemini to analyze what changed. It's designed for AI agents that need to verify their actions worked without human intervention.
- Node.js 20+
- Chrome running with remote debugging enabled
- Gemini API key (for model processing)
# Clone and install
git clone https://github.com/ratacat/glimpsy.git
cd glimpsy
npm install
# Build
npm run build
# Install globally (optional, enables `glimpsy` command)
npm install -g .Start Chrome with remote debugging:
# macOS
/Applications/Google\ Chrome.app/Contents/MacOS/Google\ Chrome --remote-debugging-port=9222
# Linux
google-chrome --remote-debugging-port=9222
# Windows
chrome.exe --remote-debugging-port=9222cp .env.example .env
# Edit .env and add your Gemini API keyGet a Gemini API key from: https://aistudio.google.com/apikey
# Basic capture (5s, process with Gemini, print report)
glimpsy
# Custom duration and port
glimpsy --duration 10000 --port 9333
# Trigger a click and capture the result
glimpsy --anchor "document.querySelector('#submit').click()"
# Full verification flow
glimpsy \
--duration 3000 \
--anchor "document.querySelector('form').submit()" \
--task "Verify login flow" \
--baseline "Login form with email/password fields visible" \
--expected "Dashboard with welcome message and user avatar"
# Capture only, no model processing
glimpsy --duration 2000 --skip-model --retain --dir ./captures
# JSON output for programmatic use
glimpsy --json --anchor "window.scrollTo(0, 1000)"Connection
--port <number>- DevTools port (default: 9222)
Capture
--duration <ms>- Capture duration in milliseconds (default: 5000)--interval <ms>- Time between frames in milliseconds (default: 250)--format <png|jpeg>- Image format (default: png)--quality <0-100>- JPEG quality (default: 80)--dir <path>- Output directory for frames (default: OS temp dir)
Anchor
--anchor <js>- Inline JS to execute 100ms after capture starts--anchor-file <path>- Path to JS file to execute
Verification
--task <text>- What task/behavior is being verified--baseline <text>- Expected state before anchor fires--expected <text>- Expected outcome after anchor fires
Model
--provider <name>- Model provider (default: gemini)--model <name>- Model name (default: gemini-2.5-flash)--skip-model- Skip model processing, capture frames only
Output
--json- Output structured JSON instead of human-readable text--retain- Keep frame artifacts after processing--log-level <level>- Logging verbosity: debug|info|warn|error (default: info)
Configuration
--config <path>- Path to glimpsy.config file
Human-readable (default):
Start: A login form with email and password input fields. Submit button is enabled.
Finish: A dashboard view displaying a welcome message, user avatar in top-right corner.
Changes:
- Form submitted
- Page navigated from /login to /dashboard
- User avatar loaded
- Welcome message appeared
Matched: Dashboard with welcome message and user avatar
Unmet: (none)
Confidence: 92%
JSON (--json flag):
{
"triggered_at": "2024-01-15T10:30:00.000Z",
"duration_ms": 5000,
"frame_interval_ms": 250,
"anchor_response": {
"executed": true,
"fired_at": "2024-01-15T10:30:00.100Z",
"result": null,
"error": null
},
"frames": [...],
"model_report": {
"start_overview": "...",
"finish_overview": "...",
"changes": [...],
"matched_criteria": [...],
"unmet_criteria": [],
"confidence": 0.92
}
}Create glimpsy.config (JSON):
{
"port": 9222,
"output_dir": "/tmp/glimpsy",
"retain_outputs": false,
"model": {
"provider": "gemini",
"model": "gemini-2.5-flash",
"temperature": 0.2
},
"capture": {
"duration_ms": 5000,
"frame_interval_ms": 250,
"format": "png",
"quality": 80
},
"logging": {
"level": "info"
}
}CLI flags override config file values.
MIT