Skip to content

ratacat/glimpsy

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

5 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Glimpsy

Visual verification CLI for AI agents. Captures browser screencasts and uses vision models to verify UI state changes.

What It Does

Glimpsy connects to Chrome via DevTools, captures a sequence of screenshots over time, optionally executes JavaScript to trigger state changes, and uses Gemini to analyze what changed. It's designed for AI agents that need to verify their actions worked without human intervention.

Requirements

  • Node.js 20+
  • Chrome running with remote debugging enabled
  • Gemini API key (for model processing)

Installation

# Clone and install
git clone https://github.com/ratacat/glimpsy.git
cd glimpsy
npm install

# Build
npm run build

# Install globally (optional, enables `glimpsy` command)
npm install -g .

Chrome Setup

Start Chrome with remote debugging:

# macOS
/Applications/Google\ Chrome.app/Contents/MacOS/Google\ Chrome --remote-debugging-port=9222

# Linux
google-chrome --remote-debugging-port=9222

# Windows
chrome.exe --remote-debugging-port=9222

Environment Setup

cp .env.example .env
# Edit .env and add your Gemini API key

Get a Gemini API key from: https://aistudio.google.com/apikey

Usage

# Basic capture (5s, process with Gemini, print report)
glimpsy

# Custom duration and port
glimpsy --duration 10000 --port 9333

# Trigger a click and capture the result
glimpsy --anchor "document.querySelector('#submit').click()"

# Full verification flow
glimpsy \
  --duration 3000 \
  --anchor "document.querySelector('form').submit()" \
  --task "Verify login flow" \
  --baseline "Login form with email/password fields visible" \
  --expected "Dashboard with welcome message and user avatar"

# Capture only, no model processing
glimpsy --duration 2000 --skip-model --retain --dir ./captures

# JSON output for programmatic use
glimpsy --json --anchor "window.scrollTo(0, 1000)"

CLI Options

Connection

  • --port <number> - DevTools port (default: 9222)

Capture

  • --duration <ms> - Capture duration in milliseconds (default: 5000)
  • --interval <ms> - Time between frames in milliseconds (default: 250)
  • --format <png|jpeg> - Image format (default: png)
  • --quality <0-100> - JPEG quality (default: 80)
  • --dir <path> - Output directory for frames (default: OS temp dir)

Anchor

  • --anchor <js> - Inline JS to execute 100ms after capture starts
  • --anchor-file <path> - Path to JS file to execute

Verification

  • --task <text> - What task/behavior is being verified
  • --baseline <text> - Expected state before anchor fires
  • --expected <text> - Expected outcome after anchor fires

Model

  • --provider <name> - Model provider (default: gemini)
  • --model <name> - Model name (default: gemini-2.5-flash)
  • --skip-model - Skip model processing, capture frames only

Output

  • --json - Output structured JSON instead of human-readable text
  • --retain - Keep frame artifacts after processing
  • --log-level <level> - Logging verbosity: debug|info|warn|error (default: info)

Configuration

  • --config <path> - Path to glimpsy.config file

Output

Human-readable (default):

Start: A login form with email and password input fields. Submit button is enabled.
Finish: A dashboard view displaying a welcome message, user avatar in top-right corner.

Changes:
  - Form submitted
  - Page navigated from /login to /dashboard
  - User avatar loaded
  - Welcome message appeared

Matched: Dashboard with welcome message and user avatar
Unmet: (none)
Confidence: 92%

JSON (--json flag):

{
  "triggered_at": "2024-01-15T10:30:00.000Z",
  "duration_ms": 5000,
  "frame_interval_ms": 250,
  "anchor_response": {
    "executed": true,
    "fired_at": "2024-01-15T10:30:00.100Z",
    "result": null,
    "error": null
  },
  "frames": [...],
  "model_report": {
    "start_overview": "...",
    "finish_overview": "...",
    "changes": [...],
    "matched_criteria": [...],
    "unmet_criteria": [],
    "confidence": 0.92
  }
}

Configuration File

Create glimpsy.config (JSON):

{
  "port": 9222,
  "output_dir": "/tmp/glimpsy",
  "retain_outputs": false,
  "model": {
    "provider": "gemini",
    "model": "gemini-2.5-flash",
    "temperature": 0.2
  },
  "capture": {
    "duration_ms": 5000,
    "frame_interval_ms": 250,
    "format": "png",
    "quality": 80
  },
  "logging": {
    "level": "info"
  }
}

CLI flags override config file values.

License

MIT

About

A lightning fast AI 'Eyes On Browser' tool for extending your development agents context horizon.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors