BotProbe

AI agent for evaluating Physical AI systems.

Describe the behavior you expect from your physical AI system in plain English. BotProbe speaks to it, listens to its response, and returns a PASS or FAIL verdict with observations — no manual testing required.


What it does

BotProbe is a behavioral diagnostic agent designed for physical AI robots. You give it a natural language specification, and it runs an autonomous observe-evaluate loop:

  1. Speaks a prompt to the bot via the laptop speaker
  2. Listens to the bot's response via the laptop microphone
  3. Analyzes the audio against your specification using Gemini
  4. Returns a PASS/FAIL verdict with a description of what it actually observed

The agent decides how many observations are needed before issuing a verdict — simple tests complete in one round; complex behavioral tests may require more.
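The four-step loop above can be sketched in a few lines of Python. This is a minimal illustration with stubbed tools, not BotProbe's actual implementation: `speak_to_bot`, `record_audio`, and `evaluate` are hypothetical names, and the real decision step is a Gemini call rather than a length check.

```python
# Minimal sketch of the observe-evaluate loop. All functions are
# illustrative stubs, not BotProbe's actual API.

def speak_to_bot(prompt: str) -> None:
    """Stub: in BotProbe this plays `prompt` through the laptop speaker."""
    print(f"[speaker] {prompt}")

def record_audio(seconds: int) -> bytes:
    """Stub: in BotProbe this records the laptop microphone."""
    return b"\x00" * seconds  # placeholder audio

def evaluate(spec: str, observations: list[bytes]) -> dict:
    """Stub for the model call: decide whether to observe again or issue
    a verdict. The real agent sends the recorded audio to Gemini."""
    if len(observations) < 2:
        return {"action": "observe"}
    return {"action": "verdict", "result": "PASS",
            "notes": f"{len(observations)} observations matched the spec"}

def run_diagnosis(spec: str, max_rounds: int = 5) -> dict:
    observations: list[bytes] = []
    for _ in range(max_rounds):
        decision = evaluate(spec, observations)
        if decision["action"] == "verdict":
            return decision
        speak_to_bot(spec)                    # 1. speak a prompt to the bot
        observations.append(record_audio(3))  # 2-3. listen and keep the audio
    return {"action": "verdict", "result": "FAIL", "notes": "no verdict reached"}

print(run_diagnosis("Bot should reply when greeted"))
```

The key design point the sketch preserves: the model, not the harness, decides when enough has been observed, so simple specs exit after one round and harder ones loop.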


Use cases

  • Responsiveness — Does the bot respond when spoken to?
  • Interruption handling — When a user interrupts mid-sentence, does the bot stop within 1 second?
  • Audio quality — Is the bot's speech clear and audible?
  • Command recognition — Does the bot correctly execute a given voice command?
  • Conversation flow — Does the bot handle multi-turn exchanges correctly?
  • Edge case robustness — How does the bot behave under background noise or simultaneous speech?

Architecture

Browser
  |
  | HTTP (UI assets)
  v
Next.js  :3000   (Frontend Server)
  |
  | HTTP proxy  /api/*
  v
FastAPI  :8000   (Core Backend)
  |
  | google.genai SDK
  v
Gemini 2.5 Flash

The Next.js server proxies all /api/* requests to the Python backend via next.config.ts rewrites. The browser only ever talks to one origin.

How the agent loop works

User submits spec
      |
      v
Gemini decides: call speakToBot, recordAudio, or issue verdict
      |
      +-- speakToBot  -->  browser plays text via Web Speech API  -->  bot hears it
      |
      +-- recordAudio -->  browser records mic for N seconds       -->  captures bot's response
      |                    (optional: plays interrupt stimulus mid-recording)
      |
      v
Audio returned to Gemini as inline data (base64 WebM)
      |
      v
Gemini analyzes audio against spec  -->  PASS / FAIL + observations

Client-side tools (speakToBot, recordAudio) run in the browser. The agent loop is driven by useChat with sendAutomaticallyWhen: lastAssistantMessageIsCompleteWithToolCalls — no polling, no manual retries.
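The inline-audio handoff in the middle of the diagram can be illustrated in isolation: the browser base64-encodes the recorded WebM blob, and the backend decodes it back into raw bytes before forwarding it to the model. This is a pure-Python sketch; the field names and dict shape are illustrative, and in BotProbe the encoding half actually runs in the browser.

```python
import base64

# Sketch of the inline-audio handoff: base64 WebM from the browser,
# raw bytes recovered on the server for the Gemini call.

def encode_recording(webm_bytes: bytes) -> dict:
    """Illustrative shape of what the recordAudio tool could return."""
    return {
        "mimeType": "audio/webm",
        "data": base64.b64encode(webm_bytes).decode("ascii"),
    }

def decode_recording(part: dict) -> bytes:
    """Server side: recover the raw audio bytes for the model call."""
    assert part["mimeType"] == "audio/webm"
    return base64.b64decode(part["data"])

audio = b"\x1aE\xdf\xa3 fake webm payload"  # placeholder, not real WebM
part = encode_recording(audio)
assert decode_recording(part) == audio      # the round trip is lossless
```

Base64 keeps the audio safe to embed in a JSON message at the cost of roughly 33% size overhead, which is why it is sent inline rather than as a separate upload.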


Tech stack

Layer               Technology
Frontend framework  Next.js 16 (App Router)
Frontend language   TypeScript
Styling             Tailwind CSS 4
Agent SDK           Vercel AI SDK v6 (ai, @ai-sdk/react)
Backend framework   FastAPI (Python)
LLM                 Gemini 2.5 Flash via google.genai
Audio               Web MediaRecorder API + Web Speech API

Setup

Frontend

npm install

Backend

cd server
pip install -r requirements.txt

Create .env in server/:

GOOGLE_API_KEY=your_key_here

Running

Start the backend:

cd server
uvicorn main:app --port 8000 --reload

Start the frontend:

npm run dev

Open http://localhost:3000.

Hardware requirement: A physical bot that listens via microphone and responds via speaker — the laptop's own mic and speaker are used for bidirectional audio.


Project structure

botprobe/
  app/                      # Next.js App Router (frontend only)
  components/               # React components
  tools/                    # Client-side tool implementations (browser audio)
  lib/                      # Shared frontend utilities
  server/
    main.py                 # FastAPI app — POST /api/diagnose
    message_converter.py    # UIMessage[] -> Gemini Content[] conversion
    tools.py                # Pydantic tool schemas + Gemini FunctionDeclaration
    stream_writer.py        # AI SDK UI Message Stream Protocol helpers
    system_prompt.md        # Agent system prompt
    requirements.txt
  next.config.ts            # Proxies /api/* to FastAPI
