michi883/agentic-sandbox

Agentic Sandbox

Multiple strategy agents explore competing strategies using web signals, then the Sandbox recommends what to test next.

Given a goal, several strategy agents compete using different approaches. Each agent consumes web signals, weighs tradeoffs, predicts likely outcomes, and proposes a next experiment. The Sandbox scores the agents, ranks them, and recommends the strongest path — before you launch in public.

Canonical demo

  • Product: TileShift — a minimalist browser puzzle where shifting tiles reveal a hidden pattern.
  • Goal: Get the first 10 users.
  • Competing agents: Gameplay-first · Technical-first · Founder-story.
  • Signals: 13 mock web signals across 5 channels — Reddit, Hacker News, Product Hunt, X, LinkedIn.

The agents run over a shared set of web signals. The scoring engine rates each agent on six deterministic criteria: early-user potential, audience fit, clarity, friction, learning value, and risk. The ratings are combined in a tuned weighted sum in which early-user pull dominates. The tournament then ranks the agents and produces a recommendation, a next experiment, and a draft message.
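As a rough illustration of the weighted sum described above, here is a minimal sketch. The weight values, the 0-10 rating scale, and the choice to invert friction and risk are all assumptions made for this example; the tuned values live in src/scorer.js and are not shown in this README.

```javascript
// Hypothetical weights: the README only states that early-user pull dominates.
const WEIGHTS = {
  earlyUserPotential: 0.30,
  audienceFit: 0.20,
  clarity: 0.15,
  friction: 0.15,
  learningValue: 0.10,
  risk: 0.10,
};

// Assumption: for these criteria a higher raw rating is worse, so invert them.
const LOWER_IS_BETTER = new Set(['friction', 'risk']);

// Combine six ratings (assumed to be on a 0-10 scale) into one score.
function scoreAgent(ratings) {
  let total = 0;
  for (const [criterion, weight] of Object.entries(WEIGHTS)) {
    const raw = ratings[criterion];
    if (typeof raw !== 'number' || raw < 0 || raw > 10) {
      throw new RangeError(`rating for ${criterion} must be a number in [0, 10]`);
    }
    const value = LOWER_IS_BETTER.has(criterion) ? 10 - raw : raw;
    total += weight * value;
  }
  // Round to two decimals so scores compare cleanly.
  return Math.round(total * 100) / 100;
}

const example = scoreAgent({
  earlyUserPotential: 8,
  audienceFit: 7,
  clarity: 6,
  friction: 3,
  learningValue: 5,
  risk: 4,
});
console.log(example); // prints 6.85
```

Because the function has no randomness and no external state, the same ratings always produce the same score, which is what "deterministic" means in this context.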

It runs as a live, uncertain tournament — not a static readout

Evidence arrives round by round. Hacker News signals land first, where novelty rewards the Technical-first agent early. The playable and friction signals land later, so the early leader is overtaken and the winner stays undecided until the end. Each agent then adapts to its biggest objection, recovering risk and lifting its weakest criterion, and the final ranking is decided on that adapted score. The engine emits the whole evolution (per-round score trajectories, a signal feed showing which agents each signal helps or hurts, and the strategy updates each signal triggered), so the UI can replay the run as it actually unfolded.
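The round-by-round flow can be sketched as a loop that applies each round's signal impacts, records a trajectory snapshot, and then ranks on the adapted scores. Everything here is hypothetical: the agent ids, impact numbers, adaptation bonuses, and the `runTournament` signature are invented for illustration and do not reflect src/tournament.js or the real TileShift result.

```javascript
// Hypothetical rounds: impact numbers are made up for this sketch.
const rounds = [
  { channel: 'Hacker News', impacts: { technical: 2, gameplay: 0, founder: 1 } },
  { channel: 'Reddit', impacts: { technical: -1, gameplay: 2, founder: 0 } },
  { channel: 'Product Hunt', impacts: { technical: 0, gameplay: 1, founder: 1 } },
];

function runTournament(agentIds, rounds, adaptationBonus) {
  const scores = Object.fromEntries(agentIds.map((id) => [id, 0]));
  const trajectory = [];

  for (const round of rounds) {
    // Apply this round's signals to every agent.
    for (const id of agentIds) scores[id] += round.impacts[id] ?? 0;
    // Record a snapshot so a UI could replay the run later.
    const leader = agentIds.reduce((a, b) => (scores[b] > scores[a] ? b : a));
    trajectory.push({ channel: round.channel, scores: { ...scores }, leader });
  }

  // Adaptation step: each agent recovers some score after addressing an objection.
  const adapted = Object.fromEntries(
    agentIds.map((id) => [id, scores[id] + (adaptationBonus[id] ?? 0)]),
  );
  // The final ranking uses the adapted scores, highest first.
  const ranking = [...agentIds].sort((a, b) => adapted[b] - adapted[a]);
  return { trajectory, adapted, ranking, winner: ranking[0] };
}

const result = runTournament(
  ['technical', 'gameplay', 'founder'],
  rounds,
  { technical: 1, gameplay: 0.5, founder: 1 },
);
console.log(result.trajectory.map((t) => `${t.channel}: ${t.leader}`));
console.log(result.ranking);
```

With these made-up numbers the technical agent leads after the first round and is overtaken in the second, which mirrors the lead change described above without claiming anything about the actual demo outcome.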

Architecture

mock web signals → strategy agents → scoring engine → tournament → recommendation UI

| Layer | File | Role |
| --- | --- | --- |
| Signal provider | src/signalProvider.js | The Nimble swap point. Returns mock signals today. |
| Strategy agents | src/agents.js | Deterministic agents: select supporting evidence, weigh support & risk, predict outcome, carry a draft. |
| Scoring engine | src/scorer.js | Six-criteria deterministic scoring, combined as a tuned weighted sum. |
| Tournament | src/tournament.js | Runs all agents over each round, scores trajectories, applies adaptation, ranks, picks the winner, and builds the recommendation + next experiment. |
| API | server.js | Zero-dependency Node service (POST /api/tournament) for Cloud Run. |
| UI | public/ | Live flow: Setup → Tournament (signals arrive, agents adapt, leaderboard shifts) → Recommendation. |
| Data | data/mockSignals.json, data/strategies.json | Mock live-web evidence + agent definitions. |

Run it

# CLI — prints the full tournament for TileShift (no browser needed)
npm run tournament

# Full app (API + UI) on http://localhost:8080
npm start          # alias for: node server.js

The server exposes GET /healthz, POST /api/tournament (body { product, goal, agents? }), GET /api/tournament (defaults, handy for quick checks), and serves the static UI from public/. It has zero runtime dependencies.
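A client call to POST /api/tournament might look like the sketch below. The helper name `buildTournamentRequest` is hypothetical, and treating `product` and `goal` as required plain strings is an assumption; the README only gives the body as { product, goal, agents? } and does not document the response shape.

```javascript
// Hypothetical helper: builds the arguments for fetch() without sending anything.
function buildTournamentRequest(baseUrl, { product, goal, agents }) {
  // Assumption: product and goal are required; agents is optional.
  if (!product || !goal) throw new TypeError('product and goal are required');
  const body = { product, goal };
  if (agents !== undefined) body.agents = agents;
  return {
    url: new URL('/api/tournament', baseUrl).toString(),
    options: {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify(body),
    },
  };
}

const request = buildTournamentRequest('http://localhost:8080', {
  product: 'TileShift',
  goal: 'Get the first 10 users',
});
console.log(request.url);

// With the server running (npm start), Node 18+ can send it with the built-in fetch:
//   const response = await fetch(request.url, request.options);
//   console.log(await response.json());
```

Keeping request construction separate from the network call makes the payload easy to inspect before the server is up.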

Deploy (Firebase Hosting + Cloud Run): the container (Dockerfile) runs the engine on Cloud Run; firebase.json rewrites /api/** to the service and serves public/ as the static UI.

Signal source — honest status

The current prototype uses mocked web signals to validate the multi-agent strategy engine. The signal provider is designed to be swapped for Nimble Search, Extract, Crawl, or Web Search Agents to supply real-time web intelligence.

Only src/signalProvider.js knows where signals come from. To go live, implement getSignals(context) against Nimble and return the same signal shape — the agents, scorer, and tournament need no changes.
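One way to structure that swap point is sketched below. Only the name getSignals(context) comes from this README. The signal fields (`channel`, `text`, `sentiment`), the placeholder data, and the `createSignalProvider` and `normalizeSignal` helpers are assumptions for illustration; the real shape is defined by data/mockSignals.json, and no Nimble API call is shown because its interface is not described here.

```javascript
// Placeholder data with an assumed shape; not the contents of data/mockSignals.json.
const MOCK_SIGNALS = [
  { channel: 'Hacker News', text: 'placeholder signal A', sentiment: 0.5 },
  { channel: 'Reddit', text: 'placeholder signal B', sentiment: -0.25 },
];

// Coerce any raw record into the assumed signal shape so downstream code
// (agents, scorer, tournament) always sees the same fields.
function normalizeSignal(raw) {
  return {
    channel: String(raw.channel ?? 'unknown'),
    text: String(raw.text ?? ''),
    sentiment: Number(raw.sentiment ?? 0),
  };
}

// Hypothetical factory: pass a live fetcher to go live, or nothing to stay on mocks.
function createSignalProvider(fetchLive) {
  return async function getSignals(context) {
    if (!fetchLive) return MOCK_SIGNALS.map(normalizeSignal);
    // fetchLive stands in for a Nimble-backed function written elsewhere.
    const raw = await fetchLive(context);
    return raw.map(normalizeSignal);
  };
}

// Usage: the default provider returns mocks; a stub stands in for a live source.
const getSignals = createSignalProvider();
getSignals({ product: 'TileShift', goal: 'Get the first 10 users' })
  .then((signals) => console.log(signals.length));
```

Normalizing at the provider boundary is the design choice that lets the rest of the pipeline stay unchanged when the source changes.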
