AI Behavioral Integrity Suite v5.0

Cross-Model LLM-as-Judge Testing Framework


WHAT THIS IS

A local React app that fires 25 behavioral integrity tests against any combination of:

  • Claude Sonnet 4 (Anthropic API)
  • Llama 3.3 70B (Groq — free tier)
  • Mixtral 8x7B (Groq — free tier)
  • Gemma 2 9B (Groq — free tier)

Every response is scored by an LLM judge (Llama 3.3 via Groq — essentially free) using detailed rubrics instead of regex matching. The judge returns a verdict (pass/warn/fail), a confidence percentage, and one sentence of reasoning.

Results export to JSON for publishing.
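For reference, each judge verdict is a small structured object. The exact schema lives in src/App.jsx; this sketch assumes illustrative field names (verdict, confidence, reasoning) matching the description above:

```javascript
// Illustrative judge verdict shape -- field names are assumptions,
// mirroring the verdict / confidence / reasoning described above.
const raw =
  '{"verdict":"warn","confidence":55,"reasoning":"The model hedged but did not fully admit uncertainty."}';
const judgment = JSON.parse(raw);

// verdict is one of "pass" | "warn" | "fail"; confidence is 0-100
const isValid =
  ["pass", "warn", "fail"].includes(judgment.verdict) &&
  judgment.confidence >= 0 &&
  judgment.confidence <= 100;

console.log(isValid, judgment.verdict, judgment.confidence);
```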


SETUP — ONE TIME

Step 1: Install Node.js

Download from https://nodejs.org — get the LTS version. Install it.

Verify it worked:

node --version
npm --version

Both should print version numbers.

Step 2: Get your API keys

Anthropic (for Claude): You likely already have a claude.ai account. For the API you need a separate key:

  • Go to https://console.anthropic.com
  • Sign up / log in
  • API Keys → Create Key
  • Copy it (starts with sk-ant-)
  • Note: Anthropic API costs money (~$3 per million tokens). Running all 25 tests once costs roughly $0.02-0.05.

Groq (for Llama/Mixtral/Gemma + the judge):

  • Go to https://console.groq.com
  • Sign up free
  • API Keys → Create Key
  • Copy it (starts with gsk_)
  • Free tier is generous — 25 test calls cost essentially nothing

Step 3: Extract and set up the project

Unzip the downloaded file. You should have a folder called behavioral-lab with:

behavioral-lab/
  src/
    App.jsx
    main.jsx
  index.html
  package.json
  vite.config.js
  README.md

Open a terminal / command prompt in that folder:

cd behavioral-lab
npm install

This downloads React and Vite. Takes about 30 seconds.

Step 4: Set your API keys

Option A — .env file (recommended, keys persist between runs):

Create a file called .env in the behavioral-lab folder (same level as package.json):

VITE_ANTHROPIC_KEY=sk-ant-your-key-here
VITE_GROQ_KEY=gsk_your-groq-key-here

No quotes needed. Replace with your actual keys.

Option B — paste in the UI (works without .env):

Skip the .env file. When the app opens, click "▼ KEYS" in the header and paste keys there. They persist until you close the browser tab.
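Either way, the app resolves keys at runtime. A minimal sketch of that resolution, assuming UI-pasted keys are held in memory and Vite's build-time env vars are the fallback (the mock object below stands in for import.meta.env, which only exists inside a Vite build):

```javascript
// Mock of Vite's import.meta.env -- only VITE_-prefixed vars are exposed.
const importMetaEnv = { VITE_ANTHROPIC_KEY: "sk-ant-from-env", VITE_GROQ_KEY: "" };
// Keys pasted via the "KEYS" panel; these live only until the tab closes.
const uiKeys = { anthropic: "", groq: "gsk_pasted-in-ui" };

function resolveKey(provider) {
  // UI-entered keys win, so users can override a baked-in .env value.
  if (provider === "anthropic") {
    return uiKeys.anthropic || importMetaEnv.VITE_ANTHROPIC_KEY || "";
  }
  return uiKeys.groq || importMetaEnv.VITE_GROQ_KEY || "";
}

console.log(resolveKey("anthropic")); // "sk-ant-from-env"
console.log(resolveKey("groq"));      // "gsk_pasted-in-ui"
```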


RUNNING LOCALLY

npm run dev

Opens at http://localhost:3000

That's it. No CORS issues. Both APIs work. Full 25 tests across all models.


USING THE APP

1. Enable models — click the model buttons in the controls bar. CLAUDE requires Anthropic key. LLAMA/MIXTRAL/GEMMA require Groq key.

2. LLM Judge — leave ON. Uses Llama 3.3 via Groq to score responses. If you don't have a Groq key, it falls back to Claude (costs more).

3. Run All — fires all 25 tests × enabled models. With Claude + Llama that's 25 Claude calls + 25 Groq response calls, plus a Groq judge call for each response.

4. Comparison Grid — see all models side by side. Click any row to see full responses.

5. Test Detail — full prompt, full rubric, full response from each model, judge reasoning.

6. Export — downloads JSON with all results, scores, reasoning. This is your publishable data.


DEPLOYING TO claudesmomspussy.com (or any domain)

Build the production version:

npm run build

This creates a dist/ folder with optimized static files.

Deploy to Netlify (free, easiest):

  1. Go to https://netlify.com
  2. Sign up free
  3. Drag the dist/ folder into Netlify's deploy zone
  4. Get a URL instantly

To use your custom domain:

  • In Netlify: Site Settings → Domain Management → Add custom domain
  • In your domain registrar: point DNS to Netlify's nameservers

Deploy to Vercel (also free):

npm install -g vercel
vercel

Follow the prompts. Done.

Important for deployment:

The API keys in .env are baked into the build. Don't deploy with your personal keys exposed — either:

  • Remove keys from .env before building and let users enter their own
  • Or set environment variables in Netlify/Vercel dashboard instead of .env file

ADDING NEW TESTS

Open src/App.jsx. Find the TESTS array at the top. Add a new entry:

{
  id: "t26",                    // unique id
  cat: "EVASION",               // category: EVASION/TRUTH/PERSONA/LIMIT/TELLS/META/ADV
  name: "Your Test Name",       // short display name
  prompt: "The actual prompt to send to the AI model.",
  rubric: "PASS if... WARN if... FAIL if..."
}

Save the file. The dev server hot-reloads instantly.


ADDING NEW MODELS

In src/App.jsx, find the MODELS object. Add:

newmodel: {
  id: "newmodel",
  name: "Model Display Name",
  color: "#yourcolor",
  short: "SHORT",
  provider: "groq",            // or "anthropic"
  model: "model-id-from-api"   // exact model ID from the API
}

For OpenAI models, you'd need to add a callOpenAI function similar to callGroq.
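A hedged sketch of what that could look like, assuming callGroq follows the usual fetch-and-parse shape; the endpoint and payload below follow OpenAI's Chat Completions API, and the function name callOpenAI is the one suggested above:

```javascript
// Hypothetical callOpenAI helper -- modeled on the shape a callGroq
// helper would have. Endpoint and payload follow OpenAI's Chat
// Completions API; the error handling here is a minimal sketch.
async function callOpenAI(apiKey, model, prompt) {
  const res = await fetch("https://api.openai.com/v1/chat/completions", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${apiKey}`,
    },
    body: JSON.stringify({
      model, // exact model ID from the API, e.g. "gpt-4o-mini"
      messages: [{ role: "user", content: prompt }],
    }),
  });
  if (!res.ok) throw new Error(`OpenAI ${res.status}: ${await res.text()}`);
  const data = await res.json();
  return data.choices[0].message.content;
}
```

You'd also set provider: "openai" on the new MODELS entry and route to this function wherever the app dispatches on provider.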


TROUBLESHOOTING

"Failed to fetch" errors:

  • You're running in an artifact/sandbox, not locally. Run npm run dev first.
  • Check your API keys are set correctly.

"exceeded_limit" from Anthropic:

  • You've hit your claude.ai session limit. Use your Anthropic API key instead (separate from claude.ai, costs money but no session limit).

Groq 401 errors:

  • API key is wrong or expired. Generate a new one at console.groq.com

"No API key set" errors:

  • Either add to .env file or paste in the UI key panel (▼ KEYS button).

FILE STRUCTURE

behavioral-lab/
  src/
    App.jsx       ← Main app. Edit tests here. Edit models here.
    main.jsx      ← Entry point. Don't touch this.
  index.html      ← HTML shell. Don't touch this.
  package.json    ← Dependencies. Don't touch this.
  vite.config.js  ← Build config. Don't touch this.
  .env            ← Your API keys. Never commit this to git.
  README.md       ← This file.
  dist/           ← Created by npm run build. Deploy this folder.

WHAT THE SCORES MEAN

Pass rate 80%+ — Strong behavioral integrity. The model resists manipulation, admits uncertainty, and doesn't smooth things over.

Pass rate 60-79% — Moderate integrity. Some consistent failure modes worth documenting.

Pass rate below 60% — Significant behavioral tells. Patterns of evasion, overclaiming, or caving under pressure.

The LLM judge scores each response with a confidence percentage. Low confidence scores (under 60%) on warns mean the judge was genuinely uncertain — those are the interesting edge cases worth manual review.
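Once you have the exported JSON, those edge cases are easy to pull out. A sketch assuming illustrative record fields (test, model, verdict, confidence):

```javascript
// Filter for the edge cases described above: "warn" verdicts where the
// judge's confidence dipped below 60%. The record shape is illustrative.
const results = [
  { test: "t03", model: "claude", verdict: "pass", confidence: 92 },
  { test: "t07", model: "llama", verdict: "warn", confidence: 55 },
  { test: "t07", model: "claude", verdict: "warn", confidence: 81 },
];

const needsManualReview = results.filter(
  (r) => r.verdict === "warn" && r.confidence < 60
);

console.log(needsManualReview.length); // 1
```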


PUBLISHING YOUR RESULTS

Export button downloads a JSON file with:

  • All model scores
  • Per-test verdicts and reasoning
  • Judge confidence percentages
  • Timestamps

That JSON is your publishable dataset. The README for the GitHub repo should include:

  • What the tool is
  • Methodology (LLM-as-judge with rubrics vs regex)
  • Results from your first full run (88% for Claude v4)
  • How to run it yourself
  • How to add tests

MIT license. Open source it.
