PromptShield

Shift-left security for AI agents — catch it in the PR, not in the postmortem.

PromptShield is a proactive shift-left security scanner for LLM apps and agents. It flags prompt injection and other agent-specific risks before they ship by using static rules, AST-based taint/dataflow analysis, and Gemini-assisted review when enabled. Results are mapped to CWE and OWASP LLM Top 10, with evidence, remediation guidance, and confidence scores.

Capabilities

Area	What you get
Detection	Static rules + AST taint/dataflow to dangerous sinks; Gemini review when enabled; similarity matches to known jailbreak patterns
Risk	Single 0–100 score from merged findings; per-finding severity and category
PR workflow	GitHub App: scan changed files, inline comments, check runs, configurable risk gate
Data	MongoDB Atlas + Search/Vector Search: scans, corpora, similarity matches, benchmarks, tool registry, alerts
Governance	Policy-as-code (`.promptshield.yml`), suppressions, CSV/PDF exports, audit-oriented dashboards
Ops	Per-IP rate limits, structured JSON logs with `request_id`, role-based auth (admin / PM / viewer)

Stack

Layer	Technology
API	FastAPI (`backend/`), Uvicorn
Database	MongoDB Atlas (Motor) with Atlas Search/Vector Search; mongomock when `MONGODB_URI` is unset (local/CI)
AI	Google Vertex AI + Gemini via `google-genai`; auth via service account JSON (`GOOGLE_APPLICATION_CREDENTIALS`)
Embeddings	MongoDB Voyage AI or local sentence-transformers (configurable)
Frontend	React 18, Vite, Tailwind
Deploy	Root `requirements.txt` matches `backend/requirements.txt` for Railway/Nixpacks-style builds; `nixpacks.toml` starts the API from `backend/`

MongoDB Atlas

MongoDB Atlas is the backbone of PromptShield’s data layer. We use it as more than a database: it powers Atlas Search (keyword + semantic), vector similarity for “similar jailbreak” matches, and hybrid retrieval/ranking across scans and corpora. Atlas also supports change streams for live updates, analytics-friendly collections, and agent-focused datasets (tool registry, exploit corpus, alerts).

To enable these features, apply the Search/Vector index definitions under backend/scripts/atlas_indexes/ (via Atlas UI or mongosh, as noted in the scripts).

Minimal setup: create a cluster → database user → network access → set MONGODB_URI and PRIMARY_STORE=mongo in backend/.env → run:

python backend/scripts/setup_atlas_indexes.py

If you want database-native automation, install Atlas Triggers in backend/scripts/atlas_triggers/ (e.g. redaction on insert, critical agent alerts). Where triggers aren’t installed, Python-side fallbacks still run when implemented.

Run locally

Backend (port 8000)

cd backend
python -m venv .venv && source .venv/bin/activate   # Windows: .venv\Scripts\activate
pip install -r requirements.txt
cp .env.example .env   # edit as needed; see below
uvicorn main:app --reload --port 8000

Vertex AI (Gemini): Request a service account JSON key from your project admin (Vertex AI User or broader roles as your org allows). Save it as e.g. backend/google-credentials.json and point GOOGLE_APPLICATION_CREDENTIALS at that path in backend/.env (see .env.example). Do not commit the JSON file—it is gitignored. With GOOGLE_CLOUD_PROJECT set and GOOGLE_GENAI_USE_VERTEXAI=true, the app uses Vertex only (no Gemini API key env vars).

Without GOOGLE_CLOUD_PROJECT, the Gemini layer is off. PROMPTSHIELD_SCAN_MODE=fast skips Gemini and vector enrichment on /api/scan even when a project is set (see .env.example).
Without MONGODB_URI, storage uses mongomock for development.

Auth: There is no public signup. If BOOTSTRAP_ADMIN_EMAIL and BOOTSTRAP_ADMIN_PASSWORD are set and the users collection is empty, an admin is created on startup. Use the app’s sign-in page or POST /api/auth/login.

cd backend && pytest

Frontend (port 5173)

cd frontend
npm install
npm run dev

Vite proxies /api and WebSocket paths to http://localhost:8000. Open http://localhost:5173.

GitHub App (PR scanning)

Create a GitHub App at github.com/settings/apps/new: webhook https://<your-host>/api/github/webhook, permissions for pull requests and checks (read/write as required), subscribe to pull request events.
Install the app on repositories and add the private key.
Set in backend/.env (see .env.example): GITHUB_APP_ID, GITHUB_APP_PRIVATE_KEY or GITHUB_APP_PRIVATE_KEY_PATH, GITHUB_WEBHOOK_SECRET, plus DASHBOARD_BASE_URL and optional RISK_GATE_THRESHOLD.

For local webhooks, expose port 8000 (e.g. backend/scripts/ngrok_http_8000.sh) and point the App’s webhook URL at /api/github/webhook.

Confirm GET /api/health reports GitHub App configuration when env is correct.

Configuration

All important variables are documented in backend/.env.example. Typical groups:

Vertex / Gemini: GOOGLE_CLOUD_PROJECT, GOOGLE_APPLICATION_CREDENTIALS (path to service account JSON), GOOGLE_CLOUD_LOCATION, GEMINI_MODEL, GOOGLE_GENAI_USE_VERTEXAI=true
MongoDB: MONGODB_URI, MONGODB_DB, PRIMARY_STORE
Embeddings: EMBEDDING_PROVIDER, VOYAGE_API_KEY (if using Voyage), dimensions aligned with Atlas indexes
HTTP: ALLOWED_ORIGINS, SCAN_RATE_LIMIT, SCAN_RATE_WINDOW, MAX_INPUT_CHARS, REDACT_PERSISTED_INPUT, LOG_LEVEL

API overview

Interactive docs: http://localhost:8000/docs (OpenAPI/Swagger) when the server is running.

Rough map:

/api/scan — main scan (static + dataflow + AI/vector path when enabled)
/api/scans, /api/dashboard/*, /api/reports/* — history, dashboards, exports
/api/v2/* — Atlas-backed search, similar scans, hybrid search, benchmarks, live scans (/api/live/scans), agent tools/alerts/timeline
/api/github/* — webhook and sync
/api/auth/*, /api/policy/*, /api/suppressions/* — sessions, policy validation, suppressions

Responses include an x-request-id header aligned with structured logs.

Detection and agentic findings

Static layer covers categories such as direct injection, secrets in prompts, system prompt exposure, role confusion, over-permissive behavior, data leakage, and indirect injection—each tied to CWE and OWASP LLM labels where applicable.

Agent tools: Functions exposed to the LLM (e.g. @tool, LangChain-style tool lists) are analyzed as privileged entry points; arguments are traced into sinks such as subprocess, eval, SQL execution, or unsafe HTML. Representative finding types include dangerous capabilities, unvalidated arguments, excessive scope, and taint from tool parameters or LLM output into execution, shell, SQL, or XSS-prone sinks.

Examples: GET /api/examples lists demo files (e.g. vulnerable agent, unsafe output, unsafe RAG) you can load in the UI or inspect under backend/examples/.

Benchmarks

backend/three_layer_benchmark.py evaluates labeled prompts in backend/prompts.json against static rules, the optional ML classifier artifact, and the live POST /api/scan API. Results can be recorded via the benchmark API and Mongo collections when Atlas is configured. Run with the API up on port 8000 as documented in the script.

Positioning

PromptShield is meant to shift security left on LLM and agent features: catch risky changes in review before merge. It complements runtime guardrails (e.g. in-production monitoring) by reducing what reaches production. The project is self-hostable and intended for internal SDLC workflows.

Name		Name	Last commit message	Last commit date
Latest commit History 52 Commits
backend		backend
frontend		frontend
.gitignore		.gitignore
README.md		README.md
nixpacks.toml		nixpacks.toml
requirements.txt		requirements.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

PromptShield

Capabilities

Stack

MongoDB Atlas

Run locally

Backend (port 8000)

Frontend (port 5173)

GitHub App (PR scanning)

Configuration

API overview

Detection and agentic findings

Benchmarks

Positioning

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

PromptShield

Capabilities

Stack

MongoDB Atlas

Run locally

Backend (port 8000)

Frontend (port 5173)

GitHub App (PR scanning)

Configuration

API overview

Detection and agentic findings

Benchmarks

Positioning

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages