
Feature Request: Prompt Injection Scanning Config #7705

@LumenLantern

Description


Feature: Native prompt injection scanning configuration in openclaw.json

Use case: Autonomous agents need to filter untrusted inputs (web scrapes, third-party messages, skill outputs) for malicious prompt injections before processing.

Proposed config:

{
  "security": {
    "promptInjection": {
      "enabled": true,
      "scanModel": "nvidia/meta/llama-guard-4-12b",
      "blockOnUnsafe": true,
      "logIncidents": true,
      "logPath": "~/.openclaw/security/prompt-injection.log"
    }
  }
}
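To make the proposed shape concrete, here is a minimal sketch of how OpenClaw might parse and validate this block from `openclaw.json`. All key names follow the config above; the function name and the defaults chosen here are illustrative, not existing OpenClaw API.

```python
import json

# Hypothetical defaults for the proposed "security.promptInjection" block.
# Scanning is off by default so existing configs keep working unchanged.
DEFAULTS = {
    "enabled": False,
    "scanModel": "nvidia/meta/llama-guard-4-12b",
    "blockOnUnsafe": True,
    "logIncidents": True,
    "logPath": "~/.openclaw/security/prompt-injection.log",
}

def load_injection_config(raw: str) -> dict:
    """Merge the promptInjection section of openclaw.json over defaults,
    rejecting unknown keys so typos fail loudly at startup."""
    cfg = json.loads(raw).get("security", {}).get("promptInjection", {})
    unknown = set(cfg) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"unknown promptInjection keys: {sorted(unknown)}")
    return {**DEFAULTS, **cfg}
```

Rejecting unknown keys is a deliberate choice here: a silently ignored `blockOnUnsafe` typo would leave a deployment unprotected while the operator believes scanning is enforced.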

How it would work:

  1. Before passing user, web, or message content to the LLM, run it through a content-safety model (Llama Guard 4, Nemotron, etc.).
  2. If an injection or unsafe content is detected, either block the request (blockOnUnsafe: true) or log a warning and proceed.
  3. Record every scan in the audit trail.
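The three steps above could be wired up as a pre-LLM hook roughly like this. The classifier is stubbed with a trivial keyword check purely for illustration; a real implementation would call the configured scan model (e.g. Llama Guard 4). The names `scan_before_llm` and `PromptBlocked` are hypothetical, not OpenClaw API.

```python
from dataclasses import dataclass

class PromptBlocked(Exception):
    """Raised when blockOnUnsafe is set and the scanner flags the input."""

@dataclass
class ScanResult:
    unsafe: bool
    reason: str = ""

def classify(text: str) -> ScanResult:
    # Stand-in for the content-safety model call; a real deployment would
    # send `text` to the model named by cfg["scanModel"] instead.
    if "ignore previous instructions" in text.lower():
        return ScanResult(unsafe=True, reason="injection pattern")
    return ScanResult(unsafe=False)

def scan_before_llm(text: str, cfg: dict, audit: list) -> str:
    """Gate untrusted content before it reaches the LLM."""
    if not cfg.get("enabled"):
        return text
    result = classify(text)                           # step 1: scan
    if cfg.get("logIncidents"):
        audit.append((result.unsafe, result.reason))  # step 3: audit trail
    if result.unsafe and cfg.get("blockOnUnsafe"):
        raise PromptBlocked(result.reason)            # step 2: block
    return text                                       # step 2: or proceed
```

With `blockOnUnsafe: false`, the same hook degrades to log-and-proceed, which matches the proposed config semantics.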

Why this matters:

  • OWASP Agentic AI Top 10 #A01 (Prompt Injection)
  • Critical for production deployments that ingest untrusted content
  • Currently requires manual implementation outside OpenClaw's request pipeline

Workaround:
We currently run Llama Guard externally in our WATCHTOWER security framework, but native integration would be more efficient and reliable.

Related: This came up while responding to Palo Alto Networks' claim that OpenClaw is insecure. Native security config would strengthen the platform's enterprise credibility.

Metadata

Labels: enhancement (New feature or request)
