feat(security): add Privacy Shield for PII masking in model context by 0xRaini · Pull Request #12050 · openclaw/openclaw

0xRaini · 2026-02-08T19:30:24Z

Summary

Add a Privacy Shield layer that scans and redacts PII (Personally Identifiable Information) from the agent's outbound responses before they are sent to chat channels.

Motivation

This is specifically designed to combat Prompt Injection attacks. While the model needs access to sensitive info in its context to be helpful, we want to prevent that info from being "leaked" to external chat channels if the model is tricked or "hallucinates" private data into its output.

Features

Output Interception: Intercepts messages at the final delivery stage (e.g., Telegram, Discord).
Preserves Model Intelligence: Input context remains unscrubbed so the model stays smart.
Default PII Redaction: Automatically masks Emails, Phone numbers, Credit Card numbers, and IPv4 addresses.
Configurable: Users can enable/disable this feature via security.privacy.piiScrubbing.
Custom Patterns: Users can provide their own regex patterns for additional redaction.
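
The features above can be sketched roughly as follows. This is a minimal illustration, not the code in src/infra/privacy.ts: the function name `redactOutbound`, the regexes, and every placeholder label other than `[PHONE_REDACTED]` and `[REDACTED]` (which appear in the Behavior Changes table) are assumptions.

```typescript
// Hypothetical sketch of output-side PII redaction.
// The patterns are deliberately simple and are NOT the PR's actual regexes.
type Redactor = { label: string; pattern: RegExp };

const DEFAULT_REDACTORS: Redactor[] = [
  { label: "[EMAIL_REDACTED]", pattern: /[\w.+-]+@[\w-]+(?:\.[\w-]+)+/g },
  // 13 to 16 digits, optionally separated by single spaces or dashes.
  { label: "[CARD_REDACTED]", pattern: /\b\d(?:[ -]?\d){12,15}\b/g },
  // US-style 3-3-4 phone numbers only.
  { label: "[PHONE_REDACTED]", pattern: /\b\d{3}[-. ]\d{3}[-. ]\d{4}\b/g },
  // Dotted quads; octet ranges are not validated.
  { label: "[IP_REDACTED]", pattern: /\b(?:\d{1,3}\.){3}\d{1,3}\b/g },
];

function redactOutbound(text: string, customPatterns: string[] = []): string {
  let out = text;
  // Built-in redactors run first, each with its own placeholder.
  for (const { label, pattern } of DEFAULT_REDACTORS) {
    out = out.replace(pattern, label);
  }
  // User-supplied regex sources are compiled as global patterns
  // and replaced with a generic placeholder.
  for (const source of customPatterns) {
    out = out.replace(new RegExp(source, "g"), "[REDACTED]");
  }
  return out;
}

console.log(redactOutbound("Calling 123-456-7890..."));
```

Because only the outbound string is rewritten, the model's input context is left untouched, which matches the "Preserves Model Intelligence" point above.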

Configuration Example

{
  "security": {
    "privacy": {
      "piiScrubbing": "on",
      "piiPatterns": ["\\bsecret-project-code\\b"]
    }
  }
}
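
As a rough sketch of how such a config could be consumed: the type names and `resolvePrivacy` below are hypothetical, the `"off"` value and the disabled-by-default behavior are assumptions (the example above only shows `"on"`), and the PR's real types and validation live in src/config/types.security.ts and src/config/zod-schema.ts.

```typescript
// Hypothetical shapes mirroring the JSON example; not the PR's actual types.
interface PrivacyConfigSketch {
  piiScrubbing?: "on" | "off"; // "off" is an assumed counterpart to "on"
  piiPatterns?: string[];
}

interface ConfigSketch {
  security?: { privacy?: PrivacyConfigSketch };
}

interface ResolvedPrivacy {
  enabled: boolean;
  patterns: RegExp[];
  invalid: string[]; // regex sources that failed to compile
}

function resolvePrivacy(config: ConfigSketch): ResolvedPrivacy {
  const privacy = config.security?.privacy;
  // Assumption: scrubbing stays disabled unless explicitly switched on.
  const enabled = privacy?.piiScrubbing === "on";
  const patterns: RegExp[] = [];
  const invalid: string[] = [];
  for (const source of privacy?.piiPatterns ?? []) {
    try {
      patterns.push(new RegExp(source, "g"));
    } catch {
      // A malformed user pattern is reported instead of crashing delivery.
      invalid.push(source);
    }
  }
  return { enabled, patterns, invalid };
}

const resolved = resolvePrivacy({
  security: {
    privacy: { piiScrubbing: "on", piiPatterns: ["\\bsecret-project-code\\b"] },
  },
});
console.log(resolved.enabled, resolved.patterns.length);
```

Compiling the patterns once at config load, rather than on every message, also surfaces invalid regexes early.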

Behavior Changes

| Context (What Model Sees) | Output (What is Sent to User) |
| --- | --- |
| "Call my boss at 123-456-7890" | "Calling [PHONE_REDACTED]..." |
| "The secret code is XYZ-123" | "The secret code is [REDACTED]" |

Files Changed

src/infra/privacy.ts: Core PII scrubbing logic.
src/infra/outbound/deliver.ts: Integration into the outbound delivery pipeline.
src/config/types.security.ts: New security configuration types.
src/config/zod-schema.ts: Configuration schema validation.

AI-Assisted Contribution 🤖

AI-assisted (Claude)
Focused on Prompt Injection defense
Tested output scrubbing locally

lobster-biscuit

Greptile Overview

Greptile Summary

This PR introduces a “Privacy Shield” that redacts PII from agent context before sending it to LLMs. It adds a new security.privacy config section (types + Zod schema), implements PII scrubbing utilities in src/infra/privacy.ts, and integrates scrubbing into the embedded runner by applying it to the generated system prompt and to the limited message history before calling replaceMessages().

Key integration points are in src/agents/pi-embedded-runner/run/attempt.ts, where the system prompt override is scrubbed and the session history is scrubbed after validation/limiting.

Confidence Score: 3/5

This PR is close, but there is a type-safety regression at a core integration point that should be fixed before merging.
The config/type additions and integration look straightforward, but scrubPIIInMessages() returns unknown[] and is fed into replaceMessages(), which is intended to operate on AgentMessage[]. That’s either a TS compile break or a loss of guarantees in the message pipeline. Once the scrubber is typed to preserve AgentMessage shapes, the remaining changes appear low-risk.
src/agents/pi-embedded-runner/run/attempt.ts, src/infra/privacy.ts

chatgpt-codex-connector · 2026-02-08T19:30:32Z

You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard.

greptile-apps

1 file reviewed, 1 comment

greptile-apps · 2026-02-08T19:32:54Z

Additional Comments (1)

src/agents/pi-embedded-runner/run/attempt.ts
Type mismatch in history scrub
scrubPIIInMessages() is declared to return unknown[] (src/infra/privacy.ts:89), but its result is passed to activeSession.agent.replaceMessages(...) here. In this file, messages are typed as AgentMessage[] and replaceMessages expects AgentMessage[], so this call site either won’t type-check or will silently lose type guarantees (and could allow malformed message objects through).

Consider making scrubPIIInMessages accept/return AgentMessage[] (or a generic constrained to { content: ... }) so replaceMessages(scrubbed) stays type-safe.

Also appears in: src/infra/privacy.ts:89-124
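
One way the suggested generic could look, as a sketch only: `scrubMessagesSketch` and `isTextPart` are hypothetical names, the message shapes are assumptions standing in for the project's `AgentMessage`, and the scrub function is passed in rather than taken from src/infra/privacy.ts.

```typescript
// Hypothetical type guard for content parts that carry a text field.
function isTextPart(part: unknown): part is { text: string } {
  return (
    typeof part === "object" &&
    part !== null &&
    typeof (part as { text?: unknown }).text === "string"
  );
}

// Generic over the caller's message type, so AgentMessage[] in
// yields AgentMessage[] out instead of unknown[].
function scrubMessagesSketch<T extends { content?: unknown }>(
  messages: readonly T[],
  scrub: (text: string) => string,
): T[] {
  return messages.map((message): T => {
    const content: unknown = message.content;
    if (typeof content === "string") {
      // Copy the message; only the content field changes.
      return { ...message, content: scrub(content) };
    }
    if (Array.isArray(content)) {
      // Scrub text parts and pass other parts through unchanged.
      const parts = content.map((part: unknown) =>
        isTextPart(part) ? { ...part, text: scrub(part.text) } : part,
      );
      return { ...message, content: parts };
    }
    return message;
  });
}

// Usage with an assumed message shape.
type Msg = { role: "user" | "assistant"; content: string; id: number };
const history: Msg[] = [{ role: "user", content: "call 123-456-7890", id: 1 }];
const scrubbed: Msg[] = scrubMessagesSketch(history, (t) =>
  t.replace(/\d{3}-\d{3}-\d{4}/g, "[PHONE_REDACTED]"),
);
console.log(scrubbed);
```

With this shape the result can be handed to a function expecting the original message type without a cast. The trade-off is that the compiler trusts the spread to preserve `T`, so a message type whose `content` is a narrow literal type would not be checked precisely.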


Introduces a privacy protection layer that scans and redacts PII (Personally Identifiable Information) before sending it to LLMs.

Features:
- Default redactors for Email, Phone, Credit Cards, and IPv4.
- Configurable via openclaw.json (security.privacy.piiScrubbing).
- Support for custom redaction patterns.
- Scrubs both System Prompt and message history.

Configuration:
{ "security": { "privacy": { "piiScrubbing": "on", "piiPatterns": ["my-private-regex"] } } }

lobster-biscuit


openclaw-barnacle bot added agents Agent runtime and tooling and removed agents Agent runtime and tooling labels Feb 8, 2026

greptile-apps bot reviewed Feb 8, 2026


Reapor-Yurnero mentioned this pull request Feb 9, 2026

feat(gateway): support modular guardrails extensions for securing against indirect prompt injections and other agentic threats #6095

Closed

openclaw-barnacle bot added the agents Agent runtime and tooling label Feb 9, 2026

Rain added 9 commits February 11, 2026 10:32

fix: use generic for scrubPIIInMessages and call it in attempt.ts

4711f66

fix(privacy): fix escaped quotes in deliver.ts causing parse errors

eec3305

The string literal on line 318 used backslash-escaped quotes (\") instead of normal quotes, causing tsgo, lint, format, and build failures across the entire CI pipeline.

ci: retrigger checks

85a949a

style: fix import sort order in deliver.ts

c7d605c

ci: retrigger checks

059426b

fix(privacy): relax type constraint in scrubPIIInMessages to fix build

236d438

fix(privacy): use Record<string, unknown> instead of any for linter

9b8e6bc

fix(privacy): use generic T instead of content constraint for scrubPIIInMessages

6ad2488

This comment was marked as spam.

0xRaini wants to merge 9 commits into openclaw:main from 0xRaini:feat/privacy-shield-pii-masking
