
feat(security): add Privacy Shield for PII masking in model context #12050

Closed
0xRaini wants to merge 9 commits into openclaw:main from 0xRaini:feat/privacy-shield-pii-masking


@0xRaini 0xRaini commented Feb 8, 2026

Summary

Add a Privacy Shield layer that scans and redacts PII (Personally Identifiable Information) from the agent's outbound responses before they are sent to chat channels.

Motivation

This is specifically designed to combat Prompt Injection attacks. While the model needs access to sensitive info in its context to be helpful, we want to prevent that info from being "leaked" to external chat channels if the model is tricked or "hallucinates" private data into its output.

Features

  • Output Interception: Intercepts messages at the final delivery stage (e.g., Telegram, Discord).
  • Preserves Model Intelligence: Input context remains unscrubbed so the model stays smart.
  • Default PII Redaction: Automatically masks Emails, Phone numbers, Credit Card numbers, and IPv4 addresses.
  • Configurable: Users can enable/disable this feature via security.privacy.piiScrubbing.
  • Custom Patterns: Users can provide their own regex patterns for additional redaction.
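The redaction step described in the list above could look roughly like the following. This is a minimal sketch, not the code in src/infra/privacy.ts: the name scrubPII, the Redactor type, and the regexes are all assumptions made for illustration, and the patterns are deliberately simple (they will miss many real-world phone and card formats).

```typescript
// Hypothetical sketch; the real logic lives in src/infra/privacy.ts and may differ.
type Redactor = { label: string; pattern: RegExp };

// Illustrative default patterns (assumptions, intentionally simple).
// Card runs before phone so long digit groups are not partially masked as phones.
const DEFAULT_REDACTORS: Redactor[] = [
  { label: "EMAIL_REDACTED", pattern: /\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b/g },
  { label: "CARD_REDACTED", pattern: /\b\d(?:[ -]?\d){12,15}\b/g },
  { label: "PHONE_REDACTED", pattern: /\b\d{3}[-. ]\d{3}[-. ]\d{4}\b/g },
  { label: "IPV4_REDACTED", pattern: /\b(?:\d{1,3}\.){3}\d{1,3}\b/g },
];

// Apply the default redactors, then any user-supplied regex sources.
function scrubPII(text: string, customPatterns: string[] = []): string {
  let out = text;
  for (const { label, pattern } of DEFAULT_REDACTORS) {
    out = out.replace(pattern, `[${label}]`);
  }
  for (const source of customPatterns) {
    let re: RegExp;
    try {
      re = new RegExp(source, "g");
    } catch {
      continue; // Skip invalid user patterns instead of failing delivery.
    }
    out = out.replace(re, "[REDACTED]");
  }
  return out;
}
```

Calling scrubPII on the outgoing text at the delivery stage leaves the model's input context untouched, which matches the "Preserves Model Intelligence" point above.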

Configuration Example

{
  "security": {
    "privacy": {
      "piiScrubbing": "on",
      "piiPatterns": ["\\bsecret-project-code\\b"]
    }
  }
}
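A sketch of how this config block might be typed and resolved at load time. The names PrivacyConfig, ResolvedPrivacy, and resolvePrivacy are hypothetical (the PR's real types are in src/config/types.security.ts and are not shown here), and treating a missing piiScrubbing value as "off" is an assumption, not something the PR states.

```typescript
// Hypothetical shape mirroring the JSON above; not the PR's actual types.
interface PrivacyConfig {
  piiScrubbing?: "on" | "off";
  piiPatterns?: string[];
}

interface ResolvedPrivacy {
  enabled: boolean;    // Assumption: disabled unless explicitly "on".
  patterns: RegExp[];  // Custom patterns that compiled successfully.
  rejected: string[];  // Sources that were not valid regular expressions.
}

// Compile custom patterns once at config load, collecting invalid ones
// so they can be reported instead of throwing during message delivery.
function resolvePrivacy(cfg: PrivacyConfig | undefined): ResolvedPrivacy {
  const resolved: ResolvedPrivacy = {
    enabled: cfg?.piiScrubbing === "on",
    patterns: [],
    rejected: [],
  };
  for (const source of cfg?.piiPatterns ?? []) {
    try {
      resolved.patterns.push(new RegExp(source, "g"));
    } catch {
      resolved.rejected.push(source);
    }
  }
  return resolved;
}
```

Note that the JSON string "\\bsecret-project-code\\b" decodes to the regex source \bsecret-project-code\b, so word boundaries survive the round trip through the config file.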

Behavior Changes

Context (What Model Sees) | Output (What is Sent to User)
"Call my boss at 123-456-7890" | "Calling [PHONE_REDACTED]..."
"The secret code is XYZ-123" | "The secret code is [REDACTED]"

Files Changed

  • src/infra/privacy.ts: Core PII scrubbing logic.
  • src/infra/outbound/deliver.ts: Integration into the outbound delivery pipeline.
  • src/config/types.security.ts: New security configuration types.
  • src/config/zod-schema.ts: Configuration schema validation.

AI-Assisted Contribution 🤖

  • AI-assisted (Claude)
  • Focused on Prompt Injection defense
  • Tested output scrubbing locally

lobster-biscuit

Greptile Overview

Greptile Summary

This PR introduces a “Privacy Shield” that redacts PII from agent context before sending it to LLMs. It adds a new security.privacy config section (types + Zod schema), implements PII scrubbing utilities in src/infra/privacy.ts, and integrates scrubbing into the embedded runner by applying it to the generated system prompt and to the limited message history before calling replaceMessages().

Key integration points are in src/agents/pi-embedded-runner/run/attempt.ts, where the system prompt override is scrubbed and the session history is scrubbed after validation/limiting.

Confidence Score: 3/5

  • This PR is close, but there is a type-safety regression at a core integration point that should be fixed before merging.
  • The config/type additions and integration look straightforward, but scrubPIIInMessages() returns unknown[] and is fed into replaceMessages(), which is intended to operate on AgentMessage[]. That’s either a TS compile break or a loss of guarantees in the message pipeline. Once the scrubber is typed to preserve AgentMessage shapes, the remaining changes appear low-risk.
  • src/agents/pi-embedded-runner/run/attempt.ts, src/infra/privacy.ts

@chatgpt-codex-connector

You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard.

@openclaw-barnacle openclaw-barnacle bot added and removed the "agents" (Agent runtime and tooling) label Feb 8, 2026

@greptile-apps greptile-apps bot left a comment


1 file reviewed, 1 comment



greptile-apps bot commented Feb 8, 2026

Additional Comments (1)

src/agents/pi-embedded-runner/run/attempt.ts
Type mismatch in history scrub
scrubPIIInMessages() is declared to return unknown[] (src/infra/privacy.ts:89), but its result is passed to activeSession.agent.replaceMessages(...) here. In this file, messages are typed as AgentMessage[] and replaceMessages expects AgentMessage[], so this call site either won’t type-check or will silently lose type guarantees (and could allow malformed message objects through).

Consider making scrubPIIInMessages accept/return AgentMessage[] (or a generic constrained to { content: ... }) so replaceMessages(scrubbed) stays type-safe.

Also appears in: src/infra/privacy.ts:89-124
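One way to apply the suggestion above is a generic constrained to a content-bearing shape, so callers get back the same element type they passed in. This is a sketch under assumptions: the real AgentMessage type is not shown on this page, so HasContent and TextPart are hypothetical stand-ins, and the scrub callback is passed in only to keep the example self-contained.

```typescript
// Assumed minimal message shape; the real AgentMessage may have more variants.
type TextPart = { type: string; text?: string };
type HasContent = { content: string | TextPart[] };

// Returns T[] rather than unknown[], so replaceMessages(scrubbed) would still
// type-check when T is AgentMessage (assuming AgentMessage satisfies HasContent).
function scrubPIIInMessages<T extends HasContent>(
  messages: T[],
  scrub: (text: string) => string,
): T[] {
  return messages.map((msg) => {
    const raw: string | TextPart[] = msg.content;
    const content =
      typeof raw === "string"
        ? scrub(raw)
        : raw.map((part) =>
            part.text === undefined ? part : { ...part, text: scrub(part.text) },
          );
    // Copy the message so the original history is not mutated.
    return { ...msg, content };
  });
}
```

All fields other than content (role, ids, metadata) are copied through unchanged. Non-text parts such as images are returned as-is.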

Path: src/agents/pi-embedded-runner/run/attempt.ts
Line: 560:568

Rain added 9 commits February 11, 2026 10:32
Introduces a privacy protection layer that scans and redacts PII
(Personally Identifiable Information) before sending it to LLMs.

Features:
- Default redactors for Email, Phone, Credit Cards, and IPv4.
- Configurable via openclaw.json (security.privacy.piiScrubbing).
- Support for custom redaction patterns.
- Scrubs both System Prompt and message history.

Configuration:
{
  "security": {
    "privacy": {
      "piiScrubbing": "on",
      "piiPatterns": ["my-private-regex"]
    }
  }
}

lobster-biscuit
The string literal on line 318 used backslash-escaped quotes (\")
instead of normal quotes, causing tsgo, lint, format, and build
failures across the entire CI pipeline.
@0xRaini

This comment was marked as spam.

Labels

agents Agent runtime and tooling
