Commit 406ae72

fix(logging): redact persisted transcript text
1 parent f99fb2a commit 406ae72

14 files changed

Lines changed: 177 additions & 31 deletions

CHANGELOG.md

Lines changed: 1 addition & 0 deletions
@@ -6,6 +6,7 @@ Docs: https://docs.openclaw.ai

 ### Fixes

+- Logging/sessions: apply configured redaction patterns to persisted session transcript text and accept escaped character classes in safe custom redaction regexes, so transcript JSONL no longer keeps matching sensitive text in the clear. Fixes #42982. Thanks @panpan0000.
 - Auto-reply: poison inbound message dedupe after replay-unsafe provider/runtime failures so retries stay safe before visible progress but cannot duplicate messages after block output, tool side effects, or session progress. Fixes #69303; keeps #58549 and #64606 as duplicate validation. Thanks @martingarramon, @NikolaFC, and @zeroth-blip.
 - Agents/model fallback: jump directly to a known later live-session model redirect instead of walking unrelated fallback candidates, while preserving the already-landed live-session/fallback loop guard. Fixes #57471; related loop family already closed via #58496. Thanks @yuxiaoyang2007-prog.
 - Gateway/Bonjour: keep @homebridge/ciao cancellation handlers registered across advertiser restarts so late probing cancellations cannot crash Linux and other mDNS-churned gateways. Thanks @codex.
Lines changed: 2 additions & 2 deletions
@@ -1,4 +1,4 @@
-7fa6e35bb9f9d3096d6281f141488be0dcfe15de40dc4f5c0305eb1ff2bc60b6 config-baseline.json
-5f5fb87fd46f9cbb84d8af17e00ae3c4b74062e8ad517bc2260ba83da2e9014f config-baseline.core.json
+4d1995e41b659e484afb5a48d6fca0558337123200a4a537f556ca38e8e829e7 config-baseline.json
+3245c9a013c55ee8a24db52d5e88c42bc86e26f822d4a144fc7f37fc71e05fa8 config-baseline.core.json
 7cd9c908f066c143eab2a201efbc9640f483ab28bba92ddeca1d18cc2b528bc3 config-baseline.channel.json
 f9e0174988718959fe1923a54496ec5b9262721fe1e7306f32ccb1316d9d9c3f config-baseline.plugin.json

docs/gateway/configuration-reference.md

Lines changed: 1 addition & 0 deletions
@@ -859,6 +859,7 @@ Notes:
 - Set `logging.file` for a stable path.
 - `consoleLevel` bumps to `debug` when `--verbose`.
 - `maxFileBytes`: maximum active log file size in bytes before rotation (positive integer; default: `104857600` = 100 MB). OpenClaw keeps up to five numbered archives beside the active file.
+- `redactSensitive` / `redactPatterns`: best-effort masking for console output, file logs, OTLP log records, and persisted session transcript text.

 ---

docs/gateway/logging.md

Lines changed: 4 additions & 3 deletions
@@ -54,9 +54,10 @@ You can tune console verbosity independently via:

 ## Redaction

-OpenClaw can mask sensitive tokens before log output leaves the process. The
-same redaction policy is applied at console and file-log sinks, so matching
-secret values are masked before JSONL lines are written to disk.
+OpenClaw can mask sensitive tokens before log or transcript output leaves the
+process. The same redaction policy is applied at console, file-log, OTLP
+log-record, and session transcript text sinks, so matching secret values are
+masked before JSONL lines or messages are written to disk.

 - `logging.redactSensitive`: `off` | `tools` (default: `tools`)
 - `logging.redactPatterns`: array of regex strings (overrides defaults)
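
As a minimal sketch of the two settings listed in this hunk, the fragment below writes them as a TypeScript object. `LoggingRedactionConfig` is a hypothetical stand-in for the `logging` section of `OpenClawConfig`, the email pattern is borrowed from the test added in this commit, and `compiles` is only an illustrative sanity check, not OpenClaw's own pattern validation.

```typescript
// Hypothetical stand-in for the `logging` section of OpenClawConfig.
type LoggingRedactionConfig = {
  redactSensitive?: "off" | "tools";
  redactPatterns?: string[];
};

const logging: LoggingRedactionConfig = {
  redactSensitive: "tools", // the documented default
  // Per the docs, a custom list overrides the default pattern set.
  redactPatterns: [String.raw`([\w]|[-.])+@([\w]|[-.])+\.\w+`],
};

// Illustrative check only: confirm each string is a valid JavaScript regex source.
function compiles(source: string): boolean {
  try {
    new RegExp(source, "g");
    return true;
  } catch {
    return false;
  }
}

const allValid = (logging.redactPatterns ?? []).every(compiles);
```

Because the documented behaviour is that `redactPatterns` overrides the defaults, a custom list would presumably need to carry every pattern you still want applied.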

docs/gateway/security/index.md

Lines changed: 1 addition & 1 deletion
@@ -999,7 +999,7 @@ Logs and transcripts can leak sensitive info even when access controls are corre

 Recommendations:

-- Keep tool summary redaction on (`logging.redactSensitive: "tools"`; default).
+- Keep log and transcript redaction on (`logging.redactSensitive: "tools"`; default).
 - Add custom patterns for your environment via `logging.redactPatterns` (tokens, hostnames, internal URLs).
 - When sharing diagnostics, prefer `openclaw status --all` (pasteable, secrets redacted) over raw logs.
 - Prune old session transcripts and log files if you don’t need long retention.

docs/logging.md

Lines changed: 6 additions & 4 deletions
@@ -167,14 +167,16 @@ file log levels.

 ### Redaction

-Tool summaries can redact sensitive tokens before they hit the console:
+OpenClaw can redact sensitive tokens before they hit console output, file logs,
+OTLP log records, or persisted session transcript text:

 - `logging.redactSensitive`: `off` | `tools` (default: `tools`)
 - `logging.redactPatterns`: list of regex strings to override the default set

-Redaction applies at the logging sinks for **console output**, **stderr-routed
-console diagnostics**, and **file logs**. File logs stay JSONL, but matching
-secret values are masked before the line is written to disk.
+File logs and session transcripts stay JSONL, but matching secret values are
+masked before the line or message is written to disk. Redaction is best-effort:
+it applies to text-bearing message content and log strings, not every
+identifier or binary payload field.

 ## Diagnostics and OpenTelemetry
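
The "best-effort" scope described in this hunk can be illustrated with a small sketch. It assumes, based on the `redactTranscriptContentBlock` function in this commit, that only the string fields `text`, `thinking`, and `partialJson` of a content block are rewritten. The `mask` helper and its `[redacted]` marker are hypothetical and do not reflect OpenClaw's actual masking output.

```typescript
type Block = Record<string, unknown>;

// Hypothetical masker for illustration; the real masking lives in redactSensitiveText.
const EMAIL = /[\w.-]+@[\w.-]+\.\w+/g;
const mask = (value: string): string => value.replace(EMAIL, "[redacted]");

// Field names taken from redactTranscriptContentBlock in this commit.
const TEXT_FIELDS = ["text", "thinking", "partialJson"] as const;

function redactBlock(block: Block): Block {
  let next: Block | null = null;
  for (const key of TEXT_FIELDS) {
    const value = block[key];
    if (typeof value !== "string") continue;
    const redacted = mask(value);
    if (redacted === value) continue;
    next ??= { ...block }; // copy lazily, only when a field actually changed
    next[key] = redacted;
  }
  return next ?? block;
}

const textBlock = redactBlock({ type: "text", text: "contact peter@dc.io" });

// Structured tool-call arguments are not a text-bearing field, so they pass through.
const toolCall: Block = {
  type: "toolCall",
  id: "call_1",
  name: "read",
  arguments: { path: "/tmp/peter@dc.io" },
};
const toolCallAfter = redactBlock(toolCall);
```

In this sketch the tool-call block comes back as the same object, which mirrors the test's final assertion that `"/tmp/peter@dc.io"` is still present in the serialized transcript.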

src/agents/pi-embedded-runner.guard.test.ts

Lines changed: 43 additions & 0 deletions
@@ -1,6 +1,7 @@
 import type { AgentMessage } from "@mariozechner/pi-agent-core";
 import { SessionManager } from "@mariozechner/pi-coding-agent";
 import { describe, expect, it } from "vitest";
+import type { OpenClawConfig } from "../config/types.openclaw.js";
 import { guardSessionManager } from "./session-tool-result-guard-wrapper.js";
 import { sanitizeToolUseResultPairing } from "./session-transcript-repair.js";

@@ -35,4 +36,46 @@ describe("guardSessionManager integration", () => {
       "assistant",
     ]);
   });
+
+  it("redacts configured text patterns before persisting transcript messages", () => {
+    const cfg = {
+      logging: {
+        redactSensitive: "tools",
+        redactPatterns: [String.raw`([\w]|[-.])+@([\w]|[-.])+\.\w+`],
+      },
+    } satisfies OpenClawConfig;
+    const sm = guardSessionManager(SessionManager.inMemory(), { config: cfg });
+    const appendMessage = sm.appendMessage.bind(sm) as unknown as (message: AgentMessage) => void;
+
+    appendMessage({
+      role: "assistant",
+      content: [
+        { type: "thinking", thinking: "the email is peter@dc.io", thinkingSignature: "sig" },
+        { type: "text", text: "contact peter@dc.io" },
+        { type: "toolCall", id: "call_1", name: "read", arguments: { path: "/tmp/peter@dc.io" } },
+      ],
+      stopReason: "toolUse",
+    } as AgentMessage);
+    appendMessage({
+      role: "toolResult",
+      toolCallId: "call_1",
+      toolName: "read",
+      content: [{ type: "text", text: "peter@dc.io\n" }],
+      isError: false,
+    } as AgentMessage);
+
+    const messages = sm
+      .getEntries()
+      .filter((e) => e.type === "message")
+      .map((e) => (e as { message: AgentMessage }).message);
+    const serialized = JSON.stringify(messages);
+
+    expect(serialized).not.toContain("the email is peter@dc.io");
+    expect(serialized).not.toContain("contact peter@dc.io");
+    expect(serialized).not.toContain("peter@dc.io\\n");
+    expect(serialized).toContain('"thinking":"the email is peter@d***.io"');
+    expect(serialized).toContain('"text":"contact peter@d***.io"');
+    expect(serialized).toContain('"text":"peter@d***.io\\n"');
+    expect(serialized).toContain('"/tmp/peter@dc.io"');
+  });
 });
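
The expected strings in this test (`peter@d***.io` rather than a fully masked address) suggest that, when a custom pattern contains capture groups, only the last non-empty group is masked inside each match. That reading is inferred from the assertions; the diff does not show the internals of `redactSensitiveText`. The hypothetical `maskLastGroup` helper below reproduces the asserted output under that assumption and is not the library's implementation.

```typescript
// Hypothetical helper: mask only the last non-empty capture group of each match,
// or the whole match when the pattern has no groups. `pattern` should carry the "g" flag.
function maskLastGroup(text: string, pattern: RegExp): string {
  return text.replace(pattern, (match: string, ...rest: unknown[]) => {
    // Without named groups, `rest` is [...captureGroups, offset, wholeInput].
    const groups = rest
      .slice(0, -2)
      .filter((g): g is string => typeof g === "string" && g.length > 0);
    const token = groups.length > 0 ? groups[groups.length - 1] : match;
    if (token === match) {
      return "***";
    }
    // Replace the first occurrence of the captured token inside the match.
    return match.replace(token, "***");
  });
}

// Same pattern source as the test above.
const emailPattern = new RegExp(String.raw`([\w]|[-.])+@([\w]|[-.])+\.\w+`, "g");

const thinking = maskLastGroup("the email is peter@dc.io", emailPattern);
const reply = maskLastGroup("contact peter@dc.io", emailPattern);
const toolOutput = maskLastGroup("peter@dc.io\n", emailPattern);
```

For `peter@dc.io`, the second group's final iteration captures `c` (the engine backtracks from `dc.io` to `dc` so that `\.\w+` can match `.io`), which is why only that character becomes `***`. A pattern whose groups are non-capturing, or that wraps the entire secret in its final group, would mask more of the value under this assumption.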

src/agents/session-tool-result-guard-wrapper.ts

Lines changed: 90 additions & 7 deletions
@@ -1,6 +1,7 @@
 import type { AgentMessage } from "@mariozechner/pi-agent-core";
 import type { SessionManager } from "@mariozechner/pi-coding-agent";
 import type { OpenClawConfig } from "../config/types.openclaw.js";
+import { redactSensitiveText } from "../logging/redact.js";
 import { getGlobalHookRunner } from "../plugins/hook-runner-global.js";
 import {
   applyInputProvenanceToUserMessage,
@@ -16,6 +17,71 @@ export type GuardedSessionManager = SessionManager & {
   clearPendingToolResults?: () => void;
 };

+function redactTranscriptText(value: string, cfg?: OpenClawConfig): string {
+  if (cfg?.logging?.redactSensitive === "off") {
+    return value;
+  }
+  return redactSensitiveText(value, {
+    mode: cfg?.logging?.redactSensitive,
+    patterns: cfg?.logging?.redactPatterns,
+  });
+}
+
+function redactTranscriptContentBlock(block: unknown, cfg?: OpenClawConfig): unknown {
+  if (!block || typeof block !== "object" || Array.isArray(block)) {
+    return block;
+  }
+  const source = block as Record<string, unknown>;
+  let next: Record<string, unknown> | null = null;
+  const assign = (key: string, value: string) => {
+    const redacted = redactTranscriptText(value, cfg);
+    if (redacted === value) {
+      return;
+    }
+    next ??= { ...source };
+    next[key] = redacted;
+  };
+
+  if (typeof source.text === "string") {
+    assign("text", source.text);
+  }
+  if (typeof source.thinking === "string") {
+    assign("thinking", source.thinking);
+  }
+  if (typeof source.partialJson === "string") {
+    assign("partialJson", source.partialJson);
+  }
+  return next ?? block;
+}
+
+function redactTranscriptContent(content: unknown, cfg?: OpenClawConfig): unknown {
+  if (typeof content === "string") {
+    return redactTranscriptText(content, cfg);
+  }
+  if (!Array.isArray(content)) {
+    return content;
+  }
+  let changed = false;
+  const redacted = content.map((block) => {
+    const next = redactTranscriptContentBlock(block, cfg);
+    changed ||= next !== block;
+    return next;
+  });
+  return changed ? redacted : content;
+}
+
+function redactTranscriptMessage(message: AgentMessage, cfg?: OpenClawConfig): AgentMessage {
+  const source = message as unknown as Record<string, unknown>;
+  const redactedContent = redactTranscriptContent(source.content, cfg);
+  if (redactedContent === source.content) {
+    return message;
+  }
+  return {
+    ...source,
+    content: redactedContent,
+  } as unknown as AgentMessage;
+}
+
 /**
  * Apply the tool-result guard to a SessionManager exactly once and expose
  * a flush method on the instance for easy teardown handling.
@@ -38,14 +104,31 @@ export function guardSessionManager(
   }

   const hookRunner = getGlobalHookRunner();
-  const beforeMessageWrite = hookRunner?.hasHooks("before_message_write")
-    ? (event: { message: import("@mariozechner/pi-agent-core").AgentMessage }) => {
-        return hookRunner.runBeforeMessageWrite(event, {
-          agentId: opts?.agentId,
-          sessionKey: opts?.sessionKey,
-        });
+  const beforeMessageWrite = (event: {
+    message: import("@mariozechner/pi-agent-core").AgentMessage;
+  }) => {
+    let message = event.message;
+    let changed = false;
+    if (hookRunner?.hasHooks("before_message_write")) {
+      const result = hookRunner.runBeforeMessageWrite(event, {
+        agentId: opts?.agentId,
+        sessionKey: opts?.sessionKey,
+      });
+      if (result?.block) {
+        return result;
       }
-    : undefined;
+      if (result?.message) {
+        message = result.message;
+        changed = true;
+      }
+    }
+    const redacted = redactTranscriptMessage(message, opts?.config);
+    if (redacted !== message) {
+      message = redacted;
+      changed = true;
+    }
+    return changed ? { message } : undefined;
+  };

   const transform = hookRunner?.hasHooks("tool_result_persist")
     ? (
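
The reworked `beforeMessageWrite` always exists now: plugin hooks run first, a `block` result short-circuits, and redaction is applied last, so text rewritten by a hook is also redacted. The sketch below isolates that ordering with simplified, hypothetical types (`Message`, `Hook`, `redact`, `makeBeforeMessageWrite`); it is not the real hook-runner API, and the `secret-` pattern is made up for the example.

```typescript
type Message = { role: string; content: string };
type HookResult = { block?: boolean; message?: Message } | undefined;
type Hook = (event: { message: Message }) => HookResult;

// Hypothetical redactor standing in for redactTranscriptMessage.
// Returns the same object when nothing matched, so identity signals "unchanged".
const redact = (m: Message): Message => {
  const content = m.content.replace(/secret-\w+/g, "***");
  return content === m.content ? m : { ...m, content };
};

function makeBeforeMessageWrite(hook?: Hook) {
  return (event: { message: Message }): HookResult => {
    let message = event.message;
    let changed = false;
    const result = hook?.(event);
    if (result?.block) {
      return result; // a blocked message is never written, so there is nothing to redact
    }
    if (result?.message) {
      message = result.message;
      changed = true;
    }
    const redacted = redact(message);
    if (redacted !== message) {
      message = redacted;
      changed = true;
    }
    // undefined tells the caller to persist the original message as-is.
    return changed ? { message } : undefined;
  };
}

const plain = makeBeforeMessageWrite();
const withHook = makeBeforeMessageWrite((event) => ({
  message: { ...event.message, content: event.message.content + " secret-abc" },
}));
const blocking = makeBeforeMessageWrite(() => ({ block: true }));
```

Running redaction after the hooks is the conservative ordering: a plugin cannot reintroduce matching text into the persisted transcript, at the cost that hooks observe the unredacted message.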

src/config/schema.base.generated.ts

Lines changed: 4 additions & 4 deletions
@@ -466,7 +466,7 @@ export const GENERATED_BASE_CONFIG_SCHEMA: BaseConfigSchemaResponse = {
 ],
 title: "Sensitive Data Redaction Mode",
 description:
-'Sensitive redaction mode: "off" disables built-in masking, while "tools" redacts sensitive tool/config payload fields. Keep "tools" in shared logs unless you have isolated secure log sinks.',
+'Sensitive redaction mode: "off" disables built-in masking, while "tools" redacts sensitive tool/config payload fields in log sinks and persisted transcript text. Keep "tools" enabled unless logs and transcripts are isolated.',
 },
 redactPatterns: {
 type: "array",
@@ -475,7 +475,7 @@ export const GENERATED_BASE_CONFIG_SCHEMA: BaseConfigSchemaResponse = {
 },
 title: "Custom Redaction Patterns",
 description:
-"Additional custom redact regex patterns applied to log output before emission/storage. Use this to mask org-specific tokens and identifiers not covered by built-in redaction rules.",
+"Additional custom redact regex patterns applied to log output and persisted transcript text before storage. Use this to mask org-specific tokens and identifiers not covered by built-in redaction rules.",
 },
 },
 additionalProperties: false,
@@ -23982,12 +23982,12 @@ export const GENERATED_BASE_CONFIG_SCHEMA: BaseConfigSchemaResponse = {
 },
 "logging.redactSensitive": {
 label: "Sensitive Data Redaction Mode",
-help: 'Sensitive redaction mode: "off" disables built-in masking, while "tools" redacts sensitive tool/config payload fields. Keep "tools" in shared logs unless you have isolated secure log sinks.',
+help: 'Sensitive redaction mode: "off" disables built-in masking, while "tools" redacts sensitive tool/config payload fields in log sinks and persisted transcript text. Keep "tools" enabled unless logs and transcripts are isolated.',
 tags: ["privacy", "observability"],
 },
 "logging.redactPatterns": {
 label: "Custom Redaction Patterns",
-help: "Additional custom redact regex patterns applied to log output before emission/storage. Use this to mask org-specific tokens and identifiers not covered by built-in redaction rules.",
+help: "Additional custom redact regex patterns applied to log output and persisted transcript text before storage. Use this to mask org-specific tokens and identifiers not covered by built-in redaction rules.",
 tags: ["privacy", "observability"],
 },
 "cli.banner": {

src/config/schema.help.ts

Lines changed: 2 additions & 2 deletions
@@ -43,9 +43,9 @@ export const FIELD_HELP: Record<string, string> = {
   "logging.consoleStyle":
     'Console output format style: "pretty", "compact", or "json" based on operator and ingestion needs. Use json for machine parsing pipelines and pretty/compact for human-first terminal workflows.',
   "logging.redactSensitive":
-    'Sensitive redaction mode: "off" disables built-in masking, while "tools" redacts sensitive tool/config payload fields. Keep "tools" in shared logs unless you have isolated secure log sinks.',
+    'Sensitive redaction mode: "off" disables built-in masking, while "tools" redacts sensitive tool/config payload fields in log sinks and persisted transcript text. Keep "tools" enabled unless logs and transcripts are isolated.',
   "logging.redactPatterns":
-    "Additional custom redact regex patterns applied to log output before emission/storage. Use this to mask org-specific tokens and identifiers not covered by built-in redaction rules.",
+    "Additional custom redact regex patterns applied to log output and persisted transcript text before storage. Use this to mask org-specific tokens and identifiers not covered by built-in redaction rules.",
   cli: "CLI presentation controls for local command output behavior such as banner and tagline style. Use this section to keep startup output aligned with operator preference without changing runtime behavior.",
   "cli.banner":
     "CLI startup banner controls for title/version line and tagline style behavior. Keep banner enabled for fast version/context checks, then tune tagline mode to your preferred noise level.",
