What happens
When an agent edits a file that contains secrets (e.g. appsettings.json),
the redact-on-read behavior and the file-write path combine to persist the
redaction placeholder to disk, destroying the real values:
1. The agent reads the file, and the read path returns its contents with secret values replaced by the ***REDACTED*** placeholder.
2. The model edits the file and writes it back (file_write, or a file_edit whose surrounding context was the redacted text).
3. The redacted placeholders are now the on-disk values.
Net effect: a read-modify-write cycle where the model only ever saw ***REDACTED***, so that is what it writes back.
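The cycle above can be reproduced in a few lines. This is only an illustrative sketch, not netclaw's code: the `redact`, `read_for_model`, and `model_edit` helpers and the key-name heuristic in `SECRET_KEY` are hypothetical stand-ins for the real read path and the model.

```python
import json
import re

SENTINEL = "***REDACTED***"  # placeholder string as it appears in the report

# Hypothetical redaction rule: mask string values whose key name looks secret.
SECRET_KEY = re.compile(r"secret|password|connectionstring", re.IGNORECASE)


def redact(value, key=""):
    """Return a copy of a parsed JSON value with secret-looking strings masked."""
    if isinstance(value, dict):
        return {k: redact(v, k) for k, v in value.items()}
    if isinstance(value, list):
        return [redact(v, key) for v in value]
    if isinstance(value, str) and SECRET_KEY.search(key):
        return SENTINEL
    return value


def read_for_model(disk_text):
    """Redact-on-read: the text the model is shown."""
    return json.dumps(redact(json.loads(disk_text)), indent=2)


def model_edit(seen_text):
    """Stand-in for the model: change an unrelated setting, return the whole file."""
    doc = json.loads(seen_text)
    doc["Logging"] = {"LogLevel": "Debug"}
    return json.dumps(doc, indent=2)


on_disk = json.dumps({
    "Jwt": {"SecretKey": "real-signing-key"},
    "Logging": {"LogLevel": "Information"},
})

# Unguarded write-back: whatever the model returns becomes the file.
on_disk = model_edit(read_for_model(on_disk))
```

After the last line, `on_disk` holds the intended `Logging` change, but `Jwt.SecretKey` is now the placeholder and the real value is gone.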
Concrete impact
In a .NET project, this overwrote appsettings.json:
// before
"Jwt": { "SecretKey": "super-secret-key-...-change-in-production" },
"ConnectionStrings": { "PostgreSQL": "Host=...;Password=password;..." },
"Redis": { "ConnectionString": "localhost:6379" },
"GoogleAuth": { "ClientSecret": "your-google-client-secret" }

// after the agent edited an unrelated section of the same file
"Jwt": { "SecretKey": "***REDACTED***" },
"ConnectionStrings": { "PostgreSQL": "Host=...;Password=***REDACTED***;..." },
"Redis": { "ConnectionString": "***REDACTED***" },
"GoogleAuth": { "ClientSecret": "***REDACTED***" }
The 14-character ***REDACTED*** placeholder, now used as the JWT key, is only 112 bits, below the HS256
128-bit minimum (IDX10653), so every auth-dependent code path and test
broke. The failure surfaced far from the edit, which made it hard to
diagnose. We hit this twice on unrelated tasks (a Google-OAuth task and a
URL-safety task) because both happened to touch the same file.
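The 112-bit figure is simple arithmetic on the placeholder length. The short check below assumes the application derives the signing key from the UTF-8 bytes of the configured string, which is common but is not stated in the report.

```python
placeholder = "***REDACTED***"

# Assumption: the key bytes are the UTF-8 encoding of the configured string.
key_bits = len(placeholder.encode("utf-8")) * 8
MIN_BITS = 128  # minimum cited in the report for HS256 (IDX10653)

print(len(placeholder), key_bits, key_bits >= MIN_BITS)  # prints: 14 112 False
```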
Why this is the inverse of #829 / #1301
The missing piece is a write-side guard: when content the agent is about to write still contains ***REDACTED***, it almost certainly came from redacted read output and must not be persisted.
Same class of bug as OpenClaw #11268 / #11355 (__OPENCLAW_REDACTED__ written back into openclaw.json), but here it affects user project files, not netclaw's own config.
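In its simplest form, a write-side guard is a substring check run before any bytes reach disk. The sketch below is one possible shape, not an existing netclaw API: `guard_write` and `RedactedWriteError` are hypothetical names, and the sentinel is assumed to be the literal `***REDACTED***` string shown in this report.

```python
SENTINEL = "***REDACTED***"  # assumed to match the read-side placeholder exactly


class RedactedWriteError(ValueError):
    """Raised when content about to be written still holds redaction placeholders."""


def guard_write(path, new_content):
    """Return new_content unchanged, or raise if it contains the sentinel."""
    count = new_content.count(SENTINEL)
    if count:
        raise RedactedWriteError(
            f"refusing to write {path}: content contains {count} redaction "
            "placeholder(s); re-read the file without secrets in scope and retry"
        )
    return new_content
```

The error message doubles as the feedback to the model, so it should say what went wrong and what to do next.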
Suggested directions
1. Restore-on-write. Before committing a file_write/file_edit, if the new content contains the redaction sentinel, reconcile it against the real on-disk bytes (restore unchanged secret values rather than overwriting them with the placeholder). This mirrors the restoreRedactedValues() approach OpenClaw adopted.
2. Refuse-and-warn. If a write would persist ***REDACTED***, reject the write and tell the model the content still has redaction placeholders, so it re-reads with a bounded/secret-free path.
3. Unique, detectable sentinel. Keep the placeholder distinctive enough that a write-side check has no false positives.
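For JSON files, the restore-on-write direction could look roughly like the sketch below. It is not OpenClaw's restoreRedactedValues() and not netclaw code: `reconcile`, `restore_redacted`, and `_restore_string` are hypothetical names. The sketch assumes the file parses as strict JSON (no comments) and that re-serializing it is acceptable. Any placeholder that cannot be matched to an unchanged on-disk value is reported instead of written.

```python
import json
import re

SENTINEL = "***REDACTED***"  # assumed to match the read-side placeholder exactly


def _restore_string(new, old):
    """Return old if new is old with secret spans masked, otherwise None."""
    if not isinstance(old, str):
        return None
    # Turn the masked string into a pattern: literal parts, wildcard per sentinel.
    pattern = "(.*)".join(re.escape(part) for part in new.split(SENTINEL))
    return old if re.fullmatch(pattern, old, flags=re.DOTALL) else None


def restore_redacted(new, old, path, unresolved):
    """Walk new and old values in parallel, restoring masked strings from old."""
    if isinstance(new, dict):
        old = old if isinstance(old, dict) else {}
        return {k: restore_redacted(v, old.get(k), f"{path}.{k}", unresolved)
                for k, v in new.items()}
    if isinstance(new, list):
        old = old if isinstance(old, list) else []
        return [restore_redacted(v, old[i] if i < len(old) else None,
                                 f"{path}[{i}]", unresolved)
                for i, v in enumerate(new)]
    if isinstance(new, str) and SENTINEL in new:
        original = _restore_string(new, old)
        if original is None:
            unresolved.append(path)  # nothing on disk to restore from
            return new
        return original
    return new


def reconcile(new_text, disk_text):
    """Return new_text with placeholders restored, or raise if any remain."""
    unresolved = []
    merged = restore_redacted(json.loads(new_text), json.loads(disk_text),
                              "$", unresolved)
    if unresolved:
        raise ValueError("unrestorable placeholders at: " + ", ".join(unresolved))
    return json.dumps(merged, indent=2)
```

The pattern match in `_restore_string` also covers partially masked values such as a connection string with only the password hidden: the on-disk value is restored only when everything outside the placeholder is unchanged. If the model edited the non-secret part of such a string, the path is reported as unresolved, which falls back to the refuse-and-warn behavior.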
Environment
netclaw daemon (self-hosted), Discord channel, local OpenAI-compatible
model (llama.cpp).
Target project: .NET 10 / ASP.NET Core, secrets in appsettings.json.