Redacted secret placeholders (***REDACTED***) get written back to user files via read-modify-write, corrupting config #1333

@nilact

What happens

When an agent edits a file that contains secrets (e.g. appsettings.json),
the redact-on-read behavior and the file-write path combine to persist the
redaction placeholder to disk, destroying the real values:

  1. Agent calls file_read on appsettings.json.
  2. SecretOutputRedactor replaces matching values with ***REDACTED*** before the content reaches the model. This is intended: see #1301 (file_read default path materializes the whole file (OOM risk) and never redacts secrets), #1305 (Plan: bound tool output with file spill (closes #1300, #1301)), and the JSON-field patterns in the Secrets docs: password, secret, connection_string, access_token, client_secret, etc.
  3. The model edits the file and writes it back (file_write, or a file_edit whose surrounding context was the redacted text).
  4. The redacted placeholders are now the on-disk values.

Net effect: a read-modify-write cycle where the model only ever saw
***REDACTED***, so that's what it writes back.
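
The cycle above can be reproduced in a few lines. This is a minimal Python sketch for illustration only: `redact` and the `SECRET_KEY` pattern are hypothetical stand-ins, not SecretOutputRedactor's actual implementation, and the key pattern is only loosely modeled on the field names listed in step 2.

```python
import json
import re

PLACEHOLDER = "***REDACTED***"
# Hypothetical key pattern (assumption), loosely based on the field names above.
SECRET_KEY = re.compile(r"password|secret|connection_?string|access_token", re.IGNORECASE)


def redact(node):
    """Return a copy of a parsed JSON tree with secret-looking string values masked."""
    if isinstance(node, dict):
        return {
            k: PLACEHOLDER if isinstance(v, str) and SECRET_KEY.search(k) else redact(v)
            for k, v in node.items()
        }
    if isinstance(node, list):
        return [redact(v) for v in node]
    return node


on_disk = {"Jwt": {"SecretKey": "real-signing-key"}, "Logging": {"Level": "Info"}}

seen_by_model = redact(on_disk)               # steps 1-2: read, then redact
seen_by_model["Logging"]["Level"] = "Debug"   # step 3: edit an unrelated section
written_back = json.dumps(seen_by_model)      # step 3: naive whole-file write-back

# Step 4: the placeholder is now the persisted value.
print(json.loads(written_back)["Jwt"]["SecretKey"])
```

The edit never touches the Jwt section, yet the real key is gone, because the only copy of the file the writer ever held was the redacted one.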

Concrete impact

In a .NET project, this overwrote appsettings.json:

// before
"Jwt": { "SecretKey": "super-secret-key-...-change-in-production" },
"ConnectionStrings": { "PostgreSQL": "Host=...;Password=password;..." },
"Redis": { "ConnectionString": "localhost:6379" },
"GoogleAuth": { "ClientSecret": "your-google-client-secret" }

// after the agent edited an unrelated section of the same file
"Jwt": { "SecretKey": "***REDACTED***" },
"ConnectionStrings": { "PostgreSQL": "Host=...;Password=***REDACTED***;..." },
"Redis": { "ConnectionString": "***REDACTED***" },
"GoogleAuth": { "ClientSecret": "***REDACTED***" }

The 14-character ***REDACTED*** placeholder, used as the JWT key, is only
112 bits, below the HS256 128-bit minimum (IDX10653), so every
auth-dependent code path and test broke. The failure surfaced far from the
edit, making it hard to diagnose. We hit this twice on unrelated tasks (a
Google-OAuth task and a URL-safety task) because both happened to touch the
same file.
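
The 112-bit figure follows directly from the placeholder length. A quick check in Python, assuming the key string is converted to bytes as UTF-8 (an assumption about how the project derives the signing key; the 128-bit threshold is the one cited above):

```python
placeholder = "***REDACTED***"

# Assumption: the signing key is the UTF-8 encoding of the configured string.
key_bits = len(placeholder.encode("utf-8")) * 8
HS256_MIN_BITS = 128  # minimum reported with IDX10653, as cited above

print(len(placeholder), key_bits, key_bits >= HS256_MIN_BITS)  # 14 112 False
```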

Why this is the inverse of #829 / #1301

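#1301 was about file_read never redacting secrets; here the redaction works, and the placeholder is what leaks, onto disk. Same class of bug as OpenClaw #11268 / #11355 (__OPENCLAW_REDACTED__ written back into openclaw.json), but here it's user project files, not netclaw's own config.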

Suggested directions

  1. Restore-on-write. Before committing a file_write/file_edit, if the new content contains the redaction sentinel, reconcile it against the real on-disk bytes: restore unchanged secret values rather than overwriting them with the placeholder. This mirrors the restoreRedactedValues() approach OpenClaw adopted.
  2. Refuse-and-warn. If a write would persist ***REDACTED***, reject the write and tell the model the content still has redaction placeholders, so it re-reads with a bounded/secret-free path.
  3. Unique, detectable sentinel. Keep the placeholder distinctive enough that a write-side check has no false positives.
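
Directions 1 and 2 can be combined: restore where the on-disk value is known, refuse otherwise. The following is a rough Python sketch for JSON files only. `restore`, `safe_write_text`, and `RedactedWriteError` are hypothetical names, not existing netclaw or OpenClaw APIs, and matching by JSON path is an assumption about how reconciliation could work.

```python
import json

PLACEHOLDER = "***REDACTED***"


class RedactedWriteError(ValueError):
    """Raised when a write would persist a placeholder that cannot be restored."""


def restore(new, old, path="$"):
    """Walk the proposed tree; swap placeholders for the on-disk value at the same path."""
    if isinstance(new, dict):
        old = old if isinstance(old, dict) else {}
        return {k: restore(v, old.get(k), f"{path}.{k}") for k, v in new.items()}
    if isinstance(new, list):
        old = old if isinstance(old, list) else []
        return [
            restore(v, old[i] if i < len(old) else None, f"{path}[{i}]")
            for i, v in enumerate(new)
        ]
    if isinstance(new, str) and PLACEHOLDER in new:
        # Direction 1: restore only when a real on-disk value exists at this path.
        if isinstance(old, str) and PLACEHOLDER not in old:
            return old
        # Direction 2: otherwise refuse, so the placeholder never reaches disk.
        raise RedactedWriteError(f"{path} still contains {PLACEHOLDER}; re-read before writing")
    return new


def safe_write_text(proposed_text, on_disk_text):
    """Return the text to persist, with placeholders reconciled against the on-disk file."""
    merged = restore(json.loads(proposed_text), json.loads(on_disk_text))
    return json.dumps(merged, indent=2)
```

One limitation of this sketch: a partially redacted value such as the PostgreSQL connection string is restored whole, so a legitimate edit to the non-secret part of that same string would be dropped. Direction 3 matters here too, since the `PLACEHOLDER in new` check is only safe if the sentinel cannot occur in real content.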

Environment

  • netclaw daemon (self-hosted), Discord channel, local OpenAI-compatible
    model (llama.cpp).
  • Target project: .NET 10 / ASP.NET Core, secrets in appsettings.json.

Labels: bug (Something isn't working), config (Configuration issues, netclaw doctor, schema validation)
