
feat(cli): redact secrets from interactive history; add /privacy controls#1387

Merged
muddlebee merged 11 commits into main from feat/1377-history-privacy
May 8, 2026

Conversation

@yashksaini-coder

Collaborator

Closes #1377.

Summary

The interactive shell prompt persists every typed line to ~/.config/opensre/interactive_history. Incident prompts can include tokens (AWS keys, GitHub PATs, JWTs, Bearer headers, password CLI args, PEM keys, etc.), so this PR:

  • redacts known token shapes before each entry is written to disk
  • adds /history clear, /history off|on, /history retention <N> subcommands
  • adds a new /privacy slash command that shows persistence + redaction state, retention cap, file path, and a one-line threat model
  • caps history at 5000 entries by default (oldest pruned), configurable via env or config file
  • documents the controls and threat model in docs/interactive-shell-privacy.mdx

Defaults: redaction on, persistence on, retention cap 5000. All three are overridable via OPENSRE_HISTORY_* env vars or the interactive.history block in ~/.config/opensre/config.yml.
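A minimal sketch of how the three tiers could be resolved, assuming env vars win over the config file, which wins over the defaults (the PR does not state the precedence explicitly). The env var names, config keys, and `resolve_raw` helper are hypothetical; only the `OPENSRE_HISTORY_` prefix, the `interactive.history` block, and the default values come from the description above.

```python
from __future__ import annotations

from typing import Any, Mapping

# Defaults stated in the PR description.
DEFAULTS: dict[str, Any] = {"enabled": True, "redact": True, "max_entries": 5000}

# Hypothetical names: the PR only says the variables share the OPENSRE_HISTORY_ prefix.
ENV_NAMES = {
    "enabled": "OPENSRE_HISTORY_ENABLED",
    "redact": "OPENSRE_HISTORY_REDACT",
    "max_entries": "OPENSRE_HISTORY_MAX_ENTRIES",
}


def resolve_raw(
    env: Mapping[str, str], config: Mapping[str, Any]
) -> dict[str, tuple[Any, str]]:
    """Return (raw value, winning tier) per setting: env, then config, then default."""
    block = (config.get("interactive") or {}).get("history") or {}
    resolved: dict[str, tuple[Any, str]] = {}
    for key, default in DEFAULTS.items():
        if ENV_NAMES[key] in env:
            resolved[key] = (env[ENV_NAMES[key]], "env")
        elif key in block:
            resolved[key] = (block[key], "config")
        else:
            resolved[key] = (default, "default")
    return resolved
```

Returning the winning tier alongside the raw value keeps coercion separate and would let a status command report where each setting came from.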

What was changed

  • New app/cli/interactive_shell/history_policy.py: HistoryPolicy, built-in RedactionRule set, RedactingFileHistory (subclass of prompt_toolkit.FileHistory overriding store_string)
  • app/cli/interactive_shell/history.py: selects InMemoryHistory / RedactingFileHistory / raw FileHistory based on policy; adds clear_persisted_history() helper
  • app/cli/interactive_shell/commands.py: _cmd_history becomes a subcommand dispatcher, new _cmd_privacy, /privacy registered in SLASH_COMMANDS
  • app/cli/interactive_shell/session.py: exposes the live prompt_history_backend so commands can flip the paused flag at runtime
  • app/cli/interactive_shell/loop.py: passes session into _build_prompt_session and stores the backend reference
  • app/cli/interactive_shell/config.py: read_history_settings() for the config-file tier
  • New docs/interactive-shell-privacy.mdx (linked from docs/docs.json integration overview group)
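To illustrate the redaction step, here is a minimal sketch of a rule table and a `redact_text` pass. The `aws_key` pattern and replacement tag are the ones quoted in the review thread on this PR; the `RedactionRule` field layout and the `github_pat` pattern are assumptions for illustration, not the merged code.

```python
from __future__ import annotations

import re
from dataclasses import dataclass


@dataclass(frozen=True)
class RedactionRule:
    # Field names are assumed; the PR only shows (name, pattern, replacement) tuples.
    name: str
    pattern: re.Pattern[str]
    replacement: str


RULES: tuple[RedactionRule, ...] = (
    # Quoted in the review thread on this PR.
    RedactionRule("aws_key", re.compile(r"(?:AKIA|ASIA)[A-Z0-9]{16}"), "[REDACTED:aws_key]"),
    # Assumed shape for a classic GitHub PAT; illustrative only.
    RedactionRule("github_pat", re.compile(r"ghp_[A-Za-z0-9]{36}"), "[REDACTED:github_pat]"),
)


def redact_text(text: str, rules: tuple[RedactionRule, ...] = RULES) -> str:
    """Apply every rule in order and return the redacted line."""
    for rule in rules:
        text = rule.pattern.sub(rule.replacement, text)
    return text
```

Keeping the rules as data rather than inline logic means the table can be audited or extended without touching the history backend.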

Test plan

  • make test-cov — 5219 passed, 3 skipped (no regressions)
  • make lint — clean
  • make format-check — clean
  • make typecheck — clean for new code (one pre-existing sentry_sdk import error on main is unrelated)
  • 36 new tests across redaction patterns, retention pruning, env/config resolution, and slash command behavior

@mintlify

mintlify Bot commented May 6, 2026


Preview deployment for your docs. Learn more about Mintlify Previews.

| Project | Status | Preview | Updated (UTC) |
| --- | --- | --- | --- |
| tracer | 🟢 Ready | View Preview | May 6, 2026, 8:05 AM |

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@yashksaini-coder yashksaini-coder marked this pull request as ready for review May 7, 2026 07:30
Copilot AI review requested due to automatic review settings May 7, 2026 07:30
@greptile-apps

greptile-apps Bot commented May 7, 2026

Contributor

Greptile Summary

This PR adds secret-redaction to the interactive shell history file, controlled by HistoryPolicy, and surfaces /history subcommands and a new /privacy command so users can inspect and manage persistence at runtime.

  • history_policy.py introduces RedactingFileHistory (subclass of prompt_toolkit.FileHistory) with 12 built-in redaction patterns, a retention cap with auto-prune, and a paused flag toggled by /history off|on.
  • privacy_cmds.py dispatches /history clear|off|on|retention and /privacy, distinguishing RedactingFileHistory, raw FileHistory, and InMemoryHistory backends to give accurate status messages.
  • session.py and loop.py expose the live history backend on ReplSession so slash commands can mutate it at runtime; config.py adds a read_history_settings() helper for the config-file tier.

Confidence Score: 5/5

Safe to merge; the redaction engine and command dispatch are correct with comprehensive test coverage.

The implementation is solid end-to-end: policy resolution, file pruning, pause/resume, and all three backend types are handled correctly. The bearer pattern's character class omits + and /, so a standard-base64 opaque token containing + near its start could reach disk unredacted, but this is a narrow hardening gap rather than a widespread regression.

history_policy.py — bearer pattern character class

Security Review

  • bearer pattern in history_policy.py uses [A-Za-z0-9_\-\.] — excludes + and / found in standard base64 opaque tokens; a token whose value begins with fewer than 20 consecutive safe characters won't be matched and writes to disk unredacted.
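The gap can be reproduced with a short sketch. Only the character class is quoted in the review; the `Bearer` prefix handling, the `{20,}` minimum (inferred from "fewer than 20 consecutive safe characters"), the replacement tag, and the widened class are assumptions.

```python
import re

# Assumed reconstruction around the quoted character class.
NARROW = re.compile(r"(?i)\bbearer\s+[A-Za-z0-9_\-\.]{20,}")
# One possible widening that also admits standard-base64 characters.
WIDE = re.compile(r"(?i)\bbearer\s+[A-Za-z0-9_\-\.+/=]{20,}")


def redact_bearer(text: str, pattern: re.Pattern) -> str:
    """Replace any bearer token matched by `pattern` with a fixed tag."""
    return pattern.sub("Bearer [REDACTED:bearer]", text)


# A token with '+' after only three safe characters: the narrow class never
# reaches the 20-character minimum, so nothing is redacted.
line = "Authorization: Bearer abc+" + "D" * 30
leaked = redact_bearer(line, NARROW)
fixed = redact_bearer(line, WIDE)
```

Here `leaked` is identical to the input line, while `fixed` has the whole token replaced.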

Important Files Changed

| Filename | Overview |
| --- | --- |
| app/cli/interactive_shell/history_policy.py | Core redaction engine and RedactingFileHistory; well-structured with good guards. Bearer token regex excludes + and / (standard base64 chars), risking incomplete redaction for some opaque tokens. |
| app/cli/interactive_shell/command_registry/privacy_cmds.py | All previously flagged issues (wrong messages for FileHistory, isinstance fragility, retention=0 crash) appear resolved. Logic for /history and /privacy commands is correct. |
| app/cli/interactive_shell/history.py | Policy-based history selection is clean. clear_persisted_history leaves the live backend's _entry_count stale, causing one spurious prune on the next write after a clear. |
| app/cli/interactive_shell/session.py | Minimal change: adds prompt_history_backend field with correct TYPE_CHECKING guard. |
| app/cli/interactive_shell/loop.py | Single-line change stores the prompt_toolkit History reference on the session; correct and safe. |
| tests/cli/interactive_shell/test_history_policy.py | Comprehensive parametric tests covering all redaction patterns, retention pruning, env/config resolution, and edge cases like zero-cap and multi-line PEM blocks. |
| tests/cli/interactive_shell/test_history_commands.py | Good coverage of all /history subcommands and /privacy across all three backend types. |

Sequence Diagram

sequenceDiagram
    participant User
    participant PromptSession
    participant RedactingFileHistory
    participant HistoryFile as ~/.config/opensre/interactive_history

    User->>PromptSession: types command
    PromptSession->>RedactingFileHistory: store_string(raw)
    alt "paused == True"
        RedactingFileHistory-->>PromptSession: return (no-op)
    else "paused == False"
        RedactingFileHistory->>RedactingFileHistory: redact_text(raw, rules)
        RedactingFileHistory->>HistoryFile: write redacted entry
        RedactingFileHistory->>RedactingFileHistory: increment _entry_count
        alt "_entry_count > _max_entries"
            RedactingFileHistory->>HistoryFile: read + prune oldest entries
            RedactingFileHistory->>HistoryFile: write pruned content
        end
    end

    User->>PromptSession: /history off
    PromptSession->>RedactingFileHistory: "backend.paused = True"

    User->>PromptSession: /history clear
    PromptSession->>HistoryFile: write_text("")

    User->>PromptSession: /privacy
    PromptSession->>RedactingFileHistory: read paused, max_entries
    PromptSession-->>User: show persistence/redaction/retention table
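
The write path in the diagram can be sketched with the standard library alone. This is a stand-in, not the merged `RedactingFileHistory`: it stores one entry per line instead of prompt_toolkit's file format, applies a single rule, and the class name is hypothetical. The `paused` flag, `_entry_count`, `_max_entries`, and `_prune_to_cap` names follow the review text.

```python
import re
from pathlib import Path

AWS_KEY = re.compile(r"(?:AKIA|ASIA)[A-Z0-9]{16}")


class RedactingLineHistory:
    """Simplified stand-in: one history entry per line, stdlib only."""

    def __init__(self, path: Path, max_entries: int = 5000) -> None:
        self.path = path
        self.paused: bool = False
        self._max_entries = max_entries
        existing = path.read_text(encoding="utf-8").splitlines() if path.exists() else []
        self._entry_count = len(existing)

    def store_string(self, raw: str) -> None:
        if self.paused:  # /history off: nothing reaches disk
            return
        redacted = AWS_KEY.sub("[REDACTED:aws_key]", raw)
        with self.path.open("a", encoding="utf-8") as fh:
            # Newlines are flattened only because this sketch is line-based.
            fh.write(redacted.replace("\n", " ") + "\n")
        self._entry_count += 1
        # A cap of 0 is treated as unlimited, so pruning is skipped entirely.
        if self._max_entries > 0 and self._entry_count > self._max_entries:
            self._prune_to_cap()

    def _prune_to_cap(self) -> None:
        lines = self.path.read_text(encoding="utf-8").splitlines()
        kept = lines[-self._max_entries:]  # drop the oldest entries
        self.path.write_text("".join(f"{line}\n" for line in kept), encoding="utf-8")
        self._entry_count = len(kept)
```

Tracking the count in memory avoids re-reading the file on every write; the file is only read when the cap is actually exceeded.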

Reviews (4): Last reviewed commit: "fix(cli): use set_max_entries for /histo..." | Re-trigger Greptile

Comment thread app/cli/interactive_shell/command_registry/privacy_cmds.py
Comment thread app/cli/interactive_shell/command_registry/privacy_cmds.py
Comment thread app/cli/interactive_shell/command_registry/privacy_cmds.py Outdated
Comment thread app/cli/interactive_shell/history_policy.py Outdated

Copilot AI left a comment

Contributor

Pull request overview

Adds privacy hardening for the interactive shell’s persisted command history by introducing configurable secret redaction, retention controls, and new /history and /privacy UX, plus documentation and tests to support the new behavior (closes #1377).

Changes:

  • Introduce HistoryPolicy + RedactingFileHistory to redact known secret/token shapes before writing history to disk and to enforce a retention cap.
  • Add /history clear|off|on|retention <N> and /privacy commands, and wire session state so commands can toggle persistence at runtime.
  • Document defaults/threat model and add test coverage for redaction, retention, env/config resolution, and command behavior.

Reviewed changes

Copilot reviewed 12 out of 12 changed files in this pull request and generated 7 comments.

Show a summary per file
| File | Description |
| --- | --- |
| app/cli/interactive_shell/history_policy.py | New policy + redacting/persisting history backend with retention pruning. |
| app/cli/interactive_shell/history.py | Policy-aware history backend selection; add helper to clear persisted history. |
| app/cli/interactive_shell/command_registry/privacy_cmds.py | New /history subcommands and /privacy command implementation. |
| app/cli/interactive_shell/command_registry/__init__.py | Registers the new privacy/history commands into the slash-command registry. |
| app/cli/interactive_shell/session.py | Stores live prompt history backend on the session for runtime mutation by commands. |
| app/cli/interactive_shell/loop.py | Passes ReplSession into prompt construction and persists backend reference. |
| app/cli/interactive_shell/config.py | Reads interactive.history config block for history policy resolution. |
| tests/cli/interactive_shell/test_history_policy.py | Tests redaction patterns, paused behavior, retention pruning, and env/file resolution. |
| tests/cli/interactive_shell/test_history_commands.py | Tests /history subcommands and /privacy output across backends. |
| tests/cli/interactive_shell/test_loop.py | Updates prompt session builder tests for the new session parameter/backend reference. |
| docs/interactive-shell-privacy.mdx | New documentation for defaults, controls, and threat model. |
| docs/docs.json | Adds the new privacy doc page to the docs navigation. |

Comment thread app/cli/interactive_shell/command_registry/privacy_cmds.py Outdated
Comment thread app/cli/interactive_shell/command_registry/privacy_cmds.py
Comment thread app/cli/interactive_shell/command_registry/privacy_cmds.py Outdated
Comment thread app/cli/interactive_shell/history_policy.py
Comment thread app/cli/interactive_shell/history_policy.py
Comment thread app/cli/interactive_shell/history_policy.py
Comment thread app/cli/interactive_shell/command_registry/privacy_cmds.py Outdated
yashksaini-coder and others added 2 commits May 7, 2026 15:52
@yashksaini-coder

Collaborator Author

Addressed the Greptile and Copilot review items in 5cb58c4:

  • fixed /history retention 0 so unlimited retention is safe and no longer crashes
  • added a public set_max_entries() path plus safe zero-cap pruning
  • corrected /history on|off messaging for raw FileHistory backends
  • switched in-memory detection to isinstance(..., InMemoryHistory)
  • parsed quoted config booleans like "false" / "0" correctly
  • initialized paused per instance and reduced pruning overhead by tracking entry counts
  • fixed the Mintlify spellcheck issue (plain text)

Also merged the latest main into this branch and reran formatting, lint, and the full test suite.
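
The quoted-boolean item is a common pitfall: in Python, `bool("false")` is `True` because any non-empty string is truthy. A minimal sketch of a tolerant coercion follows; the `coerce_bool` helper and the accepted spellings are assumptions, not the code from 5cb58c4.

```python
from typing import Any

_TRUE = {"1", "true", "yes", "on"}
_FALSE = {"0", "false", "no", "off"}


def coerce_bool(value: Any, default: bool) -> bool:
    """Interpret YAML/env values, including quoted strings, as booleans."""
    if isinstance(value, bool):
        return value
    if isinstance(value, (int, float)):
        return value != 0
    if isinstance(value, str):
        text = value.strip().lower()
        if text in _TRUE:
            return True
        if text in _FALSE:
            return False
    # Unknown or missing values fall back to the caller's default.
    return default
```

Falling back to the default on unrecognized input keeps a typo in the config file from silently disabling redaction.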


def _build_default_rules() -> tuple[RedactionRule, ...]:
    raw: list[tuple[str, str, str]] = [
        ("aws_key", r"(?:AKIA|ASIA)[A-Z0-9]{16}", "[REDACTED:aws_key]"),

Collaborator

this seems very limited redacted key-rules? we should scan the entire project what other keys are used or patterns which we can add..

Collaborator

@yashksaini-coder take a look


def _build_default_rules() -> tuple[RedactionRule, ...]:
    raw: list[tuple[str, str, str]] = [
        ("aws_key", r"(?:AKIA|ASIA)[A-Z0-9]{16}", "[REDACTED:aws_key]"),

Collaborator

also move this to separate constants file

@muddlebee

Collaborator

@greptileai review

@muddlebee

Collaborator

rest the greptile score needs to be improved to 5/5 and pls add a demo :)

Comment thread app/cli/interactive_shell/history_policy.py
Address Greptile review feedback to lift the score to 5/5:

- P1 #1 (security): the private_key redaction pattern previously matched
  only the BEGIN header line, so pasted multi-line PEM blocks leaked the
  body and END footer to disk. Pattern now spans header through footer
  with a non-greedy body match. Test added that pastes a full RSA, EC,
  and OPENSSH PEM block and asserts both the base64 body and END marker
  are gone after redaction (single-string and history-store paths).

- P1 #2: /history retention 0 now sets the cap by writing
  backend._max_entries directly and only calls _prune_to_cap when n > 0,
  matching the same guard that store_string already uses. Existing
  test_zero_sets_unlimited_without_crashing still passes.

- P1 #3: /history off and /history on already distinguish the three
  backend cases (RedactingFileHistory, plain FileHistory, InMemoryHistory)
  with accurate messages, and dedicated tests cover the FileHistory
  branch for both subcommands.

- P2 #1: backend identity check uses isinstance(backend, InMemoryHistory)
  instead of fragile __class__.__name__ string matching.

- P2 #2: paused is initialized in __init__ with an explicit bool
  annotation rather than as a class-level annotation.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
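
A minimal sketch of the header-through-footer match described in P1 #1. The exact pattern and the `[REDACTED:private_key]` tag are assumptions modeled on the `aws_key` rule; the technique itself (non-greedy body with `re.DOTALL` so `.` crosses newlines) is what the commit message describes.

```python
import re

# Matches "-----BEGIN <TYPE> PRIVATE KEY-----" through the matching END footer.
# The optional words cover RSA, EC, OPENSSH, or no type at all.
PRIVATE_KEY = re.compile(
    r"-----BEGIN (?:[A-Z0-9]+ )*PRIVATE KEY-----"
    r".*?"  # non-greedy body, so two pasted keys are redacted separately
    r"-----END (?:[A-Z0-9]+ )*PRIVATE KEY-----",
    re.DOTALL,  # let "." span the newlines inside the PEM body
)


def redact_private_keys(text: str) -> str:
    """Replace each full PEM private-key block with a single tag."""
    return PRIVATE_KEY.sub("[REDACTED:private_key]", text)
```

Without `re.DOTALL` the body match would stop at the first newline, which is the header-only behavior the fix replaces.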
@yashksaini-coder

Collaborator Author

Addressed all five Greptile inline items in 9145e4c:

  • P1 #1 (security), app/cli/interactive_shell/history_policy.py: private_key redaction pattern now spans the full PEM block (BEGIN header through END footer) with a non-greedy body match, instead of only the BEGIN header line. Added test_full_pem_block_is_redacted_including_body_and_footer, test_openssh_pem_block_is_redacted_end_to_end, and test_pem_block_inside_history_entry_does_not_leak_to_disk to assert the base64 body and END marker no longer leak to disk.
  • P1 #2, privacy_cmds.py::_history_retention: now writes backend._max_entries = n directly and only calls backend._prune_to_cap() when n > 0, mirroring the if self._max_entries > 0 guard already in store_string. Existing test_zero_sets_unlimited_without_crashing continues to pass.
  • P1 #3, privacy_cmds.py::_history_pause: distinguishes the three backend cases (RedactingFileHistory, plain FileHistory, InMemoryHistory/None) with separate branches and accurate messages. Plain-FileHistory branch is covered by test_file_history_backend_reports_runtime_pause_is_unavailable (off) and test_file_history_backend_reports_persistence_already_on (on).
  • P2 #1, privacy_cmds.py::_cmd_privacy: replaced __class__.__name__ == "InMemoryHistory" with isinstance(backend, InMemoryHistory) after from prompt_toolkit.history import InMemoryHistory.
  • P2 #2, RedactingFileHistory.__init__: paused is initialized inside __init__ with an explicit bool annotation (self.paused: bool = False), not as a class-level annotation.

Quality gate green: make lint, make format-check, make typecheck, make test-cov (5452 passed, 5 skipped).
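
The retention guard can be sketched on an in-memory stand-in. The `CappedHistory` class is hypothetical and the real `set_max_entries` signature is not shown in this thread; the sketch only illustrates treating a cap of 0 as unlimited and pruning only when the cap is positive.

```python
class CappedHistory:
    """In-memory stand-in for the retention logic; not the merged backend."""

    def __init__(self, max_entries: int = 5000) -> None:
        self.entries: list[str] = []
        self._max_entries = max_entries

    def append(self, entry: str) -> None:
        self.entries.append(entry)
        if self._max_entries > 0:  # 0 means unlimited: never prune
            self._prune_to_cap()

    def set_max_entries(self, n: int) -> None:
        """Public setter so callers need not touch _max_entries directly."""
        if n < 0:
            raise ValueError("retention cap must be >= 0")
        self._max_entries = n
        if n > 0:
            self._prune_to_cap()

    def _prune_to_cap(self) -> None:
        excess = len(self.entries) - self._max_entries
        if excess > 0:
            del self.entries[:excess]  # drop the oldest entries first
```

Routing the change through one setter keeps the zero-cap guard in a single place instead of repeating it in each command handler.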

Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
@muddlebee

Collaborator

@greptileai review

@muddlebee

Collaborator

@yashksaini-coder did you see my reviews?

yashksaini-coder added a commit to yashksaini-coder/opensre that referenced this pull request May 7, 2026
…le (Tracer-Cloud#1377)

Addresses @muddlebee's review feedback on PR Tracer-Cloud#1387:

> "this seems very limited redacted key-rules? we should scan the
>  entire project what other keys are used or patterns which we can add"
> "also move this to separate constants file"

## Changes

### Split into a dedicated constants module

- New `app/cli/interactive_shell/redaction_rules.py` holds the
  `RedactionRule` dataclass, the `_RAW_RULES` table grouped by family,
  and the `redact_text` function. Patterns are now easy to audit and
  extend without touching the `RedactingFileHistory` plumbing.
- `history_policy.py` re-exports `RedactionRule`,
  `DEFAULT_REDACTION_RULES`, and `redact_text` so existing imports
  in `history.py`, `privacy_cmds.py`, and the test suite keep working
  unchanged.

### Expanded ruleset (12 → 22 patterns)

Scanned the codebase for credentials env-var names (44 distinct names
across services + integrations) and added rules covering:

**New vendor-specific patterns:**
- `google_api_key` — `AIza...` (Google AI / Cloud / Firebase)
- `github_oauth` — `gho_/ghu_/ghs_/ghr_` (OAuth / user-server /
  server-server / refresh, distinct from existing `ghp_`)
- `gitlab_pat` — `glpat-...`
- `openrouter_key` — `sk-or-v1-...` (with `(?!or-v1-)` lookahead in
  the OpenAI rule so it gets its own descriptive tag)
- `huggingface_token` — `hf_...`
- `stripe_restricted` — `rk_(live|test)_...`

**New connection-string pattern:**
- `db_url_creds` — DSN-style URLs with embedded `user:password@`
  for postgres/postgresql/mysql/mariadb/mongodb/mongodb+srv/redis/
  amqp/amqps. Username and host stay visible (debug-friendly), only
  the password between `:` and `@` is replaced.

**New generic env-style fallbacks** (catch the long tail —
Datadog, Coralogix, Argo CD, Grafana, Honeycomb, Bitbucket, Kafka
SASL, etc.):
- `api_key_env` — `<NAME>_API_KEY=value`
- `token_env` — `<NAME>_(TOKEN|AUTH_TOKEN|BEARER_TOKEN|BOT_TOKEN|PUBLIC_KEY)=value`
- `password_env` — `<NAME>_(PASSWORD|APP_PASSWORD|SASL_PASSWORD)=value`
- `secret_env` — `<NAME>_(SECRET|CLIENT_SECRET|WEBHOOK_SECRET|SIGNING_SECRET)=value`

These require UPPERCASE_NAMES with explicit suffixes so they don't
match inside legitimate code or comments (`api_key = config.get(...)`
in Python source is preserved verbatim).

### Tests

- New `tests/cli/interactive_shell/test_redaction_rules.py` with 40
  cases covering each new pattern, the connection-string family, the
  generic env-style fallbacks, and the lowercase-Python negative case.
- All 442 existing `tests/cli/interactive_shell/` tests still pass.
@yashksaini-coder

Collaborator Author

@muddlebee — addressing both review comments in efd66e87. Sorry for the delay.

"this seems very limited redacted key-rules? we should scan the entire project what other keys are used or patterns which we can add"

"also move this to separate constants file"

Done

Moved into a constants module. The patterns now live in app/cli/interactive_shell/redaction_rules.py: RedactionRule, _RAW_RULES, redact_text. history_policy.py re-exports the public surface so history.py, privacy_cmds.py, and the existing tests don't need to change.

Expanded ruleset 12 → 22 patterns. Scanned app/services/ and app/integrations/ and pulled out 44 distinct credential env-vars to inform the new ruleset:

Vendor-specific (high-confidence prefixes):

  • google_api_key: AIza... (Google AI / Cloud / Firebase)
  • github_oauth: gho_/ghu_/ghs_/ghr_ (OAuth/user-server/server-server/refresh; distinct from existing ghp_)
  • gitlab_pat: glpat-...
  • openrouter_key: sk-or-v1-... (with (?!or-v1-) lookahead in OpenAI rule so it gets its own tag)
  • huggingface_token: hf_...
  • stripe_restricted: rk_(live|test)_...

Connection strings:

  • db_url_creds: postgres/postgresql/mysql/mariadb/mongodb/mongodb+srv/redis/amqp/amqps://user:password@host. Username + host stay visible; only the password between : and @ is replaced.
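
A minimal sketch of a password-only DSN redaction, assuming a regex of roughly this shape; the actual `db_url_creds` pattern and its replacement tag are not shown in this thread, so both are illustrative.

```python
import re

# Group 1 keeps the scheme and username; group 2 is the password to replace.
DB_URL_CREDS = re.compile(
    r"\b((?:postgres(?:ql)?|mysql|mariadb|mongodb(?:\+srv)?|redis|amqps?)"
    r"://[^\s:/@]+):([^\s@]+)@"
)


def redact_db_url(text: str) -> str:
    """Replace only the password between ':' and '@'; host stays visible."""
    return DB_URL_CREDS.sub(r"\1:[REDACTED:db_password]@", text)
```

A URL without credentials, such as `redis://cache:6379/0`, contains no `@` and is left unchanged. Passwords that contain a literal `@` are an edge case this sketch does not handle.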

Generic env-style fallbacks (the long tail):

  • api_key_env: <NAME>_API_KEY=value (catches Datadog, Coralogix, Honeycomb, Cursor, …)
  • token_env: <NAME>_(TOKEN|AUTH_TOKEN|BEARER_TOKEN|BOT_TOKEN|PUBLIC_KEY)=value (catches Airflow, Argo CD, Jira, GH MCP, Discord, …)
  • password_env: <NAME>_(PASSWORD|APP_PASSWORD|SASL_PASSWORD)=value (catches Azure SQL, Bitbucket app, Kafka SASL, …)
  • secret_env: <NAME>_(SECRET|CLIENT_SECRET|WEBHOOK_SECRET|SIGNING_SECRET)=value

The generic patterns require UPPERCASE_NAMES with explicit suffixes so they don't match inside legitimate Python source — api_key = config.get("foo") is preserved verbatim. Specific rules run before generic ones so a known shape (e.g. aws_secret_access_key=…) gets its descriptive tag rather than the generic [REDACTED:secret].
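
The ordering and uppercase constraints can be sketched as follows. All three patterns and their tags are assumptions, as is the `(?!\[REDACTED:)` guard that stops the generic rule from overwriting a tag already written by a specific rule; the merged rules may achieve this differently.

```python
import re

# Specific vendor shapes first, generic env-style fallback last.
RULES = [
    ("openrouter_key", re.compile(r"sk-or-v1-[A-Za-z0-9]{16,}"), "[REDACTED:openrouter_key]"),
    # The lookahead keeps OpenRouter keys out of the OpenAI rule.
    ("openai_key", re.compile(r"sk-(?!or-v1-)[A-Za-z0-9_-]{16,}"), "[REDACTED:openai_key]"),
    # UPPERCASE name with an explicit suffix; skips values already redacted.
    (
        "api_key_env",
        re.compile(r"\b([A-Z][A-Z0-9_]*_API_KEY)=(?!\[REDACTED:)(\S+)"),
        r"\1=[REDACTED:api_key]",
    ),
]


def redact(text: str) -> str:
    """Apply the rules in list order so specific tags win over generic ones."""
    for _name, pattern, replacement in RULES:
        text = pattern.sub(replacement, text)
    return text
```

With this ordering, `OPENROUTER_API_KEY=sk-or-v1-...` keeps the vendor tag, an unknown vendor's `<NAME>_API_KEY=value` gets the generic tag, and lowercase source code such as `api_key = config.get("foo")` passes through untouched.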

Tests

  • New tests/cli/interactive_shell/test_redaction_rules.py: 40 cases covering each new pattern, every connection-string family, every generic env-style suffix, ordering guarantees (specific-wins-over-generic, PEM block redacted across full block), and the lowercase-Python negative case.
  • All 442 existing tests/cli/interactive_shell/ tests still pass.
  • Lint, format, typecheck all green.

Re-triggering Greptile on this commit. Demo screenshot to follow.

muddlebee added 2 commits May 8, 2026 10:52
- Keep privacy and agents slash command groups in registry order

- Use main loop + prompt_surface; wire prompt history backend on ReplSession

- Align interactive shell tests with prompt_surface API
Avoid direct _max_entries mutation; matches RedactingFileHistory public API.
@muddlebee

Collaborator

@greptile review

@muddlebee

Collaborator

@yashksaini-coder fixed conflicts and pushed some nits.

@muddlebee muddlebee merged commit f80251c into main May 8, 2026
12 checks passed
@muddlebee muddlebee deleted the feat/1377-history-privacy branch May 8, 2026 05:37
@github-actions

github-actions Bot commented May 8, 2026

Contributor

🚀 Houston, we have a merge. @yashksaini-coder your PR is in orbit. Thanks for launching this one!


👋 Join us on Discord - OpenSRE : hang out, contribute, or hunt for features and issues. Everyone's welcome.

@muddlebee

Collaborator

@yashksaini-coder pls add proper e2e demo, missing in PR checklist

Davidson3556 pushed a commit to Davidson3556/opensre that referenced this pull request May 15, 2026
…rols (Tracer-Cloud#1387)

* feat(cli): redact secrets from interactive history; add /privacy and /history controls

Closes Tracer-Cloud#1377

* fix(cli): resolve history privacy review feedback

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix(cli): restore privacy commands in help

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix(privacy): close greptile gaps on PEM redaction and retention 0

Address Greptile review feedback to lift the score to 5/5:

- P1 #1 (security): the private_key redaction pattern previously matched
  only the BEGIN header line, so pasted multi-line PEM blocks leaked the
  body and END footer to disk. Pattern now spans header through footer
  with a non-greedy body match. Test added that pastes a full RSA, EC,
  and OPENSSH PEM block and asserts both the base64 body and END marker
  are gone after redaction (single-string and history-store paths).

- P1 #2: /history retention 0 now sets the cap by writing
  backend._max_entries directly and only calls _prune_to_cap when n > 0,
  matching the same guard that store_string already uses. Existing
  test_zero_sets_unlimited_without_crashing still passes.

- P1 #3: /history off and /history on already distinguish the three
  backend cases (RedactingFileHistory, plain FileHistory, InMemoryHistory)
  with accurate messages, and dedicated tests cover the FileHistory
  branch for both subcommands.

- P2 #1: backend identity check uses isinstance(backend, InMemoryHistory)
  instead of fragile __class__.__name__ string matching.

- P2 #2: paused is initialized in __init__ with an explicit bool
  annotation rather than as a class-level annotation.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Update app/cli/interactive_shell/history_policy.py

Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>

* fix(cli): use set_max_entries for /history retention

Avoid direct _max_entries mutation; matches RedactingFileHistory public API.

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-authored-by: Anwesh <8139783+muddlebee@users.noreply.github.com>
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
Co-authored-by: muddlebee <anweshknayak@gmail.com>

Development

Successfully merging this pull request may close these issues.

Interactive Shell: harden command history and transcript privacy (redaction + controls)

3 participants