Skip to content

feat(gateway): add audit logging for gateway tool calls#63557

Open
HOYALIM wants to merge 12 commits intoopenclaw:mainfrom
HOYALIM:feat/gateway-tool-call-audit-log
Open

feat(gateway): add audit logging for gateway tool calls#63557
HOYALIM wants to merge 12 commits intoopenclaw:mainfrom
HOYALIM:feat/gateway-tool-call-audit-log

Conversation

@HOYALIM
Copy link
Copy Markdown
Contributor

@HOYALIM HOYALIM commented Apr 9, 2026

Summary

Describe the problem and fix in 2–5 bullets:

  • Problem: gateway-originated tool calls did not have a centralized, append-only audit trail across direct /tools/invoke requests and OpenResponses ingress runs.
  • Why it matters: operators could not reliably audit tool execution from cron/isolated/gateway-originated paths, and manual log inspection could miss what tool ran with what effective arguments.
  • What changed: added a dedicated gateway-tool-audit.jsonl helper with tool-argument redaction, threaded gateway audit context through ingress → embedded tool execution, and wired both /tools/invoke and /v1/responses into the same centralized JSONL audit sink.
  • What did NOT change (scope boundary): no new query UI, rotation policy, HMAC signing, or broader analytics/reporting surface was added in this PR.

Change Type (select all)

  • Bug fix
  • Feature
  • Refactor required for the fix
  • Docs
  • Security hardening
  • Chore/infra

Scope (select all touched areas)

  • Gateway / orchestration
  • Skills / tool execution
  • Auth / tokens
  • Memory / storage
  • Integrations
  • API / contracts
  • UI / DX
  • CI/CD / infra

Linked Issue/PR

Root Cause (if applicable)

For bug fixes or regressions, explain why this happened, not just what changed. Otherwise write N/A. If the cause is unclear, write Unknown.

  • Root cause: gateway ingress surfaces had no shared audit sink for tool execution, so direct /tools/invoke and OpenResponses-originated tool calls lacked a centralized append-only record.
  • Missing detection / guardrail: focused tests covered ingress behavior, but there was no branch-local assertion that gateway-originated tool calls produced a real JSONL audit record with redacted args.
  • Contributing context (if known): tool execution already flowed through shared wrapper logic, but gateway-only audit requirements were not modeled explicitly in hook context.

Regression Test Plan (if applicable)

For bug fixes or regressions, name the smallest reliable test coverage that should catch this. Otherwise write N/A.

  • Coverage level that should have caught this:
    • Unit test
    • Seam / integration test
    • End-to-end test
    • Existing coverage already sufficient
  • Target test or file:
    • src/gateway/tool-audit.test.ts
    • src/gateway/tools-invoke-http.test.ts
    • src/gateway/openresponses-http.test.ts
  • Scenario the test should lock in: gateway-originated tool calls write a centralized JSONL audit record with redacted args, and both ingress surfaces propagate the audit context correctly.
  • Why this is the smallest reliable guardrail: the behavior is split across one helper and two gateway ingress seams; these tests cover the storage helper, direct /tools/invoke ingress, and OpenResponses ingress without needing broad runtime E2E.
  • Existing test that already covers this (if any): none for the new JSONL audit sink before this PR.
  • If no new test is added, why not: N/A

User-visible / Behavior Changes

Gateway-originated tool calls now append structured audit records to ~/.openclaw/logs/gateway-tool-audit.jsonl with redacted tool args and metadata including surface, session, channel, model, runId, and toolCallId.

Diagram (if applicable)

For UI changes or non-trivial logic flows, include a small ASCII diagram reviewers can scan quickly. Otherwise write N/A.

Before:
[gateway ingress] -> [tool execution] -> [no centralized tool-call audit file]

After:
[gateway ingress] -> [gateway audit context] -> [shared hook] -> [gateway-tool-audit.jsonl] -> [tool execution]

Security Impact (required)

  • New permissions/capabilities? (No)
  • Secrets/tokens handling changed? (Yes)
  • New/changed network calls? (No)
  • Command/tool execution surface changed? (No)
  • Data access scope changed? (Yes)
  • If any Yes, explain risk + mitigation:
    • Tool-call args are now persisted to a new gateway audit JSONL file.
    • Mitigation: args are redacted via existing logging.redactSensitive="tools" behavior before write, and records are written under the existing state/logs path with explicit file-mode restrictions.

Repro + Verification

Environment

  • OS: macOS (local dev)
  • Runtime/container: Node 22 + pnpm
  • Model/provider: OpenResponses model: openclaw; direct /tools/invoke has no model
  • Integration/channel (if any): gateway /tools/invoke and /v1/responses
  • Relevant config (redacted): default logging redaction (logging.redactSensitive="tools")

Steps

  1. Trigger a direct gateway /tools/invoke request for an allowed tool.
  2. Trigger an OpenResponses request that reaches gateway-originated tool execution.
  3. Inspect ~/.openclaw/logs/gateway-tool-audit.jsonl.

Expected

  • Each gateway-originated tool call appends one JSONL record.
  • Record includes surface, tool, session, channel, model, runId, and toolCallId when available.
  • Sensitive tool args are redacted before writing.

Actual

  • Targeted tests and manual JSONL inspection confirmed both ingress paths now propagate audit context, and the emitted JSONL record redacts secret-like values.

Evidence

Attach at least one:

  • Failing test/log before + passing after
  • Trace/log snippets
  • Screenshot/recording
  • Perf numbers (if relevant)

Trace/log snippets:

  • {"ts":"2026-04-08T20:00:00.000Z","source":"gateway","event":"tool.call","surface":"tools-invoke","tool":"exec","args":{"command":"OPENAI_API_KEY=sk-sec…7890"},"session":"agent:main:main","channel":"discord","model":null,"runId":"run-1","toolCallId":"call-1"}

Human Verification (required)

What you personally verified (not just CI), and how:

  • Verified scenarios:
    • pnpm test -- src/gateway/tool-audit.test.ts src/gateway/tools-invoke-http.test.ts src/gateway/openresponses-http.test.ts
    • manual JSONL append/read check showing redacted tool args in the emitted audit record
  • Edge cases checked:
    • direct /tools/invoke path actually emits an audit record (not just context propagation)
    • OpenResponses ingress carries requested model into audit metadata
    • secret-like values inside tool args are redacted before file write
  • What you did not verify:
    • full pnpm check / pnpm build as a branch-local clean gate, because latest origin/main still has unrelated pre-existing failures in extensions/msteams and src/agents/skills*

Review Conversations

  • I replied to or resolved every bot review conversation I addressed in this PR.
  • I left unresolved only the conversations that still need reviewer or maintainer judgment.

If a bot review conversation is addressed by this PR, resolve that conversation yourself. Do not leave bot review conversation cleanup for maintainers.

Compatibility / Migration

  • Backward compatible? (Yes)
  • Config/env changes? (No)
  • Migration needed? (No)
  • If yes, exact upgrade steps:

Risks and Mitigations

List only real risks for this PR. Add/remove entries as needed. If none, write None.

  • Risk: audit file growth over time because this PR adds append-only JSONL writes without rotation.
    • Mitigation: scope is intentionally minimal for the initial audit trail; follow-up retention/query work stays separate.
  • Risk: tool args may contain organization-specific secrets not covered by default patterns.
    • Mitigation: existing logging.redactPatterns configuration is honored by the shared redaction helper.

Copilot AI review requested due to automatic review settings April 9, 2026 05:10
@openclaw-barnacle openclaw-barnacle Bot added gateway Gateway runtime agents Agent runtime and tooling size: M labels Apr 9, 2026
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds a centralized, append-only gateway audit trail for tool execution that works across both direct POST /tools/invoke and OpenResponses (POST /v1/responses) ingress, with tool-argument redaction and propagation through embedded tool execution.

Changes:

  • Introduces a gateway-tool-audit.jsonl writer + redaction helper and unit tests.
  • Wires /tools/invoke to append an audit record per tool call.
  • Threads gatewayToolAudit context from OpenResponses ingress into embedded runner/tool execution so wrapped tools can emit audit records.

Reviewed changes

Copilot reviewed 12 out of 12 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
src/gateway/tools-invoke-http.ts Appends a gateway tool-call audit record for direct /tools/invoke executions.
src/gateway/tools-invoke-http.test.ts Mocks/assersts audit append behavior for /tools/invoke.
src/gateway/tool-audit.ts New audit JSONL path resolution, argument sanitization/redaction, and append helper.
src/gateway/tool-audit.test.ts Verifies JSONL write location and that secrets are redacted.
src/gateway/openresponses-http.ts Propagates gateway audit context (incl. effective model) into agent command opts.
src/gateway/openresponses-http.test.ts Asserts OpenResponses ingress sets gatewayToolAudit fields.
src/agents/pi-tools.ts Threads gatewayToolAudit option into tool hook context construction.
src/agents/pi-tools.before-tool-call.ts Emits audit records from the tool wrapper when gatewayToolAudit context is present.
src/agents/pi-embedded-runner/run/params.ts Adds gatewayToolAudit to embedded runner params type.
src/agents/pi-embedded-runner/run/attempt.ts Forwards gatewayToolAudit into embedded attempt execution.
src/agents/command/types.ts Adds gatewayToolAudit to AgentCommandOpts.
src/agents/command/attempt-execution.ts Forwards gatewayToolAudit from command opts into embedded runner invocation.

Comment thread src/gateway/tool-audit.ts
Comment thread src/gateway/tool-audit.ts
Comment thread src/gateway/tools-invoke-http.ts
Comment thread src/agents/pi-tools.before-tool-call.ts
@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented Apr 9, 2026

Greptile Summary

Adds a centralized append-only JSONL audit sink (gateway-tool-audit.jsonl) for gateway-originated tool calls, threading audit context through both the /tools/invoke direct path and the OpenResponses ingress. Redaction is applied via the existing logging.redactSensitive="tools" helper before any write. All remaining findings are P2 style/hardening suggestions and do not block merge.

Confidence Score: 5/5

Safe to merge; all findings are P2 style/hardening suggestions with no correctness or data-loss impact.

Core audit logic (redaction, record shape, path resolution, error swallowing) is correct and consistent across both ingress surfaces. Four P2 findings: file-mode enforcement gap for pre-existing files, non-serializable-args fallback losing structure, duplicated context object in tools-invoke-http.ts, and a misleading test assertion due to a passthrough mock. None affect correctness or produce wrong audit records under normal conditions.

src/gateway/tool-audit.ts (file-mode and args-fallback), src/gateway/tools-invoke-http.ts (context duplication)

Vulnerabilities

  • Tool args containing secrets are redacted before persisting via the shared redactSensitiveText helper — correct.
  • The mode: 0o600 on fs.appendFile and mode: 0o700 on fs.mkdir are silently ignored when the file/directory already exists (src/gateway/tool-audit.ts lines 77–78). The stated mitigation ("explicit file-mode restrictions") only holds for the initial creation; a fs.chmod call after append would close the gap.
  • Blocked/denied tool calls are not emitted to the audit log in either ingress path, which limits post-incident reconstruction of denied attempts.
Prompt To Fix All With AI
This is a comment left during a code review.
Path: src/gateway/tool-audit.ts
Line: 77-78

Comment:
**File mode not enforced on pre-existing audit file**

`fs.appendFile`'s `mode` option and `fs.mkdir`'s `mode` option are both ignored when the target already exists. On first run the file and directory get the intended `0o600`/`0o700` permissions, but on subsequent runs (or if the file was pre-created with wider permissions) those modes are silently skipped. The PR description cites "explicit file-mode restrictions" as a security mitigation, but that guarantee only holds for the initial creation.

Consider adding a `fs.chmod` call after creating the file so permissions are enforced on every run, not just the first:

```suggestion
  await fs.mkdir(path.dirname(auditPath), { recursive: true, mode: 0o700 });
  await fs.appendFile(auditPath, `${JSON.stringify(params.record)}\n`, { encoding: "utf8", mode: 0o600 });
  await fs.chmod(auditPath, 0o600).catch(() => undefined);
```

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: src/gateway/tool-audit.ts
Line: 38-45

Comment:
**Fallback for non-serializable args loses all structured data**

When `JSON.stringify(args)` throws (e.g., circular references or `BigInt` values in tool args), the catch branch returns `redactSensitiveText(String(args), ...)`. For any plain object, `String(args)` produces `"[object Object]"`, discarding all arg detail. A sentinel value is more informative:

```suggestion
export function sanitizeGatewayToolAuditArgs(args: unknown): unknown {
  try {
    const raw = JSON.stringify(args ?? null);
    const redacted = redactSensitiveText(raw, { mode: "tools" });
    return JSON.parse(redacted) as unknown;
  } catch {
    // Non-serializable args (circular refs, BigInt, etc.) — return a safe sentinel
    return { _unserializable: true };
  }
}
```

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: src/gateway/tools-invoke-http.ts
Line: 292-306

Comment:
**Audit context object duplicated between hook call and record creation**

The same `{ surface, sessionKey, messageChannel, model }` object is built twice — once inside `runBeforeToolCallHook`'s `ctx.gatewayToolAudit` (lines 277–282) and again here inside `createGatewayToolAuditRecord`. If either copy is updated without the other, the hook context and the persisted record drift. Extract to a shared local variable.

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: src/gateway/tools-invoke-http.test.ts
Line: 422-432

Comment:
**Test assertion checks a `ctx` field that doesn't exist on the real `GatewayToolAuditRecord`**

`createGatewayToolAuditRecord` is mocked as a passthrough (`vi.fn((params) => params)`), so the "record" seen by `appendGatewayToolAuditRecord` still carries a `ctx` field. In production, `createGatewayToolAuditRecord` flattens `ctx` into `surface`, `session`, `channel`, `model` — no `ctx` key exists on the persisted record. The data-flow check is valid, but consider asserting on the flattened fields or adding a comment explaining the passthrough mock.

How can I resolve this? If you propose a fix, please make it concise.

Reviews (1): Last reviewed commit: "fix(gateway): audit direct tools-invoke ..." | Re-trigger Greptile

Comment thread src/gateway/tool-audit.ts Outdated
Comment thread src/gateway/tool-audit.ts
Comment thread src/gateway/tools-invoke-http.ts
Comment thread src/gateway/tools-invoke-http.test.ts
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 6d58001930

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/gateway/tool-audit.ts Outdated
HOYALIM and others added 5 commits April 9, 2026 14:36
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
@HOYALIM HOYALIM force-pushed the feat/gateway-tool-call-audit-log branch from 6d58001 to 0cebea1 Compare April 9, 2026 21:50
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: a7aec1d9e3

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/agents/command/attempt-execution.ts Outdated
@openclaw-barnacle openclaw-barnacle Bot added the extensions: memory-core Extension: memory-core label Apr 9, 2026
@openclaw-barnacle openclaw-barnacle Bot removed the extensions: memory-core Extension: memory-core label Apr 11, 2026
@vincentkoc vincentkoc added the triage: dirty-candidate Candidate: broad unrelated surfaces; may need splitting or cleanup. label Apr 26, 2026
@clawsweeper
Copy link
Copy Markdown
Contributor

clawsweeper Bot commented Apr 30, 2026

Codex review: found issues before merge.

Summary
The PR adds a gateway tool-call audit JSONL helper, threads audit context through /tools/invoke and OpenResponses-originated agent runs, and updates focused gateway plus shared guardrail tests.

Reproducibility: yes. The skipped-audit behavior is reproducible by static diff inspection: both proposed append sites are after blocked-call exits, and current main shows those denied branches return before execution.

Next step before merge
Security-sensitive audit behavior and stale current-main routing make maintainer/author branch updates safer than an automated replacement fix PR.

Security
Needs attention: The patch adds a security-sensitive audit log but omits denied gateway tool attempts from that log.

Review findings

  • [P2] Audit denied direct gateway tool calls — src/gateway/tools-invoke-http.ts:294
  • [P2] Audit denied embedded gateway tool calls — src/agents/pi-tools.before-tool-call.ts:440
  • [P3] Add the required changelog entry — CHANGELOG.md:7
Review details

Best possible solution:

Land a gateway-owned audit implementation on the current shared tool invocation and hook paths that records allowed and denied attempts with redacted args, status/reason metadata, focused tests, and a changelog entry.

Do we have a high-confidence way to reproduce the issue?

Yes. The skipped-audit behavior is reproducible by static diff inspection: both proposed append sites are after blocked-call exits, and current main shows those denied branches return before execution.

Is this the best way to solve the issue?

No. The direction is right, but the implementation should audit denied outcomes before returning and should be retargeted to current tools-invoke-shared.ts rather than the older direct HTTP-only path.

Full review comments:

  • [P2] Audit denied direct gateway tool calls — src/gateway/tools-invoke-http.ts:294
    The new /tools/invoke audit append runs only after hookResult.blocked returns. A before-tool hook denial, approval refusal, or loop block currently responds with 403 and no gateway-tool-audit.jsonl row, leaving high-value denied attempts out of the audit trail. Write a redacted denied record before returning.
    Confidence: 0.89
  • [P2] Audit denied embedded gateway tool calls — src/agents/pi-tools.before-tool-call.ts:440
    appendGatewayToolAuditEntry is inserted after the wrapped agent tool's blocked branch. OpenResponses-originated tool calls vetoed by plugin policy or approval handling return or throw before the new audit helper runs, so record the denied attempt and reason inside the blocked path.
    Confidence: 0.86
  • [P3] Add the required changelog entry — CHANGELOG.md:7
    This PR adds user-facing gateway audit logging and security behavior but does not update CHANGELOG.md. Repo policy requires user-facing feat, fix, and perf changes to have an Unreleased entry.
    Confidence: 0.91

Overall correctness: patch is incorrect
Overall confidence: 0.85

Security concerns:

  • [medium] Denied direct tool attempts are not audited — src/gateway/tools-invoke-http.ts:294
    The direct /tools/invoke audit write is placed after the blocked-call return, so policy, approval, or loop denials are missing from the audit trail needed for incident reconstruction.
    Confidence: 0.89
  • [medium] Denied embedded tool attempts are not audited — src/agents/pi-tools.before-tool-call.ts:440
    The embedded agent-tool audit write is also after the blocked branch, so OpenResponses-originated denied tool attempts are absent from the proposed JSONL audit log.
    Confidence: 0.86

Acceptance criteria:

  • pnpm test -- src/gateway/tool-audit.test.ts src/gateway/tools-invoke-http.test.ts src/gateway/openresponses-http.test.ts src/agents/pi-tools.before-tool-call.embedded-mode.test.ts
  • pnpm check:changed

What I checked:

Likely related people:

  • steipete: Current-main blame and history point to Peter Steinberger as the dominant maintainer across the gateway tool-invoke, OpenResponses, and before-tool-call hook surfaces, including prior /tools/invoke hardening. (role: recent maintainer; confidence: high; commits: 767fd9f222fd, 06110de6f61e; files: src/gateway/tools-invoke-http.ts, src/gateway/tools-invoke-shared.ts, src/gateway/openresponses-http.ts)
  • ai-hpc: Authored merged feat(gateway): add SDK-facing tools.invoke RPC #74804, which introduced the current shared tools.invoke RPC/helper path that this PR now needs to target when rebased. (role: adjacent owner; confidence: medium; commits: 6b04521d6a41; files: src/gateway/tools-invoke-shared.ts, src/gateway/server-methods/tools-invoke.ts, packages/sdk/src/client.ts)
  • vincentkoc: Recent gateway test and regression-harness work touches the same /tools/invoke guardrail area and shared test stability context. (role: recent maintainer; confidence: medium; commits: 688327311c5e; files: src/gateway/tools-invoke-http.test.ts)
  • Vignesh Natarajan: Introduced the original /tools/invoke HTTP endpoint that the PR is extending with audit logging. (role: introduced behavior; confidence: medium; commits: f1083cd52cf4; files: src/gateway/tools-invoke-http.ts)

Remaining risk / open question:

  • The branch is stale against current main's shared invokeGatewayTool path, so the audit implementation needs a rebase and likely retargeting before merge.
  • The required user-facing changelog entry is still missing.

Codex review notes: model gpt-5.5, reasoning high; reviewed against 5b2a0fbac198.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

agents Agent runtime and tooling extensions: memory-wiki gateway Gateway runtime size: M triage: dirty-candidate Candidate: broad unrelated surfaces; may need splitting or cleanup.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Feature: Gateway-level audit logging for all tool calls

3 participants