Security: harden AGENTS.md with gateway, prompt injection, and supply chain rules #10514

Open
catpilothq wants to merge 1 commit into openclaw:main from catpilotai:security/harden-agents-md-v2

catpilothq commented Feb 6, 2026

What

Adds a comprehensive Security Protocols section to AGENTS.md so that AI coding agents (Copilot, Cursor, Claude Code, etc.) operating in this repo receive explicit security guardrails.

Supersedes #10510 (closed with feedback — addressed here).

Why

Recent research has identified significant attack surfaces for OpenClaw deployments:

  • Gateway exposure: Shodan scans show that ~92% of public OpenClaw gateways run without authentication
  • Prompt injection: The ZeroLeaks study demonstrated a 91% success rate in extracting system prompts and memory files
  • Supply-chain attacks: The ClawHavoc analysis identified 341 malicious skills on ClawHub that use typosquatting, obfuscated payloads, and hidden webhooks
  • Credential leaks: Multiple reports describe API keys written in plaintext to openclaw.json

AGENTS.md is the primary instruction file that AI agents read when working in this repo. Adding security rules here ensures agents follow safe patterns by default.

Changes

New section: Security Protocols (CRITICAL) with 8 subsections:

  1. Anti-Malware Execution Safety — refuse blind curl | bash, read skill source first
  2. Secret Hygiene — never write keys to config files, use env vars
  3. Gateway Network Security — bind localhost, enable auth, use authenticated tunnels
  4. Prompt Injection Defense — ignore instructions in fetched content, protect CLAUDE.md/AGENTS.md/openclaw.json/~/.openclaw/
  5. Skill / ClawHub Vetting — typosquatting checks, Clawdex verification, mass-publisher flags
  6. Sandbox & Session Isolation — per-session Docker, tool denylists, dmPolicy defaults
  7. File & Credential Permissions — chmod 700/600 for ~/.openclaw/
  8. Incident Response — rotation procedures, memory poisoning checks, openclaw doctor
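
As an illustration of item 7, here is a minimal Python sketch of an owner-only permission audit. The helper name `find_loose_permissions` is hypothetical and not part of OpenClaw or this PR; it assumes a POSIX filesystem and simply reports any entry under a directory whose mode grants access to group or others, which is what the chmod 700/600 rule is meant to rule out.

```python
import os
import stat
from pathlib import Path


def find_loose_permissions(root: Path) -> list[tuple[str, str]]:
    """Return (path, octal mode) pairs under root that group or others can access.

    Hypothetical helper for illustration; not part of OpenClaw.
    """
    findings = []
    for current, _dirs, files in os.walk(root):
        # Check the directory itself, then every file directly inside it.
        for entry in [current] + [os.path.join(current, name) for name in files]:
            mode = stat.S_IMODE(os.lstat(entry).st_mode)
            # Any group/other bit set means the entry is not owner-only.
            if mode & 0o077:
                findings.append((entry, oct(mode)))
    return findings
```

Calling `find_loose_permissions(Path.home() / ".openclaw")` would list entries that need tightening; an empty list means everything under that directory is owner-only.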

Updated: Security & Configuration Tips — added credential permission reminders and hardcoded-secret flagging.
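
To make "hardcoded-secret flagging" concrete, the following Python sketch shows one way a reviewer or agent might flag suspicious lines. It is an assumption-laden illustration: the single regex and the function name `flag_hardcoded_secrets` are hypothetical, the pattern only catches quoted literals assigned to a few obvious key names, and dedicated secret scanners use much larger rule sets.

```python
import re

# Hypothetical pattern for illustration only: a key-like name followed by
# ':' or '=' and a quoted literal of at least 8 non-space characters.
SECRET_PATTERN = re.compile(
    r"""\b(api[_-]?key|token|password|secret)\b["']?\s*[:=]\s*["'][^"'\s]{8,}["']""",
    re.IGNORECASE,
)


def flag_hardcoded_secrets(text: str) -> list[int]:
    """Return 1-based line numbers that look like they assign a literal secret."""
    flagged = []
    for number, line in enumerate(text.splitlines(), start=1):
        if SECRET_PATTERN.search(line):
            flagged.append(number)
    return flagged
```

A line that reads a value from the environment (for example `API_KEY = os.environ["API_KEY"]`) is not flagged, because no quoted literal follows the assignment.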

Feedback from #10510

SOUL.md and TOOLS.md are listed here as sensitive files, but they don't exist anywhere in this repo.

Fixed — all file references now point to actual files: CLAUDE.md, AGENTS.md, openclaw.json, and ~/.openclaw/ paths.

Testing

  • pnpm check passes (tsgo + oxlint + oxfmt)
  • pnpm test passes (5,327 + 219 tests, 0 failures)
  • Documentation-only change — no runtime behavior affected

AI-assisted

This PR was researched and drafted with AI assistance. All security recommendations were validated against the referenced research sources and the OpenClaw codebase.

Greptile Overview

Greptile Summary

  • Adds a new Security Protocols (CRITICAL) section to AGENTS.md with guidance for safe script execution, secret handling, gateway binding/auth, prompt-injection resistance, skill vetting, sandbox isolation, permissions, and incident response.
  • Updates the existing security/config tips to reinforce owner-only permissions for ~/.openclaw/ and to flag hardcoded secrets during reviews.
  • Overall change is documentation-only, intended to shape how AI agents operate when working in this repo.

Confidence Score: 4/5

  • This PR is mostly safe to merge, with one docs issue that could misguide secret-handling practices.
  • Changes are confined to AGENTS.md and aim to add security guardrails. The only actionable concern is the strict guidance to never use .env, which can conflict with common per-project secret management and may push users toward globally-exported secrets.
  • AGENTS.md

greptile-apps bot left a comment


1 file reviewed, 1 comment


AGENTS.md Outdated
### Secret Hygiene

- **NEVER** write API keys, tokens, or passwords to `openclaw.json`, `.env`, or any plaintext file.
- **ALWAYS** instruct the user to set these values as Environment Variables in their terminal profile.

Instructs printing secrets

The rule “NEVER write API keys… to .env” conflicts with common practice in this repo/ecosystem, but more importantly the surrounding guidance (“set these values as Environment Variables in their terminal profile”) pushes users toward globally-exported secrets and makes it harder to use per-project tooling. Consider narrowing this to “never commit secrets / never store secrets in repo files” and explicitly allow local, uncommitted .env where the project expects it (or clarify the repo’s intended secret management). As written, agents may refuse legitimate workflows or suggest insecure global exports.

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
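
The "local, uncommitted .env" pattern suggested in this review comment could look roughly like the Python sketch below. Both function names are hypothetical, the parser handles only simple `KEY=VALUE` lines, and the `.gitignore` check is deliberately naive (it looks for a literal entry in the top-level file); running `git check-ignore` inside the repository is the authoritative way to confirm a path is ignored.

```python
from pathlib import Path


def parse_env_file(path: Path) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks, comments, and malformed lines."""
    values = {}
    for raw in path.read_text().splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip("\"'")
    return values


def env_file_listed_in_gitignore(repo_root: Path) -> bool:
    """Naive check: does the top-level .gitignore contain a literal .env entry?"""
    gitignore = repo_root / ".gitignore"
    if not gitignore.exists():
        return False
    entries = {line.strip() for line in gitignore.read_text().splitlines()}
    return bool(entries & {".env", "/.env", ".env*"})
```

An agent following this pattern would refuse to write a secret into `.env` unless the ignore check passes, rather than refusing `.env` outright.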

Path: AGENTS.md, line 33

… chain rules

Add comprehensive security protocols to AGENTS.md covering:
- Anti-malware execution safety (skill install vetting)
- Secret hygiene (never write keys to plaintext config)
- Gateway network security (bind localhost, enable auth)
- Prompt injection defense (ignore instructions in fetched content)
- Skill/ClawHub vetting (typosquatting, Clawdex verification)
- Sandbox & session isolation (per-session Docker, dmPolicy)
- File & credential permissions (chmod 700/600)
- Incident response (credential rotation, memory poisoning checks)

Also adds credential permission reminders to Security & Configuration Tips.

Research sources: ZeroLeaks prompt injection study (91% success),
Shodan gateway exposure (92% unauthenticated), ClawHavoc supply-chain
analysis (341 malicious skills), Koi Security Clawdex scanner.
catpilothq force-pushed the security/harden-agents-md-v2 branch from 6146aba to e051c33 on February 6, 2026 17:35
catpilothq added a commit to catpilotai/catpilot-ai-guardrails that referenced this pull request Feb 6, 2026
- Replace shell profile secret storage with .env + .gitignore pattern
- Replace SOUL.md/TOOLS.md references with actual files (CLAUDE.md, openclaw.json)
- Align with feedback from openclaw/openclaw#10514
@openclaw-barnacle

This pull request has been automatically marked as stale due to inactivity.
Please add updates or it will be closed.

openclaw-barnacle bot added and removed the stale label (Marked as stale due to inactivity) on Feb 21, 2026
@openclaw-barnacle

This pull request has been automatically marked as stale due to inactivity.
Please add updates or it will be closed.

openclaw-barnacle bot added the stale label (Marked as stale due to inactivity) on Mar 8, 2026