File descriptor leak in workspace scanning on every message #11181

@justinreeves00

Bug Report

Summary

Every inbound message triggers the gateway to open every file in ~/.openclaw/workspace/skills/ (recursively, including venv/, .venv/, __pycache__/) as read-only file descriptors that are never closed. After a single message, the FD count jumps from ~53 to thousands, and after a few messages the process hits spawn EBADF — all exec tool calls fail because Node.js can no longer allocate file descriptors for child processes.

Environment

  • OpenClaw version: 2026.2.3-1
  • OS: macOS 26.3 (arm64)
  • Node: 25.5.0
  • Install method: Homebrew (/opt/homebrew/bin/openclaw)
  • Gateway mode: local, LaunchAgent

Reproduction

  1. Have a workspace with skills containing Python venvs or other large dependency trees (~6,000+ files)
  2. Start gateway fresh — lsof -p <pid> | wc -l shows ~53
  3. Send one message via any channel (WhatsApp, Telegram, etc.)
  4. Check again — FD count jumps to ~2,500+ (proportional to workspace file count)
  5. FDs never decrease — confirmed stable over 60+ seconds, not transient processing
  6. After several messages, exec tool fails with spawn EBADF

Diagnostic Evidence

FD count timeline (single gateway restart cycle):

Boot:           53 FDs
After 1 msg:  2,471 FDs  (never returns to baseline)
After 2 msgs: 5,712 FDs

Leaked FD breakdown (lsof):

15,210 REG (regular files)
     7 IPv4
     4 PIPE
     4 DIR
     3 KQUEUE
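
A breakdown like the one above can be reproduced by tallying the TYPE column of `lsof -p <pid>` output. The sketch below assumes lsof's default column layout (COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME) and a command name without whitespace; `countByType` is a hypothetical helper name, not something from the OpenClaw codebase.

```typescript
// Tally lsof rows by the TYPE column (REG, DIR, PIPE, ...).
// Assumption: default `lsof -p <pid>` layout, where TYPE is the 5th
// whitespace-separated column and the command name contains no spaces.
function countByType(lsofOutput: string): Map<string, number> {
  const counts = new Map<string, number>();
  for (const line of lsofOutput.split("\n")) {
    const cols = line.trim().split(/\s+/);
    if (cols.length < 5) continue; // blank or malformed row
    const type = cols[4];
    if (type === undefined) continue;
    if (cols[0] === "COMMAND" && type === "TYPE") continue; // header row
    counts.set(type, (counts.get(type) ?? 0) + 1);
  }
  return counts;
}
```

Feeding it the captured output before and after a message makes the growth in REG entries easy to compare between snapshots.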

All leaked FDs are read-only (r suffix) and sequential:

node  PID claw  101r  REG  .../workspace/skills/web-search-plus/FAQ.md
node  PID claw  102r  REG  .../workspace/skills/web-search-plus/README.md
node  PID claw  103r  REG  .../workspace/skills/web-search-plus/SKILL.md
...
node  PID claw 2441r  REG  .../workspace/skills/browser-autologin/.venv/...

File distribution of leaked FDs:

Category                         Leaked FDs
skills/*/venv/ and .venv/             2,282
skills/* (actual skill files)           126
Non-workspace (libs, config)             38

Workspace file count vs leak:

Folder           Files   Notes
skills/          5,645   Main leak source
projects/        1,656
research/          940
node_modules/      546
Other             ~275

Removing large venvs reduces the per-message leak proportionally — confirming the leak scales with file count in the workspace.

What We Ruled Out

  • chokidar FSWatcher (ensureSkillsWatcher / ensureWatcher): Uses fsevents on macOS, which doesn't open per-file FDs. Also, the watcher's ignored patterns skip .git/node_modules/dist but NOT venv/.venv/__pycache__ — worth fixing regardless.
  • loadSkillsFromDir (pi-coding-agent): Uses readdirSync/readFileSync which are synchronous and auto-close. Also skips node_modules but not venv.
  • syncSkillsToWorkspace: Only runs for sandboxed sessions (not applicable here).
  • Security skill scanner: Only scans .js/.ts files, not .py/.so/etc.
  • Memory index sync: Only scans workspace/memory/ subdirectory, not workspace/skills/.

Likely Area

Something in the message processing pipeline opens every file in the workspace skills directory tree read-only and doesn't close the handles. The sequential FD numbers (101, 102, 103...) suggest a single traversal that opens files in directory order. This is NOT readFileSync (which auto-closes) — it's likely an fs.open(), createReadStream(), or similar that returns a handle without closing it.
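
To illustrate the kind of pattern suspected here, the following is a minimal sketch contrasting a leaky open with one that closes its descriptor. Both functions are hypothetical (`peekLeaky`, `peekSafe`); the actual offending code path has not been identified, so this shows the class of bug, not the bug itself.

```typescript
import * as fs from "node:fs";

// Leaky pattern (hypothetical): the descriptor returned by openSync is
// never passed to closeSync, so it stays open until the process exits.
function peekLeaky(path: string, bytes: number): string {
  const fd = fs.openSync(path, "r");
  const buf = Buffer.alloc(bytes);
  const n = fs.readSync(fd, buf, 0, bytes, 0);
  return buf.toString("utf8", 0, n); // fd is never closed
}

// Safe pattern: close in `finally` so the descriptor is released even
// if the read throws.
function peekSafe(path: string, bytes: number): string {
  const fd = fs.openSync(path, "r");
  try {
    const buf = Buffer.alloc(bytes);
    const n = fs.readSync(fd, buf, 0, bytes, 0);
    return buf.toString("utf8", 0, n);
  } finally {
    fs.closeSync(fd);
  }
}
```

Calling something like `peekLeaky` once per file during a directory traversal would produce the sequential, read-only descriptors seen in the lsof output. The same applies to `fs.promises.open()` handles that are never closed, or to read streams that are created but never consumed or destroyed.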

Workaround

Reduce file count in ~/.openclaw/workspace/skills/ by removing Python venvs and other dependency trees. Adding venv/, .venv/, and __pycache__/ to the workspace .gitignore does not help since the scanner doesn't appear to respect .gitignore.

Suggested Fixes

  1. Find and fix the unclosed file handles in the workspace scanning code path triggered during message processing
  2. Add venv/, .venv/, __pycache__/ to DEFAULT_SKILLS_WATCH_IGNORED in src/agents/skills/refresh.ts (prevents chokidar from watching them even if fsevents isn't available)
  3. Add memory manager cleanup to src/gateway/server-close.ts — the INDEX_CACHE and QMD_MANAGER_CACHE are never closed during gateway shutdown
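
For fix 2, a path-segment predicate is one way to express the extra ignores. This is a sketch only: the real shape of DEFAULT_SKILLS_WATCH_IGNORED is not shown in this report, so `IGNORED_DIR_NAMES` and `isIgnoredPath` are hypothetical names, and the list of directories is an assumption based on the patterns mentioned above.

```typescript
// Hypothetical set of directory names to skip. The first three are the
// ones reported as already ignored; the last three are the proposed additions.
const IGNORED_DIR_NAMES = new Set([
  ".git",
  "node_modules",
  "dist",
  "venv",
  ".venv",
  "__pycache__",
]);

// True when any segment of the path is an ignored directory name.
// Matching whole segments avoids false positives such as "venv-tools".
function isIgnoredPath(p: string): boolean {
  return p.split(/[\\/]+/).some((segment) => IGNORED_DIR_NAMES.has(segment));
}
```

chokidar's `ignored` option accepts a predicate function among other matcher forms, so a function like this could be passed there. The same check could also guard whichever traversal turns out to be opening the files.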

Labels: bug
