[Bug]: Memory/skills chokidar watcher can exhaust FDs (FSEventWrap) when watched dirs have many files #41606

@JaneyChat

Description

Bug type

Behavior bug (incorrect output/state without crash)

Summary

What happens

When the gateway runs and the workspace has a large number of files under paths watched by the memory and skills chokidar watchers (e.g. memory/ or extraPaths with many session/memory files), the process can open 9,000+ file descriptors (FSEventWrap). That leads to fd exhaustion and spawn EBADF when the exec tool tries to spawn a child process.

How we found it

Using process.getActiveResourcesInfo() (Node 22+) at a point where nfd was high, we saw:

  • FSEventWrap=9231 (and FSReqPromise=2983, etc.), i.e. the event loop was holding thousands of file watchers.
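A minimal sketch of how such a tally can be produced from process.getActiveResourcesInfo(), which returns one type name per active resource. The helper names tallyResources and formatTally are hypothetical and are not taken from the gateway's diagnostic code.

```typescript
// Count how many active resources of each type the event loop is holding.
function tallyResources(names: string[]): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const name of names) {
    counts[name] = (counts[name] ?? 0) + 1;
  }
  return counts;
}

// Render the tally as "Type=count" pairs, largest count first.
function formatTally(counts: Record<string, number>): string {
  return Object.keys(counts)
    .sort((a, b) => counts[b] - counts[a] || a.localeCompare(b))
    .map((name) => `${name}=${counts[name]}`)
    .join(" ");
}

// Looked up via globalThis so the sketch does not depend on Node type definitions.
const proc = (globalThis as any).process;
const active: string[] = proc?.getActiveResourcesInfo?.() ?? [];
console.log(`resources(${active.length}): ${formatTally(tallyResources(active))}`);
```

Logging this line whenever the fd count looks high makes it easy to see which resource type dominates.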

The memory watcher watches memory/, MEMORY.md, memory.md, and extraPaths. The skills watcher watches skills/, the configured skills directory, and any extra directories. By default chokidar recurses without a depth limit, so a large tree under those paths creates one FSEventWrap handle per watched file and exhausts the process's file descriptors.

Suggested fix

Add a depth limit to chokidar in both places so we don’t recurse into huge trees:

  1. memory: src/memory/manager.ts
    In the chokidar.watch(...) call inside ensureWatcher(), add:

    • depth: 2
    • ignored entries for .git, node_modules, and dist (if not already present), to avoid watching dependency and build trees.
  2. skills: src/agents/skills/refresh.ts
    In the chokidar.watch(watchPaths, { ... }) call, add:

    • depth: 2
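The options described in the list above could look roughly like the sketch below. buildWatchOptions, isIgnoredPath, and existingOptions are hypothetical names used for illustration; the actual option objects in manager.ts and refresh.ts are not shown in this issue. depth and ignored are chokidar options, and ignored is written as a function so it does not rely on glob support.

```typescript
// Directory names the issue suggests excluding from the memory watcher.
const IGNORED_DIR_NAMES = new Set<string>([".git", "node_modules", "dist"]);

// True when any path segment is one of the ignored directory names.
function isIgnoredPath(filePath: string): boolean {
  return filePath.split(/[\\/]/).some((segment) => IGNORED_DIR_NAMES.has(segment));
}

// Hypothetical helper: options to merge into the chokidar.watch(paths, options) call.
function buildWatchOptions(maxDepth = 2) {
  return {
    depth: maxDepth,        // limit how many subdirectory levels are traversed
    ignored: isIgnoredPath, // skip dependency and build trees
  };
}

// Usage inside the watcher setup (not executed here; existingOptions is assumed):
//   const watcher = chokidar.watch(watchPaths, { ...existingOptions, ...buildWatchOptions() });
```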

This keeps “watch for changes” behavior for typical shallow memory/skills layouts while avoiding FSEventWrap explosion when those dirs contain many files (e.g. accumulated session/memory data). Related: #1056 (FD exhaustion from file watcher) was fixed with ignore patterns for the skills watcher; the memory watcher still had no depth limit and could trigger the same class of issue when memory/ or extraPaths are large.
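To illustrate which files a depth limit would still cover, here is a small sketch. It assumes depth counts the subdirectory levels traversed below each watched root, which is how the chokidar documentation describes the option. subdirLevels and withinDepth are hypothetical helpers and not part of chokidar.

```typescript
// Number of directory levels between a watched root and a file beneath it.
// Returns Infinity when the path is not under the root.
function subdirLevels(root: string, filePath: string): number {
  const rootParts = root.split("/").filter(Boolean);
  const fileParts = filePath.split("/").filter(Boolean);
  const underRoot = rootParts.every((part, i) => fileParts[i] === part);
  if (!underRoot || fileParts.length <= rootParts.length) return Infinity;
  return fileParts.length - rootParts.length - 1;
}

// True when the file sits no deeper than `depth` subdirectory levels below the root.
function withinDepth(root: string, filePath: string, depth: number): boolean {
  return subdirLevels(root, filePath) <= depth;
}

console.log(withinDepth("memory", "memory/notes.md", 2));          // file directly in the root
console.log(withinDepth("memory", "memory/2026/02/day.md", 2));    // two levels down
console.log(withinDepth("memory", "memory/a/b/c/session.md", 2));  // three levels down
```

A depth limit bounds recursion only; a single flat directory that holds thousands of files is still watched in full, so the ignore patterns or the workaround of pruning files remain relevant in that case.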

Steps to reproduce

  1. Use a workspace that has a large number of files under paths watched by memory/skills (e.g. many files in workspace/memory/ or in extraPaths, or accumulated session/memory data).
  2. Start the gateway (e.g. ahaclaw gateway run).
  3. Connect a client (e.g. Obsidian) and send a message that triggers the agent to run an exec command (e.g. "run ls -la and show me the output").

Expected behavior

Exec tool spawns the child process with pipe stdio and returns command output to the user. Gateway stays within reasonable fd count.

Actual behavior

Process holds 9,000+ file descriptors (FSEventWrap from chokidar). Spawn fails with EBADF; exec either fails or runs with stdio ignored so output is lost. Gateway may also crash on unhandled rejection (e.g. from ciao's NetworkManager spawn) when fd limit is hit.

OpenClaw version

2026.2.6

Operating system

macOS 26.3

Install method

clone + pnpm install

Logs, screenshots, and evidence

Evidence from gateway with AHACLAW_STDIO_DIAGNOSTIC=1:

[stdio-diagnostic] lock-after-mkdir fd0=ok fd1=ok fd2=ok nfd=7845 spawn(2pipe)=ok
[stdio-diagnostic] lock-after-mkdir resources(12689): FSEventWrap=9231 FSReqPromise=2983 Timeout=465 TTYWrap=3 UDPWrap=3 TCPServerWrap=2 TCPSocketWrap=2
[stdio-diagnostic] attempt-after-acquireLock nfd=11661 spawn(2pipe)=fail(null)

Impact and severity

Users with large memory/ or skills trees (e.g. many session files) are affected. Severity: blocks the exec tool and can crash the gateway. Frequency: whenever the watched directories hold several thousand files or more. Consequence: the agent cannot run shell commands or show their output, and the gateway may exit on spawn EBADF.

Additional information

Not a regression. Workaround: reduce files under memory/ and watched paths, or apply the suggested fix (chokidar depth: 2).
