Skip to content

fix: guard fcntl.flock(LOCK_UN) against OSError in finally blocks#23819

Closed
vanthinh6886 wants to merge 1 commit into
NousResearch:mainfrom
vanthinh6886:fix/flock-guard-unguarded-unlock
Closed

fix: guard fcntl.flock(LOCK_UN) against OSError in finally blocks#23819
vanthinh6886 wants to merge 1 commit into
NousResearch:mainfrom
vanthinh6886:fix/flock-guard-unguarded-unlock

Conversation

@vanthinh6886

Copy link
Copy Markdown
Contributor

Problem

When fcntl.flock(fd, LOCK_UN) raises OSError in a finally block, the subsequent cleanup code (e.g., fd.close()) is skipped. This causes:

  • File descriptor leak (fd never closed)
  • Lock held indefinitely (LOCK_UN never applied)
  • Potential deadlock for subsequent lock attempts

The Windows msvcrt path was already guarded with try/except in all affected files, but the POSIX fcntl path was not.

Affected Files

File Function Risk
agent/shell_hooks.py _allowlist_editor() Lock leak on allowlist edit
tools/skill_usage.py _file_lock() Lock leak on skill usage tracking
tools/memory_tool.py _file_lock() Lock leak on memory operations
tools/environments/file_sync.py sync_back() Lock leak + fd leak (close skipped)
hermes_cli/auth.py _credential_file_lock() Lock leak on credential access

The most dangerous case is file_sync.py where lock_fd.close() follows the unlock — if the unlock raises, the fd leaks and the lock is held forever.

Fix

Wrap each fcntl.flock(fd, LOCK_UN) in try: ... except OSError: pass, matching the defensive pattern already used for the msvcrt path in the same files.

# Before
finally:
    fcntl.flock(fd, fcntl.LOCK_UN)
    fd.close()

# After
finally:
    try:
        fcntl.flock(fd, fcntl.LOCK_UN)
    except OSError:
        pass
    fd.close()

Tests

  • Verified syntax with ast.parse() on all 5 files
  • Each fix is minimal (3 lines added per file)
  • No behavioral change when flock succeeds normally

Fixes #21719

When fcntl.flock(fd, LOCK_UN) raises OSError in a finally block,
the subsequent cleanup code (e.g., fd.close()) is skipped, causing
the file descriptor to leak and the lock to be held indefinitely.

This pattern existed in 5 files:
- agent/shell_hooks.py: _allowlist_editor()
- tools/skill_usage.py: _file_lock()
- tools/memory_tool.py: _file_lock()
- tools/environments/file_sync.py: sync_back()
- hermes_cli/auth.py: _credential_file_lock()

The Windows msvcrt path was already guarded with try/except in all
these files, but the POSIX fcntl path was not.

Fix: wrap each fcntl.flock(LOCK_UN) call in try/except OSError: pass,
matching the defensive pattern already used for the msvcrt path.

Fixes NousResearch#21719
@daimon-nous daimon-nous Bot added type/bug Something isn't working P3 Low — cosmetic, nice to have comp/agent Core agent loop, run_agent.py, prompt builder tool/memory Memory tool and memory providers duplicate This issue or pull request already exists labels May 11, 2026
@daimon-nous

daimon-nous Bot commented May 11, 2026

Copy link
Copy Markdown
Contributor

Duplicate of #16274 — same fcntl.flock(LOCK_UN) guard fix across the same files. Also overlaps with #21719 and #20529.

@teknium1

Copy link
Copy Markdown
Contributor

This appears to be implemented on current main, so this PR can be closed as already fixed. This is an automated hermes-sweeper review.

Evidence:

  • The exact requested POSIX unlock guard is present in all five PR files: agent/shell_hooks.py:635, tools/skill_usage.py:112, tools/memory_tool.py:233, tools/environments/file_sync.py:293, and hermes_cli/auth.py:1011.
  • The most important cleanup case from the PR body is covered: tools/environments/file_sync.py:293 catches OSError/IOError from fcntl.flock(..., LOCK_UN) and still reaches lock_fd.close() at line 297.
  • The implementation landed in 62573f44cfee8895eec2cb18e7c45b9bff97081a (fix: guard yaml.safe_load, flock unlock, TOCTOU races, and atomic writes), whose commit message explicitly lists the same fcntl unlock fix across these files.
  • That commit is contained in release tag v2026.5.28.

Thanks for the focused cleanup here; the prior duplicate note on #16274/#21719/#20529 matches what main now contains.

@teknium1 teknium1 closed this Jun 11, 2026
@teknium1 teknium1 added the sweeper:implemented-on-main Sweeper: behavior already present on current main label Jun 11, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

comp/agent Core agent loop, run_agent.py, prompt builder duplicate This issue or pull request already exists P3 Low — cosmetic, nice to have sweeper:implemented-on-main Sweeper: behavior already present on current main tool/memory Memory tool and memory providers type/bug Something isn't working

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants