Skip to content

fix: guard fcntl.flock(LOCK_UN) in finally blocks against OSError#21719

Closed
vominh1919 wants to merge 1 commit into
NousResearch:mainfrom
vominh1919:fix/flock-guard-finally-blocks
Closed

fix: guard fcntl.flock(LOCK_UN) in finally blocks against OSError#21719
vominh1919 wants to merge 1 commit into
NousResearch:mainfrom
vominh1919:fix/flock-guard-finally-blocks

Conversation

@vominh1919

Copy link
Copy Markdown
Contributor

Problem

Six files across the codebase call fcntl.flock(fd, LOCK_UN) inside finally blocks without wrapping it in try/except OSError. If the unlock raises (e.g. the file descriptor was already closed by the OS after a crash, or NFS issues), the exception propagates and skips the subsequent fd.close() — leaving the lock held forever.

The Windows msvcrt equivalent was already guarded with try/except (OSError, IOError): pass in all six locations. This fix brings the POSIX fcntl path to parity.

Fix

Wrap each fcntl.flock(fd, LOCK_UN) in try/except OSError: pass inside the finally blocks:

File Line Impact
tools/skill_usage.py 89 Skill usage tracking lock
tools/memory_tool.py 172 Memory tool file lock
tools/environments/file_sync.py 292 Environment sync lock — followed by lock_fd.close()
agent/shell_hooks.py 627 Shell hooks allowlist lock
hermes_cli/auth.py 917 Auth credential lock
cron/scheduler.py 1730 Scheduler tick lock — followed by lock_fd.close()

Pattern

# Before (POSIX path unguarded)
finally:
    if fcntl:
        fcntl.flock(fd, fcntl.LOCK_UN)    # raises → close() skipped
    elif msvcrt:
        try:
            msvcrt.locking(...)
        except (OSError, IOError):
            pass
    fd.close()

# After (both paths guarded)
finally:
    if fcntl:
        try:
            fcntl.flock(fd, fcntl.LOCK_UN)
        except OSError:
            pass
    elif msvcrt:
        try:
            msvcrt.locking(...)
        except (OSError, IOError):
            pass
    fd.close()

Tests

Each fix is a 3-line addition (try:, except OSError:, pass) that cannot break existing behavior — it only prevents an exception from propagating. All 6 files pass syntax validation.

If fcntl.flock(fd, LOCK_UN) raises OSError in a finally block, the
subsequent fd.close() is skipped and the lock is held forever. This
affected 6 files across tools/, agent/, hermes_cli/, and cron/.

The Windows msvcrt equivalent was already guarded with try/except in
all locations — this fix brings the POSIX fcntl path to parity.

Files fixed:
- tools/skill_usage.py
- tools/memory_tool.py
- tools/environments/file_sync.py
- agent/shell_hooks.py
- hermes_cli/auth.py
- cron/scheduler.py
@alt-glitch alt-glitch added type/bug Something isn't working P3 Low — cosmetic, nice to have comp/agent Core agent loop, run_agent.py, prompt builder tool/memory Memory tool and memory providers labels May 8, 2026
@alt-glitch

Copy link
Copy Markdown
Collaborator

Duplicate of #16274 (same fix, same files) and #20529. PR #16274 covers 5 files and was opened earlier; this PR adds skill_usage.py as a 6th file but is otherwise identical.

@vanthinh6886

Copy link
Copy Markdown
Contributor

Submitted a fix: #23819

Fixed 5 unguarded fcntl.flock(LOCK_UN) sites across agent/shell_hooks.py, tools/skill_usage.py, tools/memory_tool.py, tools/environments/file_sync.py, and hermes_cli/auth.py. Each is now wrapped in try/except OSError: pass matching the existing msvcrt guard pattern.

@teknium1

Copy link
Copy Markdown
Contributor

Automated hermes-sweeper review: this fix is already present on current main.

Evidence:

  • 62573f44cfee8895eec2cb18e7c45b9bff97081a landed the same guard pattern for the six files named in this PR: cron/scheduler.py, hermes_cli/auth.py, agent/shell_hooks.py, tools/skill_usage.py, tools/environments/file_sync.py, and tools/memory_tool.py.
  • agent/shell_hooks.py:635, cron/scheduler.py:2199, hermes_cli/auth.py:1011, tools/environments/file_sync.py:293, tools/memory_tool.py:233, and tools/skill_usage.py:112 now wrap the POSIX fcntl.flock(..., LOCK_UN) call in try/except (OSError, IOError): pass.
  • That exception tuple is a superset of the requested except OSError guard, and it preserves the intended follow-up cleanup such as lock_fd.close().
  • The implementation commit is contained in release tag v2026.5.28.

Thanks for the report and patch; closing this PR because the requested change has already landed on main.

@teknium1 teknium1 closed this Jun 11, 2026
@teknium1 teknium1 added the sweeper:implemented-on-main Sweeper: behavior already present on current main label Jun 11, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

comp/agent Core agent loop, run_agent.py, prompt builder P3 Low — cosmetic, nice to have sweeper:implemented-on-main Sweeper: behavior already present on current main tool/memory Memory tool and memory providers type/bug Something isn't working

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants