Skip to content

feat(file-sync): sync remote changes back to host on teardown#8018

Closed
alt-glitch wants to merge 3 commits into
mainfrom
feat/file-sync-back
Closed

feat(file-sync): sync remote changes back to host on teardown#8018
alt-glitch wants to merge 3 commits into
mainfrom
feat/file-sync-back

Conversation

@alt-glitch

Copy link
Copy Markdown
Collaborator

Summary

Implements Phase 2 of the bulk file sync spec (bulk-sync-spec.md). On sandbox teardown, FileSyncManager now downloads the remote .hermes/ directory, diffs against SHA-256 hashes of what was originally pushed, and applies only changed files back to the host.

Depends on #8014 (bulk upload salvage — cherry-picked as base).

Changes

Core (tools/environments/file_sync.py)

  • BulkDownloadFn callback type: Callable[[Path], None] — backend writes a tar archive to the given path
  • _pushed_hashes: SHA-256 content hashes computed on upload, stored alongside mtime+size
  • sync_back(): orchestrates download → unpack → diff → apply with:
    • Retry with exponential backoff (3 attempts, 2s/4s/8s)
    • SIGINT trap + defer (prevents partial writes on Ctrl-C)
    • fcntl.flock on .hermes/.sync.lock (serializes concurrent gateway sandboxes)
    • Last-write-wins conflict resolution with warning when host file also changed
    • New remote files pulled back via _infer_host_path prefix matching

Backends

  • SSH: _ssh_bulk_downloadtar cf - -C ~/.hermes . piped over SSH to local file
  • Modal: _modal_bulk_downloadexec tar cf -proc.stdout.read → write to file
  • Daytona: _daytona_bulk_downloadexec tar cf on remote → SDK download_file()
  • All three wire bulk_download_fn into FileSyncManager and call sync_back() at the top of cleanup() (before ControlMaster close / sandbox terminate / sandbox stop)

Tests

  • 10 FSM core tests: noop, no-changes, apply-changed, new-remote-file, conflict-warning, retry, exhausted-retries, hash-on-push, hash-cleared-on-delete, file-lock
  • 18 backend tests: SSH/Modal/Daytona download mechanics, cleanup ordering, wiring verification

72 total tests passing.

Test plan

  • 10 FSM sync-back core tests pass
  • 18 backend download + wiring tests pass
  • 44 existing bulk upload + file sync tests still pass (no regressions)
  • tar.extractall uses filter="data" (Python 3.14 compat)

🤖 Generated with Claude Code

@github-actions

Copy link
Copy Markdown
Contributor

⚠️ Supply Chain Risk Detected

This PR contains patterns commonly associated with supply chain attacks. This does not mean the PR is malicious — but these patterns require careful human review before merging.

⚠️ WARNING: exec() or eval() usage

Dynamic code execution can hide malicious behavior, especially when combined with base64 or network fetches.

Matches (first 20):

830:+        self._sandbox.process.exec(

Automated scan triggered by supply-chain-audit. If this is a false positive, a maintainer can approve after manual review.

Base automatically changed from hermes/salvage-7558 to main April 12, 2026 00:48
@alt-glitch

Copy link
Copy Markdown
Collaborator Author

@BugBot review

@alt-glitch alt-glitch changed the base branch from main to hermes/salvage-7558 April 12, 2026 00:55
Add sync_back() to FileSyncManager — on sandbox cleanup, downloads
the remote .hermes/ directory as a tar archive, diffs against SHA-256
hashes of what was originally pushed, and applies only changed files.

- SHA-256 content hashing on push for accurate change detection
- Retry with exponential backoff (3 attempts, 2s/4s/8s)
- SIGINT deferred during sync-back to prevent partial writes
- fcntl.flock serialization for concurrent gateway sandboxes
- Last-write-wins conflict resolution with logged warnings
- New files created on remote are pulled back via path inference
- Backend implementations: SSH (tar cf over pipe), Modal (exec tar
  cf, read stdout), Daytona (exec tar cf, SDK download_file)
- Wired into cleanup() for all three backends (runs before
  ControlMaster close / sandbox terminate / sandbox stop)

28 new tests (10 FSM core + 18 backend-specific), 72 total passing.
@alt-glitch alt-glitch force-pushed the feat/file-sync-back branch from 7bb3a26 to 37c478c Compare April 12, 2026 00:56
@alt-glitch alt-glitch changed the base branch from hermes/salvage-7558 to main April 12, 2026 00:56
@github-actions

Copy link
Copy Markdown
Contributor

⚠️ Supply Chain Risk Detected

This PR contains patterns commonly associated with supply chain attacks. This does not mean the PR is malicious — but these patterns require careful human review before merging.

⚠️ WARNING: exec() or eval() usage

Dynamic code execution can hide malicious behavior, especially when combined with base64 or network fetches.

Matches (first 20):

2151:+        self._sandbox.process.exec(

Automated scan triggered by supply-chain-audit. If this is a false positive, a maintainer can approve after manual review.

Comment thread tools/environments/file_sync.py
Comment thread tools/environments/file_sync.py
Comment thread tools/environments/file_sync.py Outdated
Comment thread tools/environments/file_sync.py Outdated
1. Tar paths now match _pushed_hashes keys — backends tar from /
   so entries have full absolute paths (e.g. root/.hermes/skills/f.py)
   instead of relative ./skills/f.py that never matched hash lookups
2. _infer_host_path simplified — removed broken grandparent match
   that computed garbled suffixes for new remote files
3. Lock path uses get_hermes_home() instead of Path.home() — fixes
   wrong lock path when HERMES_HOME is overridden or using profiles
4. SIGINT trap guarded by threading.current_thread() check — skips
   signal.signal() on non-main threads (gateway workers) instead of
   crashing with ValueError on every retry attempt
@github-actions

Copy link
Copy Markdown
Contributor

⚠️ Supply Chain Risk Detected

This PR contains patterns commonly associated with supply chain attacks. This does not mean the PR is malicious — but these patterns require careful human review before merging.

⚠️ WARNING: exec() or eval() usage

Dynamic code execution can hide malicious behavior, especially when combined with base64 or network fetches.

Matches (first 20):

831:+        self._sandbox.process.exec(

Automated scan triggered by supply-chain-audit. If this is a false positive, a maintainer can approve after manual review.

@alt-glitch

Copy link
Copy Markdown
Collaborator Author

@BugBot review

Comment thread tools/environments/file_sync.py
Snapshot _pushed_hashes alongside _synced_files before the try block
so both are restored atomically on failure. Previously a mid-sync
exception (e.g. host file deleted between upload and hash) would leave
_pushed_hashes partially updated while _synced_files rolled back,
causing sync_back() to make wrong change-detection decisions.
@github-actions

Copy link
Copy Markdown
Contributor

⚠️ Supply Chain Risk Detected

This PR contains patterns commonly associated with supply chain attacks. This does not mean the PR is malicious — but these patterns require careful human review before merging.

⚠️ WARNING: exec() or eval() usage

Dynamic code execution can hide malicious behavior, especially when combined with base64 or network fetches.

Matches (first 20):

831:+        self._sandbox.process.exec(

Automated scan triggered by supply-chain-audit. If this is a false positive, a maintainer can approve after manual review.

@alt-glitch

Copy link
Copy Markdown
Collaborator Author

@BugBot review

@cursor cursor Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 2 potential issues.

Fix All in Cursor

❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.

Reviewed by Cursor Bugbot for commit 6aba50f. Configure here.

self._sandbox.process.exec(
f"tar cf /tmp/.hermes_sync.tar -C / {shlex.quote(rel_base)}"
)
self._sandbox.fs.download_file("/tmp/.hermes_sync.tar", str(dest))

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Daytona bulk download ignores tar command failure

Medium Severity

_daytona_bulk_download doesn't check the exit code of the process.exec tar command, unlike the SSH and Modal backends which both verify the return code and raise RuntimeError on failure. If the tar command fails (e.g., .hermes/ doesn't exist or permission denied), the code proceeds to download_file, which could download a stale /tmp/.hermes_sync.tar left over from a previous run on a persistent sandbox — silently applying outdated file contents back to the host.

Fix in Cursor Fix in Web

Reviewed by Cursor Bugbot for commit 6aba50f. Configure here.

Comment thread tools/environments/ssh.py
def cleanup(self):
if self._sync_manager:
logger.info("SSH: syncing files from sandbox...")
self._sync_manager.sync_back()

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unguarded sync_back() in cleanup skips resource teardown

Medium Severity

The sync_back() call in all three backends' cleanup() methods is not wrapped in a try-except. While sync_back() catches exceptions internally in its retry loop, code paths before the loop (e.g., lock_path.parent.mkdir) and BaseException subclasses like KeyboardInterrupt (re-raised from the deferred SIGINT mechanism via os.kill) can propagate uncaught, causing the rest of cleanup to be skipped entirely — leaking SSH control sockets, Modal sandboxes, or Daytona sandboxes.

Additional Locations (2)
Fix in Cursor Fix in Web

Reviewed by Cursor Bugbot for commit 6aba50f. Configure here.

@kshitijk4poor kshitijk4poor left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR Review — #8018: feat(file-sync): sync remote changes back to host on teardown

Verdict: Request Changes
Tests: 44 passed, 0 failed (3 skipped — pre-existing SSH-marked perf tests)

Critical

  • file_sync.py:9import fcntl is unconditional. This module does not exist on Windows and will crash at import time. The codebase already has the correct pattern in cron/scheduler.py:21 and hermes_cli/auth.py:46: try: import fcntl / except ImportError: fcntl = None with msvcrt fallback. _sync_back_locked should skip file locking (or use msvcrt.locking) when fcntl is unavailable.

Warnings

  • file_sync.py:275assert self._bulk_download_fn is not None is stripped by python -O. This is a runtime code path, not a dev-only assertion. The caller already guards (if self._bulk_download_fn is None: return), so this assert is defense-in-depth — but should be a proper if check or raise RuntimeError(...).

  • file_sync.py:273-357_resolve_host_path and _infer_host_path each call self._get_files_fn() independently, which iterates credentials+skills+cache every call. In _sync_back_impl, this is called once per remote file (potentially twice if resolve fails and infer is tried). O(n*m) where n=synced files, m=changed files. Easy fix: cache the mapping list once at the start of _sync_back_impl and pass it to both methods.

  • ssh.py, modal.py, daytona.pyBulkDownloadFn is imported but never used as a type annotation in any backend file. Dead import. (Consistent with the pre-existing BulkUploadFn pattern, so low priority — but would be nice to either use the types or remove the imports.)

  • modal.py:356"tar cf - -C / root/.hermes" hardcodes the path. SSH and Daytona dynamically construct the path from self._remote_home. Modal mirrors the existing hardcoded /root/.hermes on line 269, so it is internally consistent — but worth a comment explaining the assumption (Modal sandboxes always run as root).

Suggestions

  • file_sync.py:291_sha256_file(staged_file) is called for every extracted file including ones where pushed_hash is None (new remote files). For new files the hash comparison is always False. Minor optimization: skip hashing when pushed_hash is None.

  • daytona.py:168-170 — The remote temp file /tmp/.hermes_sync.tar is created on the sandbox but never cleaned up. If sync_back() retries or the sandbox persists, stale tar files accumulate. Consider adding rm -f /tmp/.hermes_sync.tar after download.

  • Test coverage gaps (MEDIUM): SIGINT deferral logic (main thread vs worker thread paths) is not tested. _infer_host_path edge cases (no matching prefix, partial prefix that shouldn't match) — only one positive case tested. Daytona download_file failure propagation not tested.

Looks Good

  • Clean architecture: sync_back -> retry -> SIGINT protection -> file lock -> download -> diff -> apply. Good separation of concerns across the 4 method layers.
  • _pushed_hashes lifecycle is correct: populated on upload, cleared on delete, rolled back on failure.
  • tarfile.open with filter="data" — proper security against path traversal.
  • threading.current_thread() is threading.main_thread() check before signal.signal() — correct handling for gateway worker threads.
  • Cleanup ordering is well-tested: all 3 backends verify sync_back runs before sandbox teardown (control_exit, terminate, stop).
  • Tests are well-structured with clear helpers, good coverage of the happy path and error paths (retry, exhausted retries, conflict).
  • Consistent with the existing bulk upload pattern — download is a clean mirror.

kshitijk4poor added a commit to kshitijk4poor/hermes-agent that referenced this pull request Apr 12, 2026
Salvage of PR NousResearch#8018 by @alt-glitch onto current main.

On sandbox teardown, FileSyncManager now downloads the remote .hermes/
directory, diffs against SHA-256 hashes of what was originally pushed,
and applies only changed files back to the host.

Core (tools/environments/file_sync.py):
- sync_back(): orchestrates download -> unpack -> diff -> apply with:
  - Retry with exponential backoff (3 attempts, 2s/4s/8s)
  - SIGINT trap + defer (prevents partial writes on Ctrl-C)
  - fcntl.flock serialization (concurrent gateway sandboxes)
  - Last-write-wins conflict resolution with warning
  - New remote files pulled back via _infer_host_path prefix matching

Backends:
- SSH: _ssh_bulk_download — tar cf - piped over SSH
- Modal: _modal_bulk_download — exec tar cf - -> proc.stdout.read
- Daytona: _daytona_bulk_download — exec tar cf -> SDK download_file
- All three call sync_back() at the top of cleanup()

Fixes applied during salvage (vs original PR NousResearch#8018):

| # | Issue | Fix |
|---|-------|-----|
| C1 | import fcntl unconditional — crashes Windows | try/except with fallback; _sync_back_locked skips locking when fcntl=None |
| W1 | assert for runtime guard (stripped by -O) | Replaced with proper if/raise RuntimeError |
| W2 | O(n*m) from _get_files_fn() called per file | Cache mapping once at start of _sync_back_impl, pass to resolve/infer |
| W3 | Dead BulkDownloadFn imports in 3 backends | Removed unused imports |
| W4 | Modal hardcodes root/.hermes, no explanation | Added docstring comment explaining Modal always runs as root |
| S1 | SHA-256 computed for new files where pushed_hash=None | Skip hashing when pushed_hash is None (comparison always False) |
| S2 | Daytona /tmp/.hermes_sync.tar never cleaned up | Added rm -f after download (best-effort) |

Tests: 49 passing (17 new: _infer_host_path edge cases, SIGINT
main/worker thread, Windows fcntl=None fallback, Daytona tar cleanup).

Based on NousResearch#8018 by @alt-glitch.
teknium1 pushed a commit that referenced this pull request Apr 17, 2026
Salvage of PR #8018 by @alt-glitch onto current main.

On sandbox teardown, FileSyncManager now downloads the remote .hermes/
directory, diffs against SHA-256 hashes of what was originally pushed,
and applies only changed files back to the host.

Core (tools/environments/file_sync.py):
- sync_back(): orchestrates download -> unpack -> diff -> apply with:
  - Retry with exponential backoff (3 attempts, 2s/4s/8s)
  - SIGINT trap + defer (prevents partial writes on Ctrl-C)
  - fcntl.flock serialization (concurrent gateway sandboxes)
  - Last-write-wins conflict resolution with warning
  - New remote files pulled back via _infer_host_path prefix matching

Backends:
- SSH: _ssh_bulk_download — tar cf - piped over SSH
- Modal: _modal_bulk_download — exec tar cf - -> proc.stdout.read
- Daytona: _daytona_bulk_download — exec tar cf -> SDK download_file
- All three call sync_back() at the top of cleanup()

Fixes applied during salvage (vs original PR #8018):

| # | Issue | Fix |
|---|-------|-----|
| C1 | import fcntl unconditional — crashes Windows | try/except with fallback; _sync_back_locked skips locking when fcntl=None |
| W1 | assert for runtime guard (stripped by -O) | Replaced with proper if/raise RuntimeError |
| W2 | O(n*m) from _get_files_fn() called per file | Cache mapping once at start of _sync_back_impl, pass to resolve/infer |
| W3 | Dead BulkDownloadFn imports in 3 backends | Removed unused imports |
| W4 | Modal hardcodes root/.hermes, no explanation | Added docstring comment explaining Modal always runs as root |
| S1 | SHA-256 computed for new files where pushed_hash=None | Skip hashing when pushed_hash is None (comparison always False) |
| S2 | Daytona /tmp/.hermes_sync.tar never cleaned up | Added rm -f after download (best-effort) |

Tests: 49 passing (17 new: _infer_host_path edge cases, SIGINT
main/worker thread, Windows fcntl=None fallback, Daytona tar cleanup).

Based on #8018 by @alt-glitch.
teknium1 added a commit that referenced this pull request Apr 17, 2026
Follow-ups on top of kshitijk4poor's cherry-picked salvage of PR #8018:

tools/environments/daytona.py
  - PID-suffix /tmp/.hermes_sync.<pid>.tar so concurrent sync_back calls
    against the same sandbox don't collide on the remote temp path
  - Move sync_back() inside the cleanup lock and after the _sandbox-None
    guard, with its own try/except. Previously a no-op cleanup (sandbox
    already cleared) still fired sync_back → 3-attempt retry storm against
    a nil sandbox (~6s of sleep). Now short-circuits cleanly.

tools/environments/file_sync.py
  - Add _SYNC_BACK_MAX_BYTES (2 GiB) defensive cap: refuse to extract a
    tar larger than the limit. Protects against runaway sandboxes
    producing arbitrary-size archives.
  - Add 'nothing previously pushed' guard at the top of sync_back(). If
    _pushed_hashes and _synced_files are both empty, the FileSyncManager
    was never initialized from the host side — there is nothing coherent
    to sync back. Skips the retry/backoff machinery on uninitialized
    managers and eliminates test-suite slowdown from pre-existing cleanup
    tests that don't mock the sync layer.

tests/tools/test_file_sync_back.py
  - Update _make_manager helper to seed a _pushed_hashes entry by default
    so sync_back() exercises its real path. A seed_pushed_state=False
    opt-out is available for noop-path tests.
  - Add TestSyncBackSizeCap with positive and negative coverage of the
    new cap.

tests/tools/test_sync_back_backends.py
  - Update Daytona bulk download test to assert the PID-suffixed path
    pattern instead of the fixed /tmp/.hermes_sync.tar.
teknium1 pushed a commit that referenced this pull request Apr 17, 2026
Salvage of PR #8018 by @alt-glitch onto current main.

On sandbox teardown, FileSyncManager now downloads the remote .hermes/
directory, diffs against SHA-256 hashes of what was originally pushed,
and applies only changed files back to the host.

Core (tools/environments/file_sync.py):
- sync_back(): orchestrates download -> unpack -> diff -> apply with:
  - Retry with exponential backoff (3 attempts, 2s/4s/8s)
  - SIGINT trap + defer (prevents partial writes on Ctrl-C)
  - fcntl.flock serialization (concurrent gateway sandboxes)
  - Last-write-wins conflict resolution with warning
  - New remote files pulled back via _infer_host_path prefix matching

Backends:
- SSH: _ssh_bulk_download — tar cf - piped over SSH
- Modal: _modal_bulk_download — exec tar cf - -> proc.stdout.read
- Daytona: _daytona_bulk_download — exec tar cf -> SDK download_file
- All three call sync_back() at the top of cleanup()

Fixes applied during salvage (vs original PR #8018):

| # | Issue | Fix |
|---|-------|-----|
| C1 | import fcntl unconditional — crashes Windows | try/except with fallback; _sync_back_locked skips locking when fcntl=None |
| W1 | assert for runtime guard (stripped by -O) | Replaced with proper if/raise RuntimeError |
| W2 | O(n*m) from _get_files_fn() called per file | Cache mapping once at start of _sync_back_impl, pass to resolve/infer |
| W3 | Dead BulkDownloadFn imports in 3 backends | Removed unused imports |
| W4 | Modal hardcodes root/.hermes, no explanation | Added docstring comment explaining Modal always runs as root |
| S1 | SHA-256 computed for new files where pushed_hash=None | Skip hashing when pushed_hash is None (comparison always False) |
| S2 | Daytona /tmp/.hermes_sync.tar never cleaned up | Added rm -f after download (best-effort) |

Tests: 49 passing (17 new: _infer_host_path edge cases, SIGINT
main/worker thread, Windows fcntl=None fallback, Daytona tar cleanup).

Based on #8018 by @alt-glitch.
teknium1 added a commit that referenced this pull request Apr 17, 2026
Follow-ups on top of kshitijk4poor's cherry-picked salvage of PR #8018:

tools/environments/daytona.py
  - PID-suffix /tmp/.hermes_sync.<pid>.tar so concurrent sync_back calls
    against the same sandbox don't collide on the remote temp path
  - Move sync_back() inside the cleanup lock and after the _sandbox-None
    guard, with its own try/except. Previously a no-op cleanup (sandbox
    already cleared) still fired sync_back → 3-attempt retry storm against
    a nil sandbox (~6s of sleep). Now short-circuits cleanly.

tools/environments/file_sync.py
  - Add _SYNC_BACK_MAX_BYTES (2 GiB) defensive cap: refuse to extract a
    tar larger than the limit. Protects against runaway sandboxes
    producing arbitrary-size archives.
  - Add 'nothing previously pushed' guard at the top of sync_back(). If
    _pushed_hashes and _synced_files are both empty, the FileSyncManager
    was never initialized from the host side — there is nothing coherent
    to sync back. Skips the retry/backoff machinery on uninitialized
    managers and eliminates test-suite slowdown from pre-existing cleanup
    tests that don't mock the sync layer.

tests/tools/test_file_sync_back.py
  - Update _make_manager helper to seed a _pushed_hashes entry by default
    so sync_back() exercises its real path. A seed_pushed_state=False
    opt-out is available for noop-path tests.
  - Add TestSyncBackSizeCap with positive and negative coverage of the
    new cap.

tests/tools/test_sync_back_backends.py
  - Update Daytona bulk download test to assert the PID-suffixed path
    pattern instead of the fixed /tmp/.hermes_sync.tar.
@teknium1

Copy link
Copy Markdown
Contributor

Merged via #11291. Your original implementation made it onto main through @kshitijk4poor's salvage PR (#8189) — they cherry-picked your work, addressed the first review round, and we then cherry-picked that onto current main with further hardening. Thanks for the clean architecture (retry + SIGINT defer + flock + hash diff) — it held up through two rebase rounds untouched.

@teknium1 teknium1 closed this Apr 17, 2026
ulasbilgen pushed a commit to ulasbilgen/hermes-adhd-agent that referenced this pull request May 1, 2026
Salvage of PR NousResearch#8018 by @alt-glitch onto current main.

On sandbox teardown, FileSyncManager now downloads the remote .hermes/
directory, diffs against SHA-256 hashes of what was originally pushed,
and applies only changed files back to the host.

Core (tools/environments/file_sync.py):
- sync_back(): orchestrates download -> unpack -> diff -> apply with:
  - Retry with exponential backoff (3 attempts, 2s/4s/8s)
  - SIGINT trap + defer (prevents partial writes on Ctrl-C)
  - fcntl.flock serialization (concurrent gateway sandboxes)
  - Last-write-wins conflict resolution with warning
  - New remote files pulled back via _infer_host_path prefix matching

Backends:
- SSH: _ssh_bulk_download — tar cf - piped over SSH
- Modal: _modal_bulk_download — exec tar cf - -> proc.stdout.read
- Daytona: _daytona_bulk_download — exec tar cf -> SDK download_file
- All three call sync_back() at the top of cleanup()

Fixes applied during salvage (vs original PR NousResearch#8018):

| # | Issue | Fix |
|---|-------|-----|
| C1 | import fcntl unconditional — crashes Windows | try/except with fallback; _sync_back_locked skips locking when fcntl=None |
| W1 | assert for runtime guard (stripped by -O) | Replaced with proper if/raise RuntimeError |
| W2 | O(n*m) from _get_files_fn() called per file | Cache mapping once at start of _sync_back_impl, pass to resolve/infer |
| W3 | Dead BulkDownloadFn imports in 3 backends | Removed unused imports |
| W4 | Modal hardcodes root/.hermes, no explanation | Added docstring comment explaining Modal always runs as root |
| S1 | SHA-256 computed for new files where pushed_hash=None | Skip hashing when pushed_hash is None (comparison always False) |
| S2 | Daytona /tmp/.hermes_sync.tar never cleaned up | Added rm -f after download (best-effort) |

Tests: 49 passing (17 new: _infer_host_path edge cases, SIGINT
main/worker thread, Windows fcntl=None fallback, Daytona tar cleanup).

Based on NousResearch#8018 by @alt-glitch.
ulasbilgen pushed a commit to ulasbilgen/hermes-adhd-agent that referenced this pull request May 1, 2026
Follow-ups on top of kshitijk4poor's cherry-picked salvage of PR NousResearch#8018:

tools/environments/daytona.py
  - PID-suffix /tmp/.hermes_sync.<pid>.tar so concurrent sync_back calls
    against the same sandbox don't collide on the remote temp path
  - Move sync_back() inside the cleanup lock and after the _sandbox-None
    guard, with its own try/except. Previously a no-op cleanup (sandbox
    already cleared) still fired sync_back → 3-attempt retry storm against
    a nil sandbox (~6s of sleep). Now short-circuits cleanly.

tools/environments/file_sync.py
  - Add _SYNC_BACK_MAX_BYTES (2 GiB) defensive cap: refuse to extract a
    tar larger than the limit. Protects against runaway sandboxes
    producing arbitrary-size archives.
  - Add 'nothing previously pushed' guard at the top of sync_back(). If
    _pushed_hashes and _synced_files are both empty, the FileSyncManager
    was never initialized from the host side — there is nothing coherent
    to sync back. Skips the retry/backoff machinery on uninitialized
    managers and eliminates test-suite slowdown from pre-existing cleanup
    tests that don't mock the sync layer.

tests/tools/test_file_sync_back.py
  - Update _make_manager helper to seed a _pushed_hashes entry by default
    so sync_back() exercises its real path. A seed_pushed_state=False
    opt-out is available for noop-path tests.
  - Add TestSyncBackSizeCap with positive and negative coverage of the
    new cap.

tests/tools/test_sync_back_backends.py
  - Update Daytona bulk download test to assert the PID-suffixed path
    pattern instead of the fixed /tmp/.hermes_sync.tar.
aj-nt pushed a commit to aj-nt/hermes-agent that referenced this pull request May 1, 2026
Salvage of PR NousResearch#8018 by @alt-glitch onto current main.

On sandbox teardown, FileSyncManager now downloads the remote .hermes/
directory, diffs against SHA-256 hashes of what was originally pushed,
and applies only changed files back to the host.

Core (tools/environments/file_sync.py):
- sync_back(): orchestrates download -> unpack -> diff -> apply with:
  - Retry with exponential backoff (3 attempts, 2s/4s/8s)
  - SIGINT trap + defer (prevents partial writes on Ctrl-C)
  - fcntl.flock serialization (concurrent gateway sandboxes)
  - Last-write-wins conflict resolution with warning
  - New remote files pulled back via _infer_host_path prefix matching

Backends:
- SSH: _ssh_bulk_download — tar cf - piped over SSH
- Modal: _modal_bulk_download — exec tar cf - -> proc.stdout.read
- Daytona: _daytona_bulk_download — exec tar cf -> SDK download_file
- All three call sync_back() at the top of cleanup()

Fixes applied during salvage (vs original PR NousResearch#8018):

| # | Issue | Fix |
|---|-------|-----|
| C1 | import fcntl unconditional — crashes Windows | try/except with fallback; _sync_back_locked skips locking when fcntl=None |
| W1 | assert for runtime guard (stripped by -O) | Replaced with proper if/raise RuntimeError |
| W2 | O(n*m) from _get_files_fn() called per file | Cache mapping once at start of _sync_back_impl, pass to resolve/infer |
| W3 | Dead BulkDownloadFn imports in 3 backends | Removed unused imports |
| W4 | Modal hardcodes root/.hermes, no explanation | Added docstring comment explaining Modal always runs as root |
| S1 | SHA-256 computed for new files where pushed_hash=None | Skip hashing when pushed_hash is None (comparison always False) |
| S2 | Daytona /tmp/.hermes_sync.tar never cleaned up | Added rm -f after download (best-effort) |

Tests: 49 passing (17 new: _infer_host_path edge cases, SIGINT
main/worker thread, Windows fcntl=None fallback, Daytona tar cleanup).

Based on NousResearch#8018 by @alt-glitch.
aj-nt pushed a commit to aj-nt/hermes-agent that referenced this pull request May 1, 2026
Follow-ups on top of kshitijk4poor's cherry-picked salvage of PR NousResearch#8018:

tools/environments/daytona.py
  - PID-suffix /tmp/.hermes_sync.<pid>.tar so concurrent sync_back calls
    against the same sandbox don't collide on the remote temp path
  - Move sync_back() inside the cleanup lock and after the _sandbox-None
    guard, with its own try/except. Previously a no-op cleanup (sandbox
    already cleared) still fired sync_back → 3-attempt retry storm against
    a nil sandbox (~6s of sleep). Now short-circuits cleanly.

tools/environments/file_sync.py
  - Add _SYNC_BACK_MAX_BYTES (2 GiB) defensive cap: refuse to extract a
    tar larger than the limit. Protects against runaway sandboxes
    producing arbitrary-size archives.
  - Add 'nothing previously pushed' guard at the top of sync_back(). If
    _pushed_hashes and _synced_files are both empty, the FileSyncManager
    was never initialized from the host side — there is nothing coherent
    to sync back. Skips the retry/backoff machinery on uninitialized
    managers and eliminates test-suite slowdown from pre-existing cleanup
    tests that don't mock the sync layer.

tests/tools/test_file_sync_back.py
  - Update _make_manager helper to seed a _pushed_hashes entry by default
    so sync_back() exercises its real path. A seed_pushed_state=False
    opt-out is available for noop-path tests.
  - Add TestSyncBackSizeCap with positive and negative coverage of the
    new cap.

tests/tools/test_sync_back_backends.py
  - Update Daytona bulk download test to assert the PID-suffixed path
    pattern instead of the fixed /tmp/.hermes_sync.tar.
02356abc pushed a commit to 02356abc/hermes-agent that referenced this pull request May 14, 2026
Salvage of PR NousResearch#8018 by @alt-glitch onto current main.

On sandbox teardown, FileSyncManager now downloads the remote .hermes/
directory, diffs against SHA-256 hashes of what was originally pushed,
and applies only changed files back to the host.

Core (tools/environments/file_sync.py):
- sync_back(): orchestrates download -> unpack -> diff -> apply with:
  - Retry with exponential backoff (3 attempts, 2s/4s/8s)
  - SIGINT trap + defer (prevents partial writes on Ctrl-C)
  - fcntl.flock serialization (concurrent gateway sandboxes)
  - Last-write-wins conflict resolution with warning
  - New remote files pulled back via _infer_host_path prefix matching

Backends:
- SSH: _ssh_bulk_download — tar cf - piped over SSH
- Modal: _modal_bulk_download — exec tar cf - -> proc.stdout.read
- Daytona: _daytona_bulk_download — exec tar cf -> SDK download_file
- All three call sync_back() at the top of cleanup()

Fixes applied during salvage (vs original PR NousResearch#8018):

| # | Issue | Fix |
|---|-------|-----|
| C1 | import fcntl unconditional — crashes Windows | try/except with fallback; _sync_back_locked skips locking when fcntl=None |
| W1 | assert for runtime guard (stripped by -O) | Replaced with proper if/raise RuntimeError |
| W2 | O(n*m) from _get_files_fn() called per file | Cache mapping once at start of _sync_back_impl, pass to resolve/infer |
| W3 | Dead BulkDownloadFn imports in 3 backends | Removed unused imports |
| W4 | Modal hardcodes root/.hermes, no explanation | Added docstring comment explaining Modal always runs as root |
| S1 | SHA-256 computed for new files where pushed_hash=None | Skip hashing when pushed_hash is None (comparison always False) |
| S2 | Daytona /tmp/.hermes_sync.tar never cleaned up | Added rm -f after download (best-effort) |

Tests: 49 passing (17 new: _infer_host_path edge cases, SIGINT
main/worker thread, Windows fcntl=None fallback, Daytona tar cleanup).

Based on NousResearch#8018 by @alt-glitch.
02356abc pushed a commit to 02356abc/hermes-agent that referenced this pull request May 14, 2026
Follow-ups on top of kshitijk4poor's cherry-picked salvage of PR NousResearch#8018:

tools/environments/daytona.py
  - PID-suffix /tmp/.hermes_sync.<pid>.tar so concurrent sync_back calls
    against the same sandbox don't collide on the remote temp path
  - Move sync_back() inside the cleanup lock and after the _sandbox-None
    guard, with its own try/except. Previously a no-op cleanup (sandbox
    already cleared) still fired sync_back → 3-attempt retry storm against
    a nil sandbox (~6s of sleep). Now short-circuits cleanly.

tools/environments/file_sync.py
  - Add _SYNC_BACK_MAX_BYTES (2 GiB) defensive cap: refuse to extract a
    tar larger than the limit. Protects against runaway sandboxes
    producing arbitrary-size archives.
  - Add 'nothing previously pushed' guard at the top of sync_back(). If
    _pushed_hashes and _synced_files are both empty, the FileSyncManager
    was never initialized from the host side — there is nothing coherent
    to sync back. Skips the retry/backoff machinery on uninitialized
    managers and eliminates test-suite slowdown from pre-existing cleanup
    tests that don't mock the sync layer.

tests/tools/test_file_sync_back.py
  - Update _make_manager helper to seed a _pushed_hashes entry by default
    so sync_back() exercises its real path. A seed_pushed_state=False
    opt-out is available for noop-path tests.
  - Add TestSyncBackSizeCap with positive and negative coverage of the
    new cap.

tests/tools/test_sync_back_backends.py
  - Update Daytona bulk download test to assert the PID-suffixed path
    pattern instead of the fixed /tmp/.hermes_sync.tar.
gweeteve pushed a commit to gweeteve/hermes-agent that referenced this pull request Jun 2, 2026
Salvage of PR NousResearch#8018 by @alt-glitch onto current main.

On sandbox teardown, FileSyncManager now downloads the remote .hermes/
directory, diffs against SHA-256 hashes of what was originally pushed,
and applies only changed files back to the host.

Core (tools/environments/file_sync.py):
- sync_back(): orchestrates download -> unpack -> diff -> apply with:
  - Retry with exponential backoff (3 attempts, 2s/4s/8s)
  - SIGINT trap + defer (prevents partial writes on Ctrl-C)
  - fcntl.flock serialization (concurrent gateway sandboxes)
  - Last-write-wins conflict resolution with warning
  - New remote files pulled back via _infer_host_path prefix matching

Backends:
- SSH: _ssh_bulk_download — tar cf - piped over SSH
- Modal: _modal_bulk_download — exec tar cf - -> proc.stdout.read
- Daytona: _daytona_bulk_download — exec tar cf -> SDK download_file
- All three call sync_back() at the top of cleanup()

Fixes applied during salvage (vs original PR NousResearch#8018):

| # | Issue | Fix |
|---|-------|-----|
| C1 | import fcntl unconditional — crashes Windows | try/except with fallback; _sync_back_locked skips locking when fcntl=None |
| W1 | assert for runtime guard (stripped by -O) | Replaced with proper if/raise RuntimeError |
| W2 | O(n*m) from _get_files_fn() called per file | Cache mapping once at start of _sync_back_impl, pass to resolve/infer |
| W3 | Dead BulkDownloadFn imports in 3 backends | Removed unused imports |
| W4 | Modal hardcodes root/.hermes, no explanation | Added docstring comment explaining Modal always runs as root |
| S1 | SHA-256 computed for new files where pushed_hash=None | Skip hashing when pushed_hash is None (comparison always False) |
| S2 | Daytona /tmp/.hermes_sync.tar never cleaned up | Added rm -f after download (best-effort) |

Tests: 49 passing (17 new: _infer_host_path edge cases, SIGINT
main/worker thread, Windows fcntl=None fallback, Daytona tar cleanup).

Based on NousResearch#8018 by @alt-glitch.
gweeteve pushed a commit to gweeteve/hermes-agent that referenced this pull request Jun 2, 2026
Follow-ups on top of kshitijk4poor's cherry-picked salvage of PR NousResearch#8018:

tools/environments/daytona.py
  - PID-suffix /tmp/.hermes_sync.<pid>.tar so concurrent sync_back calls
    against the same sandbox don't collide on the remote temp path
  - Move sync_back() inside the cleanup lock and after the _sandbox-None
    guard, with its own try/except. Previously a no-op cleanup (sandbox
    already cleared) still fired sync_back → 3-attempt retry storm against
    a nil sandbox (~6s of sleep). Now short-circuits cleanly.

tools/environments/file_sync.py
  - Add _SYNC_BACK_MAX_BYTES (2 GiB) defensive cap: refuse to extract a
    tar larger than the limit. Protects against runaway sandboxes
    producing arbitrary-size archives.
  - Add 'nothing previously pushed' guard at the top of sync_back(). If
    _pushed_hashes and _synced_files are both empty, the FileSyncManager
    was never initialized from the host side — there is nothing coherent
    to sync back. Skips the retry/backoff machinery on uninitialized
    managers and eliminates test-suite slowdown from pre-existing cleanup
    tests that don't mock the sync layer.

tests/tools/test_file_sync_back.py
  - Update _make_manager helper to seed a _pushed_hashes entry by default
    so sync_back() exercises its real path. A seed_pushed_state=False
    opt-out is available for noop-path tests.
  - Add TestSyncBackSizeCap with positive and negative coverage of the
    new cap.

tests/tools/test_sync_back_backends.py
  - Update Daytona bulk download test to assert the PID-suffixed path
    pattern instead of the fixed /tmp/.hermes_sync.tar.
Egavasyug pushed a commit to Egavasyug/hermes-agent that referenced this pull request Jun 10, 2026
Salvage of PR NousResearch#8018 by @alt-glitch onto current main.

On sandbox teardown, FileSyncManager now downloads the remote .hermes/
directory, diffs against SHA-256 hashes of what was originally pushed,
and applies only changed files back to the host.

Core (tools/environments/file_sync.py):
- sync_back(): orchestrates download -> unpack -> diff -> apply with:
  - Retry with exponential backoff (3 attempts, 2s/4s/8s)
  - SIGINT trap + defer (prevents partial writes on Ctrl-C)
  - fcntl.flock serialization (concurrent gateway sandboxes)
  - Last-write-wins conflict resolution with warning
  - New remote files pulled back via _infer_host_path prefix matching

Backends:
- SSH: _ssh_bulk_download — tar cf - piped over SSH
- Modal: _modal_bulk_download — exec tar cf - -> proc.stdout.read
- Daytona: _daytona_bulk_download — exec tar cf -> SDK download_file
- All three call sync_back() at the top of cleanup()

Fixes applied during salvage (vs original PR NousResearch#8018):

| # | Issue | Fix |
|---|-------|-----|
| C1 | import fcntl unconditional — crashes Windows | try/except with fallback; _sync_back_locked skips locking when fcntl=None |
| W1 | assert for runtime guard (stripped by -O) | Replaced with proper if/raise RuntimeError |
| W2 | O(n*m) from _get_files_fn() called per file | Cache mapping once at start of _sync_back_impl, pass to resolve/infer |
| W3 | Dead BulkDownloadFn imports in 3 backends | Removed unused imports |
| W4 | Modal hardcodes root/.hermes, no explanation | Added docstring comment explaining Modal always runs as root |
| S1 | SHA-256 computed for new files where pushed_hash=None | Skip hashing when pushed_hash is None (comparison always False) |
| S2 | Daytona /tmp/.hermes_sync.tar never cleaned up | Added rm -f after download (best-effort) |

Tests: 49 passing (17 new: _infer_host_path edge cases, SIGINT
main/worker thread, Windows fcntl=None fallback, Daytona tar cleanup).

Based on NousResearch#8018 by @alt-glitch.
Egavasyug pushed a commit to Egavasyug/hermes-agent that referenced this pull request Jun 10, 2026
Follow-ups on top of kshitijk4poor's cherry-picked salvage of PR NousResearch#8018:

tools/environments/daytona.py
  - PID-suffix /tmp/.hermes_sync.<pid>.tar so concurrent sync_back calls
    against the same sandbox don't collide on the remote temp path
  - Move sync_back() inside the cleanup lock and after the _sandbox-None
    guard, with its own try/except. Previously a no-op cleanup (sandbox
    already cleared) still fired sync_back → 3-attempt retry storm against
    a nil sandbox (~6s of sleep). Now short-circuits cleanly.

tools/environments/file_sync.py
  - Add _SYNC_BACK_MAX_BYTES (2 GiB) defensive cap: refuse to extract a
    tar larger than the limit. Protects against runaway sandboxes
    producing arbitrary-size archives.
  - Add 'nothing previously pushed' guard at the top of sync_back(). If
    _pushed_hashes and _synced_files are both empty, the FileSyncManager
    was never initialized from the host side — there is nothing coherent
    to sync back. Skips the retry/backoff machinery on uninitialized
    managers and eliminates test-suite slowdown from pre-existing cleanup
    tests that don't mock the sync layer.

tests/tools/test_file_sync_back.py
  - Update _make_manager helper to seed a _pushed_hashes entry by default
    so sync_back() exercises its real path. A seed_pushed_state=False
    opt-out is available for noop-path tests.
  - Add TestSyncBackSizeCap with positive and negative coverage of the
    new cap.

tests/tools/test_sync_back_backends.py
  - Update Daytona bulk download test to assert the PID-suffixed path
    pattern instead of the fixed /tmp/.hermes_sync.tar.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants