fix: defer scratch workspace cleanup when task has active children (#33774) #33916

Closed
annguyenNous wants to merge 1 commit into NousResearch:main from annguyenNous:fix/kanban-workspace-gc

Conversation

@annguyenNous

Contributor

Problem

When a Kanban task (workspace_kind=scratch) completes, _cleanup_workspace() immediately deletes the workspace via shutil.rmtree(). If the task has children linked via task_links (created with parents=[A]), those children start after the workspace is already gone, which surfaces as a FileNotFoundError or as silent data loss.
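
A minimal sketch of the failure mode, using only the standard library. It does not call any Hermes code; `simulate_handoff` and the `result.txt` artifact are hypothetical names that stand in for a parent task writing output and a child reading it after eager cleanup.

```python
import shutil
import tempfile
from pathlib import Path


def simulate_handoff() -> str:
    """Parent writes an artifact, its scratch dir is removed, then the child reads."""
    # Hypothetical scratch workspace for parent task A.
    workspace = Path(tempfile.mkdtemp(prefix="scratch-"))
    artifact = workspace / "result.txt"
    artifact.write_text("output from parent task A")

    # Parent completes: eager cleanup removes the whole scratch directory.
    shutil.rmtree(workspace)

    # Child starts afterwards and tries to read the parent's artifact.
    try:
        return artifact.read_text()
    except FileNotFoundError:
        return "missing"


if __name__ == "__main__":
    print(simulate_handoff())
```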

Fix

Two changes to _cleanup_workspace() in hermes_cli/kanban_db.py:

  1. Deferred cleanup: Before deleting, queries task_links + tasks to check if any children are still active (status NOT IN done/archived/failed/cancelled). If so, logs a debug message and returns without deleting.

  2. Parent cleanup trigger (_try_cleanup_parent_workspaces): After a child task's workspace is cleaned up, checks if the child's parents now have all children terminal. If so, cleans up the parent's deferred scratch workspace.
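
The first change could be sketched roughly as follows. This is not the code from the PR: the column names (`tasks.id`, `tasks.status`, `task_links.parent_id`, `task_links.child_id`), the SQLite backend, and the helper names `has_active_children`, `cleanup_scratch_workspace`, and `make_demo_db` are all assumptions made for illustration. Only the terminal status list is taken from the description above.

```python
import shutil
import sqlite3
import tempfile
from pathlib import Path

# Statuses treated as terminal, per the PR description.
TERMINAL_STATUSES = ("done", "archived", "failed", "cancelled")


def has_active_children(conn: sqlite3.Connection, task_id: str) -> bool:
    """True if any child linked to task_id is in a non-terminal status."""
    placeholders = ",".join("?" for _ in TERMINAL_STATUSES)
    row = conn.execute(
        f"""
        SELECT COUNT(*)
        FROM task_links AS l
        JOIN tasks AS t ON t.id = l.child_id
        WHERE l.parent_id = ?
          AND t.status NOT IN ({placeholders})
        """,
        (task_id, *TERMINAL_STATUSES),
    ).fetchone()
    return row[0] > 0


def cleanup_scratch_workspace(conn: sqlite3.Connection, task_id: str, workspace: Path) -> bool:
    """Return True if the workspace was deleted, False if cleanup was deferred."""
    if has_active_children(conn, task_id):
        return False  # a child may still need the artifacts; defer
    shutil.rmtree(workspace, ignore_errors=True)
    return True


def make_demo_db() -> sqlite3.Connection:
    """In-memory DB with the hypothetical schema: parent A (done), child B (running)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tasks (id TEXT PRIMARY KEY, status TEXT)")
    conn.execute("CREATE TABLE task_links (parent_id TEXT, child_id TEXT)")
    conn.executemany("INSERT INTO tasks VALUES (?, ?)", [("A", "done"), ("B", "running")])
    conn.execute("INSERT INTO task_links VALUES ('A', 'B')")
    return conn


if __name__ == "__main__":
    demo_conn = make_demo_db()
    demo_dir = Path(tempfile.mkdtemp(prefix="scratch-"))
    print("deleted:", cleanup_scratch_workspace(demo_conn, "A", demo_dir))
    shutil.rmtree(demo_dir, ignore_errors=True)
```

With child B still `running`, the call for parent A defers and the directory survives. Once B reaches a terminal status, the same call removes it.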

Changes

  • hermes_cli/kanban_db.py: +61 lines
    • 17 lines in _cleanup_workspace() (active children check)
    • 44 lines new function _try_cleanup_parent_workspaces()

Design

  • All operations are best-effort (wrapped in try/except: pass) — cleanup never blocks task completion.
  • Uses existing task_links table and task status — no schema changes.
  • The deferred cleanup runs when the last child completes, so the workspace lifecycle now spans the full parent→children chain.
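
A rough sketch of how the best-effort parent sweep might look, under the same caveats: the schema (including the `workspace_kind` and `workspace_path` columns), the SQLite backend, and the function names here are assumptions, not the merged implementation. The extra condition that the parent itself must be terminal is also an assumption, added so the sketch never removes the workspace of a parent that is still running.

```python
import shutil
import sqlite3
import tempfile
from pathlib import Path

TERMINAL = ("done", "archived", "failed", "cancelled")


def try_cleanup_parent_workspaces(conn: sqlite3.Connection, child_id: str) -> list[str]:
    """Best-effort: remove scratch workspaces of parents whose children are all terminal."""
    cleaned: list[str] = []
    try:
        marks = ",".join("?" * len(TERMINAL))
        # Scratch parents of this child that have themselves finished (assumption).
        parents = conn.execute(
            f"""
            SELECT p.id, p.workspace_path
            FROM task_links AS l
            JOIN tasks AS p ON p.id = l.parent_id
            WHERE l.child_id = ?
              AND p.workspace_kind = 'scratch'
              AND p.status IN ({marks})
            """,
            (child_id, *TERMINAL),
        ).fetchall()
        for parent_id, path in parents:
            active = conn.execute(
                f"""
                SELECT COUNT(*)
                FROM task_links AS l
                JOIN tasks AS c ON c.id = l.child_id
                WHERE l.parent_id = ? AND c.status NOT IN ({marks})
                """,
                (parent_id, *TERMINAL),
            ).fetchone()[0]
            if active == 0 and path and Path(path).is_dir():
                shutil.rmtree(path, ignore_errors=True)
                cleaned.append(parent_id)
    except Exception:
        # Cleanup must never block task completion, so errors are swallowed.
        pass
    return cleaned


def make_demo(parent_status: str, child_statuses: list[str]):
    """Build an in-memory DB with the hypothetical schema; returns (conn, parent_dir)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE tasks (id TEXT PRIMARY KEY, status TEXT, "
        "workspace_kind TEXT, workspace_path TEXT)"
    )
    conn.execute("CREATE TABLE task_links (parent_id TEXT, child_id TEXT)")
    parent_dir = tempfile.mkdtemp(prefix="scratch-")
    conn.execute("INSERT INTO tasks VALUES ('A', ?, 'scratch', ?)", (parent_status, parent_dir))
    for i, status in enumerate(child_statuses):
        conn.execute("INSERT INTO tasks VALUES (?, ?, 'scratch', NULL)", (f"C{i}", status))
        conn.execute("INSERT INTO task_links VALUES ('A', ?)", (f"C{i}",))
    return conn, parent_dir
```

Returning the list of swept parent IDs is a choice made for this sketch so the behaviour is easy to assert on. The PR description only states that the sweep runs when the last child completes.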

Fixes #33774

…ousResearch#33774)

When a Kanban task with workspace_kind=scratch completes, the
_cleanup_workspace() function immediately deletes the workspace
directory. If the task has children linked via task_links, those
children find the workspace deleted when they start.

This fix adds two checks:
1. Before deleting, check if any children are still active
   (todo/ready/running). If so, defer cleanup.
2. After a child completes, check if parent workspace can now
   be cleaned up (all children terminal).

Fixes NousResearch#33774
@alt-glitch added labels on May 28, 2026: type/bug (Something isn't working), P3 (Low: cosmetic, nice to have), comp/plugins (Plugin system and bundled plugins)
@alt-glitch

Collaborator

Competing with closed PRs #29738 and #31266 (same fix family — defer scratch cleanup until children finish). Fixes #33774. Different author from the prior attempts.

@Stefan4D

Given the potential for data loss, particularly with the auto decomposer defaulting to 'scratch', I think there is an argument to upgrade this from a P3?

@nirolfa

nirolfa commented Jun 1, 2026

> Given the potential for data loss, particularly with the auto decomposer defaulting to 'scratch', I think there is an argument to upgrade this from a P3?

yeah would be nice to upgrade from P3; it's quite annoying

@teknium1

teknium1 commented Jun 7, 2026

Contributor

Merged via #41352. Your commit was cherry-picked onto current main with your authorship preserved in git log (commit 9405cd0). Added a follow-up fix (a non-scratch dir/worktree child now also sweeps its deferred scratch parent — it was returning early before) plus 3 regression tests. Thanks for the fix!

Development

Successfully merging this pull request may close these issues.

[Bug]: Kanban parent-child handoff: scratch workspace GC destroys artifacts before child can read them

5 participants