
fix(hindsight): prevent late retain sync during interpreter shutdown #15507

Closed
Sanjays2402 wants to merge 1 commit into NousResearch:main from Sanjays2402:fix/hindsight-shutdown-race

Conversation

@Sanjays2402

Contributor

Summary

The Hindsight memory provider can submit async retain work after shutdown begins, causing RuntimeError: cannot schedule new futures after interpreter shutdown during Python teardown.

Root Cause

shutdown() joins existing background threads and closes the client, but there's no lifecycle gate preventing sync_turn() from submitting new asyncio.run_coroutine_threadsafe() calls after shutdown starts. A late sync_turn() call races against the interpreter's cleanup of the event loop machinery.
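The failure mode can be illustrated with a close analogue. This is a minimal sketch, not the provider's code: it shuts down a `ThreadPoolExecutor` explicitly and then submits to it, which raises the same family of `RuntimeError` that CPython raises when work is scheduled after interpreter shutdown has begun. The function name `submit_after_shutdown` is hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def submit_after_shutdown() -> str:
    """Submit work to an executor that has already been shut down."""
    executor = ThreadPoolExecutor(max_workers=1)
    executor.shutdown(wait=True)
    try:
        # Analogue of a late sync_turn(): nothing stops the caller from trying.
        executor.submit(print, "late retain")
    except RuntimeError as exc:
        return str(exc)
    return "submitted"


message = submit_after_shutdown()
```

Without a gate in front of the submission, the only thing that stops late work is the exception itself.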

Fix

1. Lifecycle state gating

  • Added _shutting_down, _closed flags and _state_lock
  • sync_turn() returns early once _shutting_down is set
  • _run_sync() accepts an allow_shutdown parameter: normal callers are blocked after shutdown starts, but the shutdown path passes allow_shutdown=True so it can still flush and close
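The gating described above might look roughly like the following. This is a sketch under stated assumptions: `GatedProvider` and its `submitted` list are hypothetical stand-ins, the default `allow_shutdown=False` is inferred from the description, and the real `_run_sync()` dispatches a coroutine rather than appending to a list.

```python
import threading


class GatedProvider:
    """Hypothetical stand-in for the provider, not the real Hindsight class."""

    def __init__(self):
        self._state_lock = threading.Lock()
        self._shutting_down = False
        self._closed = False
        self.submitted = []  # stand-in for work handed to the event loop

    def _run_sync(self, work, allow_shutdown=False):
        # Check lifecycle state under the lock before accepting any work.
        with self._state_lock:
            if self._closed or (self._shutting_down and not allow_shutdown):
                return False
        self.submitted.append(work)
        return True

    def sync_turn(self, turn):
        # Normal callers never pass allow_shutdown, so they are gated.
        return self._run_sync(turn)

    def shutdown(self):
        with self._state_lock:
            if self._shutting_down:
                return  # a second shutdown() call is a no-op
            self._shutting_down = True
        # Only the shutdown path may still submit work.
        self._run_sync("final-flush", allow_shutdown=True)
        with self._state_lock:
            self._closed = True


provider = GatedProvider()
accepted_before = provider.sync_turn("turn-1")
provider.shutdown()
accepted_after = provider.sync_turn("turn-2")
```

Here `accepted_before` is `True` and `accepted_after` is `False`: the late call returns without submitting anything.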

2. Extracted _submit_retain() helper

  • Retain submission logic extracted from sync_turn() so both normal sync and shutdown flush use the same code path
  • Tracks _last_retained_turn to avoid duplicating a full-session retain during shutdown flush
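One way to read the `_last_retained_turn` bookkeeping is as a high-water mark over the session's turns. The sketch below assumes that interpretation: `RetainTracker`, `add_turn`, and `retained_batches` are hypothetical, and the real helper may retain a different payload than the slice of pending turns used here.

```python
class RetainTracker:
    """Hypothetical helper; names mirror the PR description, behavior is assumed."""

    def __init__(self):
        self.turns = []                # every turn seen in the session
        self._last_retained_turn = 0   # number of turns already retained
        self.retained_batches = []     # stand-in for calls to the backend

    def add_turn(self, text):
        self.turns.append(text)

    def _submit_retain(self):
        # Only turns past the high-water mark are still pending.
        pending = self.turns[self._last_retained_turn:]
        if not pending:
            return False  # nothing new, so no duplicate retain
        self.retained_batches.append(list(pending))
        self._last_retained_turn = len(self.turns)
        return True


tracker = RetainTracker()
tracker.add_turn("hello")
tracker.add_turn("world")
first = tracker._submit_retain()
second = tracker._submit_retain()
```

The second call finds nothing pending and returns `False`, which is the property a shutdown flush relies on to avoid retaining the same turns twice.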

3. Shutdown final flush

  • Before closing the client, shutdown() checks for un-retained pending turns and flushes them via _submit_retain(allow_shutdown=True)
  • This ensures recent conversation turns aren't lost on shutdown
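The ordering matters here: the flush has to happen before the client is closed. A minimal single-threaded sketch of that ordering follows. `FakeClient`, `FlushingProvider`, and the `retain_every` batching knob are all hypothetical and exist only to leave an un-retained turn behind at shutdown. Locking is omitted for brevity.

```python
class FakeClient:
    """Stand-in for the real client; records retained batches."""

    def __init__(self):
        self.closed = False
        self.batches = []

    def retain(self, turns):
        if self.closed:
            raise RuntimeError("client already closed")
        self.batches.append(list(turns))

    def close(self):
        self.closed = True


class FlushingProvider:
    def __init__(self, client, retain_every=2):
        self._client = client
        self._retain_every = retain_every  # hypothetical batching knob
        self._turns = []
        self._last_retained_turn = 0
        self._shutting_down = False

    def _submit_retain(self, allow_shutdown=False):
        if self._shutting_down and not allow_shutdown:
            return
        pending = self._turns[self._last_retained_turn:]
        if pending:
            self._client.retain(pending)
            self._last_retained_turn = len(self._turns)

    def sync_turn(self, turn):
        if self._shutting_down:
            return  # late calls are dropped, not submitted
        self._turns.append(turn)
        if len(self._turns) - self._last_retained_turn >= self._retain_every:
            self._submit_retain()

    def shutdown(self):
        self._shutting_down = True
        self._submit_retain(allow_shutdown=True)  # flush pending turns first
        self._client.close()                      # then close the client


client = FakeClient()
provider = FlushingProvider(client)
for turn in ["t1", "t2", "t3"]:
    provider.sync_turn(turn)
provider.shutdown()
provider.sync_turn("t4")  # arrives after shutdown and is ignored
```

In this run the third turn is still pending when `shutdown()` is called, so it is retained in a final batch before `close()`.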

Changes

  • plugins/memory/hindsight/__init__.py — lifecycle state, _submit_retain(), guarded sync_turn(), enhanced shutdown()

Fixes #15497

@Sanjays2402

Contributor Author

Closing — superseded by main. The hindsight provider was refactored to use a queue-backed writer thread with a sentinel-based shutdown (_retain_queue, _WRITER_SENTINEL, Event-typed _shutting_down), which solves the original issue more robustly than this PR's approach.

Verified by attempting a rebase against current main: every conflict zone shows main already implements equivalent or stronger guards (_shutting_down.set() before draining + bounded join + sentinel-based writer exit). Re-running my PR's intent through that machinery produces no net diff.
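For readers unfamiliar with the pattern named above, here is a generic sketch of a queue-backed writer with sentinel-based shutdown. It is not the code on main: only the names `_retain_queue`, `_WRITER_SENTINEL`, and the Event-typed `_shutting_down` come from the comment, while `QueueWriter`, `_drain`, the `sink` list, and the 5-second timeout are assumptions made for illustration.

```python
import queue
import threading

_WRITER_SENTINEL = object()  # unique marker that tells the writer to exit


class QueueWriter:
    """Illustrative only; the class and method bodies are assumed."""

    def __init__(self, sink):
        self._sink = sink  # stand-in for the remote retain call
        self._retain_queue = queue.Queue()
        self._shutting_down = threading.Event()
        self._writer = threading.Thread(target=self._drain, daemon=True)
        self._writer.start()

    def _drain(self):
        while True:
            item = self._retain_queue.get()
            if item is _WRITER_SENTINEL:
                return  # everything queued before the sentinel is already written
            self._sink.append(item)

    def sync_turn(self, turn):
        if self._shutting_down.is_set():
            return False  # late work is rejected instead of scheduled
        self._retain_queue.put(turn)
        return True

    def shutdown(self, timeout=5.0):
        self._shutting_down.set()                  # gate new work first
        self._retain_queue.put(_WRITER_SENTINEL)   # then ask the writer to finish
        self._writer.join(timeout=timeout)         # bounded join
        return not self._writer.is_alive()


sink = []
writer = QueueWriter(sink)
writer.sync_turn("a")
writer.sync_turn("b")
clean_exit = writer.shutdown()
late_accepted = writer.sync_turn("c")
```

Because the queue is FIFO, the writer processes every item enqueued before the sentinel, so pending turns are drained without a separate flush step. In a multi-threaded caller, a `sync_turn()` that passes the check just before `shutdown()` sets the event could still enqueue behind the sentinel. This sketch does not handle that case.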

Closing rather than dropping the rebase commit since nothing remains for me to land. Thanks!

@Sanjays2402 Sanjays2402 closed this May 2, 2026

Labels

  • comp/plugins: Plugin system and bundled plugins
  • P2: Medium, degraded but workaround exists
  • type/bug: Something isn't working

Development

Successfully merging this pull request may close these issues.

Hindsight provider can submit retain work during interpreter shutdown
