fix: wake runtime for blocked WorkItem rechecks #1375
Pull request overview
This PR fixes a runtime scheduling bug where blocked WorkItems with a recheck_at deadline could fail to wake the run loop after the deadline, leaving work stuck until external input arrives. It adds a storage query for the next unconsumed blocked recheck deadline, uses that to bound idle sleeping, and adds a regression test asserting the runtime wakes and emits a recheck tick.
Changes:
- Add `AppStorage::next_blocked_work_item_recheck_at` to find the earliest unconsumed `recheck_at` for blocked Open WorkItems for an agent.
- Update the runtime idle loop to sleep until the next blocked recheck deadline (or until a notify), and project `sleeping_until` into agent state.
- Add a regression test ensuring the runtime emits a `work_item_recheck` tick and consumes the recheck without external input.
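The storage lookup in the first bullet can be pictured as a filter-and-minimum over WorkItems. The following is only a minimal in-memory sketch: the `WorkItem` struct, its field names, the `u64` timestamp, and the free function `next_blocked_recheck_at` are all hypothetical stand-ins, not the real `AppStorage` schema or signature.

```rust
// Hypothetical in-memory model; the real AppStorage schema is not shown in this PR page.
#[derive(Debug, Clone, PartialEq)]
enum WorkItemState {
    Open,
    Closed,
}

#[derive(Debug, Clone)]
struct WorkItem {
    agent_id: String,
    state: WorkItemState,
    blocked: bool,
    recheck_at: Option<u64>, // assumed: seconds since some epoch
    recheck_consumed: bool,  // assumed flag marking a recheck that already fired
}

/// Earliest unconsumed `recheck_at` among blocked Open items for one agent.
/// Returns `None` when no item has a pending recheck deadline.
fn next_blocked_recheck_at(items: &[WorkItem], agent_id: &str) -> Option<u64> {
    items
        .iter()
        .filter(|i| i.agent_id == agent_id)
        .filter(|i| i.state == WorkItemState::Open && i.blocked && !i.recheck_consumed)
        .filter_map(|i| i.recheck_at)
        .min()
}
```

A `None` result corresponds to the case where the run loop has no deadline to bound its idle wait and can only be woken by a notify.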
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| src/storage.rs | Adds lookup for earliest unconsumed blocked WorkItem recheck_at deadline. |
| src/runtime/memory_refresh.rs | Exposes the storage lookup via RuntimeHandle for use by the run loop. |
| src/runtime.rs | Bounds idle waiting by the next recheck deadline using tokio::select!(notify, sleep). |
| src/runtime/scheduler_executor.rs | Extends the idle→sleep transition to accept and project sleeping_until. |
| src/runtime/tests/work_items.rs | Adds regression test for deadline-based wake + recheck consumption. |
| src/runtime/tests/runtime_state.rs | Updates test callsite for the new idle→sleep transition signature. |
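The bounded idle wait described for src/runtime.rs races a notification against a timer. As a rough illustration of that idea, here is a synchronous, standard-library-only stand-in built on `Condvar::wait_timeout` rather than `tokio::select!`. The `IdleSignal` and `Wake` types are hypothetical and do not appear in the PR.

```rust
use std::sync::{Condvar, Mutex};
use std::time::Instant;

/// Why the idle wait ended (hypothetical type for this sketch).
#[derive(Debug, PartialEq)]
enum Wake {
    Notified,
    RecheckDeadline,
}

/// Synchronous stand-in for the runtime's notify handle.
struct IdleSignal {
    notified: Mutex<bool>,
    cv: Condvar,
}

impl IdleSignal {
    fn new() -> Self {
        IdleSignal { notified: Mutex::new(false), cv: Condvar::new() }
    }

    /// Record a wake request and wake one waiter.
    fn notify(&self) {
        *self.notified.lock().unwrap() = true;
        self.cv.notify_one();
    }

    /// Wait for a notify, but no later than `deadline` when one is given.
    /// With `None` this waits indefinitely, which is the pre-fix behaviour.
    fn wait_until(&self, deadline: Option<Instant>) -> Wake {
        let mut flag = self.notified.lock().unwrap();
        loop {
            if *flag {
                *flag = false; // consume the notification
                return Wake::Notified;
            }
            match deadline {
                None => flag = self.cv.wait(flag).unwrap(),
                Some(d) => {
                    // Saturates to zero if the deadline has already passed.
                    let remaining = d.saturating_duration_since(Instant::now());
                    if remaining.is_zero() {
                        return Wake::RecheckDeadline;
                    }
                    let (guard, _timed_out) = self.cv.wait_timeout(flag, remaining).unwrap();
                    flag = guard; // loop re-checks both the flag and the clock
                }
            }
        }
    }
}
```

Re-checking the flag and the remaining time on every loop iteration keeps the sketch correct under spurious condition-variable wakeups. A deadline that is already in the past returns immediately instead of blocking.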
Comments suppressed due to low confidence (1)
src/runtime/scheduler_executor.rs:236
`transition_run_loop_idle_to_sleep` returns `None` when the agent is already `Asleep`, so `sleeping_until` will not be updated in the persisted `AgentState` even if a new earlier/later blocked WorkItem recheck deadline is discovered after a wake (e.g., `work_item_*` APIs call `notify_one()` without changing status to `AwakeIdle`). Consider allowing this transition to refresh `sleeping_until` while already asleep (or add a dedicated ‘update sleeping_until’ path) so operator/UI state reflects the durable wake target.
    let mut guard = self.runtime.inner.agent.lock().await;
    if matches!(
        guard.state.status,
        AgentStatus::Asleep | AgentStatus::Stopped
    ) || !guard.queue.is_empty()
    {
        return Ok(None);
    }
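One way to read the suggestion above is a transition that still refuses to act on a stopped agent or a non-empty queue, but updates the projected deadline when the agent is already asleep. The sketch below is a simplified, synchronous model: `AgentState`, `SleepTransition`, `idle_to_sleep`, and the `u64` deadline are hypothetical, and leaving an existing deadline untouched when no recheck deadline is known is an assumption of this sketch, not confirmed behaviour.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum AgentStatus {
    AwakeIdle,
    Asleep,
    Stopped,
}

// Hypothetical, reduced projection of the persisted agent state.
#[derive(Debug)]
struct AgentState {
    status: AgentStatus,
    sleeping_until: Option<u64>,
}

#[derive(Debug, PartialEq)]
enum SleepTransition {
    EnteredSleep,
    RefreshedDeadline,
}

/// Returns `None` when nothing changed, mirroring the early return quoted above.
fn idle_to_sleep(
    state: &mut AgentState,
    queue_len: usize,
    next_recheck_at: Option<u64>,
) -> Option<SleepTransition> {
    if state.status == AgentStatus::Stopped || queue_len > 0 {
        return None;
    }
    if state.status == AgentStatus::Asleep {
        // Already asleep: only refresh when a concrete, different deadline is known.
        return match next_recheck_at {
            Some(d) if state.sleeping_until != Some(d) => {
                state.sleeping_until = Some(d);
                Some(SleepTransition::RefreshedDeadline)
            }
            _ => None, // assumption: keep whatever deadline is already projected
        };
    }
    state.status = AgentStatus::Asleep;
    state.sleeping_until = next_recheck_at;
    Some(SleepTransition::EnteredSleep)
}
```

Distinguishing `EnteredSleep` from `RefreshedDeadline` lets a caller persist the new wake target without emitting a second "went to sleep" event.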
|
Blocking review note (cannot submit a formal Request Changes review on an own-authored PR): Thanks for the fix and for addressing the zero-duration overflow case. Current-head CI is green, but I found one remaining contract issue before merge.
Because this PR explicitly adds idle sleep projection for the next blocked recheck deadline, please update the already-asleep path to refresh |
|
Addressed the final blocking note in 646e497. |
|
Fixed the CI regression in the new idle sleep projection. The run loop now preserves an existing timed Sleep deadline when re-projecting an already-asleep agent without a work-item recheck deadline, so session-local sleep wake tasks do not become stale. Added |
Summary
`recheck_at` deadline instead of sleeping indefinitely until external input
Closes #1356
Verification
- `cargo fmt --all -- --check`
- `cargo test recheck -- --nocapture`
- `RUSTFLAGS="-D warnings" cargo check --all-targets`