feat: Deleting scheduled tasks also clears tasks in the queue.#64110
feat: Deleting scheduled tasks also clears tasks in the queue.#64110Lutz2015 wants to merge 6 commits intoopenclaw:mainfrom
Conversation
Greptile SummaryThis PR adds a
Confidence Score: 2/5Not safe to merge — the fix does not achieve its stated goal and includes commented-out dead code. Two P1 logic findings: the jobExists check fires after execution and delivery has already completed, so it does not prevent post-deletion Feishu pushes as described; the timer-path check is redundant with existing handling in applyOutcomeToStoredJob. Commented-out dead code is also present. Both src/cron/service/timer.ts and src/cron/service/ops.ts need attention — the jobExists call placement and the commented-out return block in timer.ts are the primary concerns. Prompt To Fix All With AIThis is a comment left during a code review.
Path: src/cron/service/timer.ts
Line: 730-736
Comment:
**Commented-out dead code should be removed**
The original `return` statement was commented out rather than deleted. This is the exact same value produced by the live `return` on line 744 when `jobExists` is true — leaving it here adds noise and misleads future readers into thinking there's an intentional alternate path.
```suggestion
if (!(await jobExists(state, id))) {
```
How can I resolve this? If you propose a fix, please make it concise.
---
This is a comment left during a code review.
Path: src/cron/service/timer.ts
Line: 737-743
Comment:
**Post-execution check does not prevent Feishu delivery**
`jobExists` is called after `executeJobCoreWithTimeout` has already returned, meaning the job body (including any Feishu message send) has already completed. Returning `status: "skipped"` here only suppresses the post-run state update; it does not cancel or drop the message.
Additionally, this check is redundant for the timed path: `applyOutcomeToStoredJob` (called at line 788) already handles deleted jobs — it does `jobs.find(...)` after a force-reload, logs "job not found after forceReload, result discarded", and returns without persisting anything. The new `jobExists` call simply duplicates that guard while adding an extra lock acquisition and forced disk read on every job completion.
If the goal is to prevent delivery for jobs whose task was deleted *before they start*, the check should be inserted before `executeJobCoreWithTimeout` (inside `runDueJob`, after acquiring the lock to inspect whether the job is still present in the store).
How can I resolve this? If you propose a fix, please make it concise.
---
This is a comment left during a code review.
Path: src/cron/service/ops.ts
Line: 626-632
Comment:
**Same post-execution race: delivery already occurred**
Like the timer path, `jobExists` is checked only after `executeJobCoreWithTimeout` returns — the Feishu message (or whatever the job delivered) has already been sent. Overwriting `coreResult` with `{ status: "skipped" }` prevents the result from being persisted to state, but it does not undo the delivery.
Note that the locked block immediately below (line 643) already guards against this case: if the job was deleted, `state.store?.jobs.find(...)` returns `undefined` and the function returns early, so no double-write would occur even without the new check. The meaningful protection this change adds is suppressing the `tryFinishManualTaskRun` bookkeeping call on line 637 with a `"skipped"` status rather than the real outcome — the actual delivery behaviour is unchanged.
How can I resolve this? If you propose a fix, please make it concise.
---
This is a comment left during a code review.
Path: src/cron/service/timer.ts
Line: 103-108
Comment:
**`forceReload: true` on every job completion adds unnecessary I/O**
`jobExists` always passes `forceReload: true`, which reads the store from disk on every call. It is invoked once per job completion on both the timed and manual paths. For installations with high-frequency cron jobs this accumulates into constant disk I/O even when no deletion is occurring. Consider checking the in-memory store first and only reloading if the job is not found (or accepting a small consistency window for this best-effort check).
How can I resolve this? If you propose a fix, please make it concise.Reviews (1): Last reviewed commit: "feat: add jobExists" | Re-trigger Greptile |
| if (!(await jobExists(state, id))) { | ||
| state.deps.log.info( | ||
| { jobId: id, jobName: job.name }, | ||
| "cron: dropping completed run for job removed during execution", | ||
| ); | ||
| return { jobId: id, taskRunId, status: "skipped", startedAt, endedAt: state.deps.nowMs() }; | ||
| } |
There was a problem hiding this comment.
Post-execution check does not prevent Feishu delivery
jobExists is called after executeJobCoreWithTimeout has already returned, meaning the job body (including any Feishu message send) has already completed. Returning status: "skipped" here only suppresses the post-run state update; it does not cancel or drop the message.
Additionally, this check is redundant for the timed path: applyOutcomeToStoredJob (called at line 788) already handles deleted jobs — it does jobs.find(...) after a force-reload, logs "job not found after forceReload, result discarded", and returns without persisting anything. The new jobExists call simply duplicates that guard while adding an extra lock acquisition and forced disk read on every job completion.
If the goal is to prevent delivery for jobs whose task was deleted before they start, the check should be inserted before executeJobCoreWithTimeout (inside runDueJob, after acquiring the lock to inspect whether the job is still present in the store).
Prompt To Fix With AI
This is a comment left during a code review.
Path: src/cron/service/timer.ts
Line: 737-743
Comment:
**Post-execution check does not prevent Feishu delivery**
`jobExists` is called after `executeJobCoreWithTimeout` has already returned, meaning the job body (including any Feishu message send) has already completed. Returning `status: "skipped"` here only suppresses the post-run state update; it does not cancel or drop the message.
Additionally, this check is redundant for the timed path: `applyOutcomeToStoredJob` (called at line 788) already handles deleted jobs — it does `jobs.find(...)` after a force-reload, logs "job not found after forceReload, result discarded", and returns without persisting anything. The new `jobExists` call simply duplicates that guard while adding an extra lock acquisition and forced disk read on every job completion.
If the goal is to prevent delivery for jobs whose task was deleted *before they start*, the check should be inserted before `executeJobCoreWithTimeout` (inside `runDueJob`, after acquiring the lock to inspect whether the job is still present in the store).
How can I resolve this? If you propose a fix, please make it concise.There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 867caa689e
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| // startedAt, | ||
| // endedAt: state.deps.nowMs(), | ||
| // }; | ||
| if (!(await jobExists(state, id))) { |
There was a problem hiding this comment.
Guard deleted jobs before executing due runs
The deletion check happens only after executeJobCoreWithTimeout has finished, so a job removed while waiting in the timer worker queue still runs and can emit side effects (for example, executeMainSessionCronJob enqueues the outbound system event during execution). In that scenario this branch relabels the run as skipped, but the message/work already happened, so deleting a scheduled task does not actually stop pending queued executions.
Useful? React with 👍 / 👎.
| coreResult = await executeJobCoreWithTimeout(state, executionJob); | ||
| if (await jobExists(state, jobId).then((exists) => !exists)) { |
There was a problem hiding this comment.
Guard manual runs before executing deleted jobs
This manual-run path has the same ordering issue: it executes executeJobCoreWithTimeout first and only then checks jobExists. If a manual run is deleted during execution, the side effects still occur and then the result is overwritten to skipped, which means deletion does not prevent in-flight queued/manual delivery and the recorded outcome no longer reflects what actually ran.
Useful? React with 👍 / 👎.
I think this PR is very useful. Here's why |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 2bb4547b9d
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| // Pre-execution check: if the job was removed between the due-job | ||
| // collection and now (e.g. by a concurrent `remove()` call), skip | ||
| // execution entirely so no delivery occurs. | ||
| if (!(await jobExists(state, id))) { |
There was a problem hiding this comment.
Handle job-existence reload errors in timer workers
runDueJob now calls jobExists before entering the try/catch, but jobExists force-reloads the cron store and can throw on transient read/parse failures (for example during concurrent external file rewrites). In that case the worker rejects, Promise.all(workers) aborts the whole tick, and outcomes from other jobs in the same tick may never be applied, leaving stale running/active state until a later recovery. Treat existence-check failures as per-job error outcomes instead of letting them fail the entire timer pass.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 6d856fc01f
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| // Pre-execution check: if the job was removed between the due-job | ||
| // collection and now (e.g. by a concurrent `remove()` call), skip | ||
| // execution entirely so no delivery occurs. | ||
| if (!(await jobExists(state, id))) { |
There was a problem hiding this comment.
Catch existence-check failures inside runDueJob
runDueJob performs await jobExists(...) before entering its try/catch, so any transient store read/parse error from the forced reload rejects the worker promise immediately. Because workers are aggregated with Promise.all, one rejection aborts the tick and skips the outcome-application block, leaving due jobs with persisted runningAtMs markers from the collection phase but without corresponding completion updates; this can strand jobs in a pseudo-running state and prevent future executions until later repair. Fresh evidence here is that the new pre-execution guard introduced at this call site is now the only uncaught path in runDueJob before worker aggregation.
Useful? React with 👍 / 👎.
|
Codex review: needs changes before merge. What this changes: The PR adds a cron Required change before merge: The remaining blockers are narrow and mechanical: handle Security review: Security review cleared: Security review cleared: the diff is limited to cron service implementation and tests, with no concrete security or supply-chain regression found. Review findings:
Review detailsBest possible solution: Land a revised cron-service fix that checks deletion after reservation but before delivery side effects for both timer and manual queued runs, treats store reload failures as per-job outcomes, preserves existing post-execution removal semantics, adds focused regression coverage, and includes the required changelog entry. Do we have a high-confidence way to reproduce the issue? Yes. A high-confidence source-level reproduction is to collect or queue a cron job, remove its stored job before the worker starts, and observe that current main still executes the captured Is this the best way to solve the issue? No. The PR’s direction is the narrow maintainable fix, but the timer guard must be moved inside Full review comments:
Overall correctness: patch is incorrect Acceptance criteria:
What I checked:
Likely related people:
Remaining risk / open question:
Codex review notes: model gpt-5.5, reasoning high; reviewed against a256745323ab. |
Summary
Change Type (select all)
Scope (select all touched areas)
Linked Issue/PR
Root Cause (if applicable)
Regression Test Plan (if applicable)
For bug fixes or regressions, name the smallest reliable test coverage that should catch this. Otherwise write
N/A.User-visible / Behavior Changes
Security Impact (required)
Repro + Verification
Environment
Steps
Expected
Actual
Evidence
Attach at least one:
Human Verification (required)
Verified scenarios:
Edge cases checked:
-Deleting a task with no queued executions.
-Deleting a task with queued but not yet consumed executions.
What you did not verify:
Review Conversations
If a bot review conversation is addressed by this PR, resolve that conversation yourself. Do not leave bot review conversation cleanup for maintainers.
Compatibility / Migration
Risks and Mitigations
List only real risks for this PR. Add/remove entries as needed. If none, write
None.Risks and Mitigations
Mitigation: Restrict deletion cleanup to queued executions explicitly associated with the deleted scheduled task ID.
Mitigation: Best-effort queue cleanup plus validation before delivery/execution if task existence is checked downstream.