✨ feat(task): wire QStash-driven heartbeat self-rescheduling #14199
…or heterogeneous agents Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Implements LOBE-8233: heartbeat tasks now self-arm via QStash delayed publish (or LocalScheduler setTimeout in dev). After each topic completes, TaskLifecycleService re-arms the next tick based on current DB state, with a 3-strike fuse on consecutive errors and a skip-when-urgent-brief guard. Adds /heartbeat-tick + /watchdog workflow handlers (signed) and extracts TaskRunnerService from the task.run mutation so both router and tick handler share one runner.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
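The dev-mode path mentioned in this commit (a `setTimeout`-backed local scheduler) could look roughly like the following. This is a minimal sketch, not the PR's `LocalTaskScheduler`: the class name, the method names, the millisecond delay unit, and the injectable `Timers` interface are all assumptions made for illustration and testability.

```typescript
// Hypothetical sketch of a setTimeout-backed scheduler; names are illustrative only.
interface Timers {
  set(fn: () => void, ms: number): unknown;
  clear(handle: unknown): void;
}

// Default timers delegate to the global setTimeout / clearTimeout.
const realTimers: Timers = {
  set: (fn, ms) => setTimeout(fn, ms),
  clear: (handle) => clearTimeout(handle as ReturnType<typeof setTimeout>),
};

class LocalSchedulerSketch {
  private pending = new Map<string, unknown>();
  private seq = 0;
  private onTick: (taskId: string) => void;
  private timers: Timers;

  constructor(onTick: (taskId: string) => void, timers: Timers = realTimers) {
    this.onTick = onTick;
    this.timers = timers;
  }

  /** Arm one delayed tick and return an id that can later be used to cancel it. */
  scheduleNextTick(taskId: string, delayMs: number): string {
    const messageId = `local-${++this.seq}`;
    const handle = this.timers.set(() => {
      // A fired tick is no longer pending, so forget it before invoking the callback.
      this.pending.delete(messageId);
      this.onTick(taskId);
    }, delayMs);
    this.pending.set(messageId, handle);
    return messageId;
  }

  /** Cancel a pending tick. Returns false if it already fired or was never scheduled. */
  cancel(messageId: string): boolean {
    if (!this.pending.has(messageId)) return false;
    this.timers.clear(this.pending.get(messageId));
    this.pending.delete(messageId);
    return true;
  }
}
```

Returning a string id from `scheduleNextTick` mirrors the idea of a QStash `messageId`, so a caller could persist and cancel it the same way regardless of which scheduler is active.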
Codecov Report
❌ Patch coverage is
@@ Coverage Diff @@
## canary #14199 +/- ##
==========================================
+ Coverage 67.85% 67.86% +0.01%
==========================================
Files 2227 2232 +5
Lines 191364 191574 +210
Branches 23747 23777 +30
==========================================
+ Hits 129847 130019 +172
- Misses 61388 61426 +38
Partials 129 129
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: d6132bc6c2
…m typing
- TaskLifecycle re-arm now excludes type='error' urgent briefs from the human-waiting check; the fresh error brief from onTopicComplete was always present and stalled retries after the very first failure, making the 3-strike fuse unreachable.
- TaskRunner only rolls back running→paused when *this* invocation set the running state; heartbeatTick treats CONFLICT as a graceful 'in-flight' skip so overlapping ticks don't 500 or clobber the in-flight run's status.
- buildTaskPrompt now types its task arg + getReviewConfig as TaskItem (the prompts package already depends on @lobechat/types) so server TaskModel methods are assignable without parameter contravariance errors.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
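The re-arm rules described in this commit (3-strike fuse, plus an urgent-brief guard that ignores `type='error'` briefs) can be sketched as a pure decision function. Everything here is an assumption for illustration: the `decideRearm` name, the input and output shapes, the order of the checks, and the choice to reset the failure counter on a non-error completion are not taken from the PR's code.

```typescript
// Illustrative only: field names and check ordering are assumptions, not the PR's types.
type CompletionReason = "done" | "error";

interface BriefLike {
  priority: string; // the guard looks for "urgent"
  type: string; // "error" briefs are excluded from the human-waiting check
}

interface RearmInput {
  automationMode: string | null;
  isTerminal: boolean;
  reason: CompletionReason;
  consecutiveFailures: number; // value persisted before this completion
  briefs: BriefLike[];
}

interface RearmDecision {
  rearm: boolean;
  why: "ok" | "not-heartbeat" | "terminal" | "fuse" | "urgent-brief";
  consecutiveFailures: number; // value to persist after this completion
}

const MAX_CONSECUTIVE_FAILURES = 3;

function decideRearm(input: RearmInput): RearmDecision {
  const previous = input.consecutiveFailures;
  if (input.automationMode !== "heartbeat") {
    return { rearm: false, why: "not-heartbeat", consecutiveFailures: previous };
  }
  if (input.isTerminal) {
    return { rearm: false, why: "terminal", consecutiveFailures: previous };
  }

  // Assumption: any non-error completion resets the consecutive-failure counter.
  const failures = input.reason === "error" ? previous + 1 : 0;
  if (failures >= MAX_CONSECUTIVE_FAILURES) {
    return { rearm: false, why: "fuse", consecutiveFailures: failures };
  }

  // The fresh error brief must not count as "a human is waiting",
  // otherwise the first failure would block every retry and the fuse could never trip.
  const humanWaiting = input.briefs.some(
    (brief) => brief.priority === "urgent" && brief.type !== "error",
  );
  if (humanWaiting) {
    return { rearm: false, why: "urgent-brief", consecutiveFailures: failures };
  }

  return { rearm: true, why: "ok", consecutiveFailures: failures };
}
```

With this shape, a first failure that leaves only an urgent error brief behind still re-arms, and the third consecutive failure stops re-arming so the error brief can surface for human action.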
…nature verification
Three handlers (on-topic-complete, heartbeat-tick, watchdog) duplicated the same `c.req.text() → verifyQStashSignature → 401` boilerplate. Extracted to src/server/workflows-hono/middlewares/qstashAuth.ts and mounted on the routes; handlers now just `c.req.json()` (Hono cross-converts the cached body so the middleware reading text() doesn't break json() in the handler).
Note: this is for one-shot QStash webhook receivers. Upstash *Workflow* endpoints (memory-user-memory) keep using `serve()` from `@upstash/workflow/hono`, which has its own built-in verification.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
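The shape of such a shared auth middleware can be sketched without depending on Hono itself. The sketch below is hedged throughout: `ContextLike` and `Next` are minimal stand-ins for the Hono types, `verify` is an injected placeholder for `verifyQStashSignature` (whose real signature is not shown in this PR text), and the skip-when-no-key branch follows the behavior the PR description attributes to an unset `QSTASH_CURRENT_SIGNING_KEY`. The `Upstash-Signature` header name is the one QStash sends with its requests.

```typescript
// Framework-free sketch; the types below are assumed stand-ins, not Hono's own.
interface RequestLike {
  header(name: string): string | undefined;
  text(): Promise<string>;
}

interface ContextLike {
  req: RequestLike;
}

interface Rejection {
  status: number;
  body: string;
}

type Next = () => Promise<void>;
type Verify = (signature: string, rawBody: string) => Promise<boolean>;

function qstashAuthSketch(verify: Verify, signingKey: string | undefined) {
  return async (c: ContextLike, next: Next): Promise<Rejection | undefined> => {
    // No signing key configured: skip verification entirely (e.g. local development).
    if (!signingKey) {
      await next();
      return undefined;
    }

    const signature = c.req.header("Upstash-Signature");
    // Signatures cover the raw bytes, so the middleware reads the body as text.
    const rawBody = await c.req.text();

    if (!signature || !(await verify(signature, rawBody))) {
      return { status: 401, body: "Invalid signature" };
    }

    await next();
    return undefined;
  };
}
```

Injecting `verify` keeps the three route handlers free of signature logic and lets the middleware be exercised with a fake verifier.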
…hestrator, not a renderer)
Putting buildTaskPrompt under @lobechat/prompts was a layering mistake: the function does ~10 DB calls (briefs / topics / subtasks / dep identifier resolution / parent task assembly) and just maps the rows through to buildTaskRunPrompt at the end. The prompts package should stay pure rendering; buildTaskRunPrompt already lives there as the actual renderer. Moving the orchestrator back to src/server/services/taskRunner/ also lets it import model classes directly instead of structurally-typed deps, dropping the TaskPromptDeps abstraction.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
Implements LOBE-8233: heartbeat tasks now actually heartbeat. Previously `LocalTaskScheduler` was wired to no callers and `QStashTaskScheduler` was a TODO; `onTopicComplete` ran once and stopped.

After this PR, every task with `automationMode='heartbeat'` self-arms its next run via QStash delayed publish (or LocalScheduler `setTimeout` in dev). DB is the state authority: every tick re-reads task state and may decide to skip rather than run.

Mechanism (per design doc)
- `TaskLifecycleService.onTopicComplete`: after the existing terminal/pause logic, schedule the next tick with `delay = task.heartbeatInterval`. Persist `tickMessageId` / `scheduledAt` / `consecutiveFailures` under `tasks.context.scheduler.*` (JSONB pocket, no schema migration).
- Three consecutive `error` reasons → stop re-arming and let the urgent error brief surface for human action.
- A `priority='urgent'` brief blocks re-arm (covers review max-iter and fuse cases without a new schema column).
- `QStashTaskScheduler`: thin wrapper over `qstashClient.publishJSON({ delay })` + `messages.delete(messageId)`. `LocalTaskScheduler` already existed, now actually invoked.
- `/api/workflows/task/watchdog` handler reuses `TaskModel.findStuckTasks`. Cloud schedule registration is left to a one-time runbook (intentionally not auto-registered to avoid duplicate `schedules.create`).
- Signature verification on `/on-topic-complete`, `/heartbeat-tick`, `/watchdog` (skipped when `QSTASH_CURRENT_SIGNING_KEY` is unset, matching existing `verifyQStashSignature` behavior).

Refactors
- Extract `TaskRunnerService.runTask` from the `task.run` mutation (~180 lines → 14-line wrapper). Both router and tick handler share one runner.
- Move `buildTaskPrompt` into `@lobechat/prompts` with structurally-typed model deps so the prompts package stays free of `@lobechat/database` imports.

Out of scope (deferred per design doc)
- `start/pause/resume/cancel` extraction to `TaskRunnerService`.
- `setInterval`.

Test plan
- `bunx vitest run src/server/services/taskScheduler` (qstash + local, 19 tests)
- `bunx vitest run src/server/services/taskLifecycle` (10 re-arm tests covering done / error / fuse / urgent-skip / terminal-skip / non-heartbeat)
- `bunx vitest run src/server/routers/lambda/__tests__/integration/task.integration.test.ts` (19 tests still pass after `task.run` thin-wrapper refactor)
- `bunx vitest run packages/database src/models/__tests__/{brief,task}.test.ts` (77 tests, no regressions)
- `bun run type-check` clean (only pre-existing `.next/dev/types` artifact error)
- With `AGENT_RUNTIME_MODE=queue`, run a heartbeat task, confirm `messageId` appears in QStash dashboard and tick fires after `heartbeatInterval`.

🤖 Generated with Claude Code
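As a rough illustration of the "thin wrapper" and "JSONB pocket" ideas from the Mechanism section, the sketch below wraps a delayed publish and a delete behind a small scheduler class and merges the resulting state into a task context. It is written against a structural `QStashLikeClient` interface rather than the real SDK so it stays self-contained; that interface, the class and helper names, the `{ taskId }` payload, and the assumption that the delay is expressed in seconds are all illustrative, not the PR's actual code.

```typescript
// Structural subset of a QStash-style client; an assumed shape for this sketch.
interface QStashLikeClient {
  publishJSON(request: { url: string; body: unknown; delay?: number }): Promise<{ messageId: string }>;
  messages: { delete(messageId: string): Promise<unknown> };
}

// Fields named in the PR description as living under tasks.context.scheduler.*
interface SchedulerPocket {
  tickMessageId?: string;
  scheduledAt?: string;
  consecutiveFailures?: number;
}

type TaskContext = { scheduler?: SchedulerPocket; [key: string]: unknown };

class QStashSchedulerSketch {
  private client: QStashLikeClient;
  private tickUrl: string;

  constructor(client: QStashLikeClient, tickUrl: string) {
    this.client = client;
    this.tickUrl = tickUrl;
  }

  /** Publish one delayed tick and return the message id for later cancellation. */
  async scheduleNextTick(taskId: string, delaySeconds: number): Promise<string> {
    const { messageId } = await this.client.publishJSON({
      url: this.tickUrl,
      body: { taskId }, // hypothetical payload shape
      delay: delaySeconds,
    });
    return messageId;
  }

  /** Cancel a previously scheduled tick by its message id. */
  async cancel(messageId: string): Promise<void> {
    await this.client.messages.delete(messageId);
  }
}

/** Merge a patch into context.scheduler without touching other context keys. */
function withSchedulerState(context: TaskContext, patch: SchedulerPocket): TaskContext {
  return { ...context, scheduler: { ...context.scheduler, ...patch } };
}
```

Because the tick handler re-reads task state from the database on every run, a stale message that fires after cancellation fails or races can still be skipped safely; the stored `tickMessageId` is a convenience for cleanup, not the source of truth.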