feat(cron): parallel job execution within a single tick#9965
Closed
trueice wants to merge 1 commit into
Currently all due jobs run serially in tick(), so a slow job (e.g. a 9-minute data fetch) blocks everything else. This change submits jobs to a ThreadPoolExecutor (default 4 workers, configurable via the HERMES_CRON_MAX_WORKERS env var).
Key changes:
- tick() splits into Phase 1 (advance next_run under lock) and Phase 2 (parallel execution outside lock)
- os.environ session injection replaced with contextvars (thread-safe)
- load_dotenv() serialized with _dotenv_lock
- Backward compatible: HERMES_CRON_MAX_WORKERS=1 = serial behavior
- 155 lines of new tests for parallel execution
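To illustrate why contextvars is safer than os.environ for per-job session injection, here is a minimal sketch. The names session_id_var and run_job are hypothetical and are not taken from gateway/session_context.py; the point is only that a value set inside a copied context stays private to that job, whereas os.environ is a single process-wide mapping that concurrent jobs would overwrite.

```python
import contextvars
from concurrent.futures import ThreadPoolExecutor

# Hypothetical variable name; the real one in gateway/session_context.py may differ.
session_id_var = contextvars.ContextVar("session_id", default=None)


def current_session():
    """What a tool (e.g. a message sender) would call to find its session."""
    return session_id_var.get()


def run_job(session_id):
    # Run the job body inside a copy of the current context. The value set
    # there is invisible to other jobs, even if they reuse the same pool thread.
    def body():
        session_id_var.set(session_id)
        return current_session()

    return contextvars.copy_context().run(body)


with ThreadPoolExecutor(max_workers=2) as pool:
    # pool.map preserves input order, so each job reports its own session.
    seen = list(pool.map(run_job, ["job-a", "job-b", "job-c"]))
```

Three jobs share two worker threads here, yet each one reads back only the session it set, and the main thread still sees the default value of None.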
Collaborator
Likely duplicate of merged PR #13021, which already implements parallel cron job execution to prevent serial tick starvation.
Contributor
Thanks for the contribution @trueice! This feature has already landed on
Automated hermes-sweeper review found the following evidence that this PR is superseded:
@alt-glitch also noted this duplication in their review comment. Closing as implemented.
Problem
Currently all due jobs in tick() run serially in a for-loop. If one job takes 9 minutes (e.g. a daily data fetch), all other due jobs are blocked until it finishes.
Solution
Split tick() into two phases:
- Phase 1 (under lock): advance next_run_at for all due jobs
- Phase 2 (outside lock): run the due jobs in parallel on a ThreadPoolExecutor
Key changes
- HERMES_CRON_MAX_WORKERS env var (default 4) controls max parallel jobs
- os.environ → contextvars for session/delivery injection (thread-safe, no cross-job leakage)
- load_dotenv() serialized with a threading lock
- HERMES_CRON_MAX_WORKERS=1 = serial behavior (identical to current code)
Files changed
- cron/scheduler.py
- cron/jobs.py
- gateway/session_context.py
- tools/send_message_tool.py
- tests/cron/test_scheduler.py
Safety
- Each job uses its own AIAgent instance, so no state is shared
- save_job_output() writes to per-job directories, so there are no conflicts
- mark_job_run() uses file-level locking for jobs.json updates
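The two-phase split described under Solution can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from cron/scheduler.py: the Job dataclass, _jobs_lock, max_workers(), and this tick() signature are all hypothetical, and only HERMES_CRON_MAX_WORKERS and its default of 4 come from the description above.

```python
import os
import threading
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Job:  # hypothetical stand-in for a stored cron job
    name: str
    interval: float
    next_run_at: float
    run: Callable[[], str]


_jobs_lock = threading.Lock()  # hypothetical; guards schedule updates


def max_workers() -> int:
    """Read HERMES_CRON_MAX_WORKERS, defaulting to 4 and never going below 1."""
    try:
        return max(1, int(os.environ.get("HERMES_CRON_MAX_WORKERS", "4")))
    except ValueError:
        return 4


def tick(jobs: List[Job], now: float) -> Dict[str, str]:
    # Phase 1: under the lock, select due jobs and advance their schedule
    # so a later tick cannot pick up the same run again.
    with _jobs_lock:
        due = [job for job in jobs if job.next_run_at <= now]
        for job in due:
            job.next_run_at = now + job.interval

    results: Dict[str, str] = {}
    if not due:
        return results

    # Phase 2: outside the lock, run the due jobs on a bounded thread pool.
    with ThreadPoolExecutor(max_workers=max_workers()) as pool:
        futures = {job.name: pool.submit(job.run) for job in due}
        for name, future in futures.items():
            try:
                results[name] = future.result()
            except Exception as exc:  # one failing job must not hide the others
                results[name] = f"error: {exc}"
    return results
```

Because the schedule is advanced before any job starts, a slow job only occupies one worker and cannot be picked up twice. With HERMES_CRON_MAX_WORKERS=1 the pool has a single thread, so jobs run one after another as they did in the serial loop.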