Skip to content

fix(cron): bound the desktop run-history query to one job#41088

Merged
teknium1 merged 1 commit into
mainfrom
hermes/hermes-4fd7587d
Jun 7, 2026
Merged

fix(cron): bound the desktop run-history query to one job#41088
teknium1 merged 1 commit into
mainfrom
hermes/hermes-4fd7587d

Conversation

@teknium1

@teknium1 teknium1 commented Jun 7, 2026

Copy link
Copy Markdown
Contributor

Summary

The desktop cron run-history load no longer times out on large cron histories — its query is now bounded to one job instead of scanning the whole cron pile.

Root cause: the run-history endpoint (GET /api/cron/jobs/{id}/runs, added in #40684) reused list_sessions_rich's order_by_last_active path with a leading-wildcard id_query. That routes through the recursive compression-chain CTE, which seeds from every source='cron' row in the DB and runs per-row preview/last_active subqueries before filtering to one job and applying LIMIT. Cost scaled with total cron history, so a heavy pile timed out before eventually populating. Cron runs are flat, never-compressed sessions (cron_{job_id}_{ts}), so the chain machinery was pure overhead and the job binding is a true prefix, not a substring.

Changes

  • hermes_state.py: new SessionDB.list_cron_job_runs() — bounded [prefix, hi) id-range scan on source='cron', ORDER BY started_at DESC, same preview/last_active enrichment. No CTE, no %...% LIKE.
  • hermes_state.py: add idx_sessions_source_id(source, id) so the range is an index scan; bump SCHEMA_VERSION 14 → 15 (index reconciles onto existing DBs via CREATE INDEX IF NOT EXISTS on startup).
  • hermes_cli/web_server.py: point the endpoint at the new method.
  • tests/test_hermes_state.py: TestListCronJobRuns — scoping, newest-first, enrichment, substring-collision exclusion, non-cron exclusion, paging, index-range query plan.

Validation

Real SessionDB, 30k cron rows, target job ~1k runs:

Before (chain CTE + LIKE) After (prefix range scan)
run-history query ~85 ms ~5 ms
scales with total cron history requested window
query plan full cron seed + correlated subqueries SEARCH s USING INDEX idx_sessions_source_id

Targeted suite: tests/test_hermes_state.py (24 passed) + tests/hermes_cli/test_web_server.py -k cron (5 passed).

Follow-up still open from #40684: a core cron: retention policy (keep-last-N / age-out) to cap the on-disk pile. Out of scope here — this PR only fixes the read-path timeout.

Infographic

cron-run-history-bound-to-one-job

The cron run-history endpoint (GET /api/cron/jobs/{id}/runs, added in
#40684) reused list_sessions_rich's order_by_last_active path with a
leading-wildcard id_query. That routes through the recursive
compression-chain CTE, which seeds from EVERY source='cron' row in the DB
and runs per-row preview/last_active subqueries before filtering to one
job and applying LIMIT. Work scaled with the total cron history, so a
large pile made the run-history load time out before eventually
populating.

Cron runs are flat, never-compressed sessions with ids of the form
cron_{job_id}_{ts}, so the chain machinery is pure overhead and the
job binding is a true prefix, not a substring.

- New SessionDB.list_cron_job_runs(): bounded [prefix, hi) id-range scan
  on source='cron', ordered by started_at DESC, with the same
  preview/last_active enrichment. No CTE, no leading-wildcard LIKE.
- Add idx_sessions_source(source, id) so the range is an index scan;
  bump SCHEMA_VERSION 14 -> 15 (index reconciles onto existing DBs via
  CREATE INDEX IF NOT EXISTS on startup).
- Point the endpoint at the new method.

Measured on a real SessionDB with 30k cron rows: 5ms vs 85ms for the old
path (16x), and the new path stays flat as the pile grows while the old
one scaled with it. Verified the query plan uses idx_sessions_source_id
(range scan, no full table scan), runs are correctly scoped (substring
collisions like cron_xalpha_ excluded), newest-first, and paged.
@teknium1 teknium1 force-pushed the hermes/hermes-4fd7587d branch from 3d40081 to 287ce94 Compare June 7, 2026 08:54
@github-actions

github-actions Bot commented Jun 7, 2026

Copy link
Copy Markdown
Contributor

🔎 Lint report: hermes/hermes-4fd7587d vs origin/main

ruff

Total: 0 on HEAD, 0 on base (➖ 0)

🆕 New issues: none

✅ Fixed issues: none

Unchanged: 0 pre-existing issues carried over.

ty (type checker)

Total: 9986 on HEAD, 9985 on base (🆕 +1)

🆕 New issues: none

✅ Fixed issues: none

Unchanged: 5178 pre-existing issues carried over.

Diagnostics are surfaced as warnings — this check never fails the build.

@alt-glitch alt-glitch added type/perf Performance improvement or optimization P3 Low — cosmetic, nice to have comp/cli CLI entry point, hermes_cli/, setup wizard comp/cron Cron scheduler and job management labels Jun 7, 2026
@teknium1 teknium1 merged commit ed81cfe into main Jun 7, 2026
23 checks passed
@teknium1 teknium1 deleted the hermes/hermes-4fd7587d branch June 7, 2026 09:41
agogo233 added a commit to agogo233/hermes-agent that referenced this pull request Jun 8, 2026
* upstream/main: (430 commits)
  fix(yuanbao): bound ws.close() so an idle server can't stall shutdown ~5s (NousResearch#40607)
  docs: add Urdu translation of README (NousResearch#40578)
  fix(hindsight): send only new-turn delta on append retains instead of whole session (NousResearch#40605)
  feat(gateway): render terminal tool calls as native bash code blocks on markdown platforms (NousResearch#41215)
  feat(desktop): stop the chat viewport from following streaming output (NousResearch#41414)
  chore(release): map AlchemistChaos co-author email for NousResearch#40135 salvage
  fix(desktop): recover chat after sleep/wake by revalidating a stale remote backend
  fix(web): make _has_env config-aware so SEARXNG_URL auto-detect honors Hermes config
  fix(web): honor Hermes config-aware SEARXNG_URL lookup
  install.sh: hint at root-owned npm cache when desktop npm install fails (NousResearch#39688)
  fix(tools): percent-encode non-ascii URL components
  fix(skills): browse shows full catalog, not first 5000 (NousResearch#41413)
  feat(desktop+gateway): remote media relay — attach images/PDFs and display gateway images over the network
  feat(desktop): full tool-backend config (pickers + per-backend settings) in Settings (NousResearch#41232)
  hardening(api-server): scan cron prompts on REST create/update for parity with the agent tool
  fix: skip MCP preflight content-type probe on reconnect when already ready (NousResearch#40604)
  fix(kanban): sweep deferred scratch parent on non-scratch child completion + tests
  fix: defer scratch workspace cleanup when task has active children (NousResearch#33774)
  feat(onboarding): opt-in structured profile-build path on first contact (NousResearch#41114)
  feat(compression): temporal anchoring in compaction summaries (NousResearch#41102)
  test(discord): align clarify/model-picker tests with fail-closed component auth (NousResearch#41338)
  chore(release): map Dusk1e and LaPhilosophie for approval fail-closed salvage (NousResearch#33844, NousResearch#33866, NousResearch#30964)
  fix(discord): fail closed for component button auth when no allowlist set
  fix(feishu): fail closed for update prompt card actions
  fix(slack): re-check gateway auth on approval and slash-confirm buttons
  fix: guard int(os.getenv()) casts against malformed env vars (NousResearch#40598)
  fix: respect Honcho env var fallback in doctor and honcho status
  chore(release): add synapsesx to AUTHOR_MAP for NousResearch#40495 salvage
  fix(research): keep tool_call/tool_response pairs intact when compressing trajectories
  fix(simplex): accept display name in SIMPLEX_ALLOWED_USERS
  fix(desktop): make the running-turn timer per-session (NousResearch#41182)
  test(approval): regression for shell-escape denylist bypass (NousResearch#36846, NousResearch#36847)
  fix(security): strip shell escapes in denylist normalizer; fail-closed on missing approval module
  fix(stream+output-cap): guard empty streams and parse OpenRouter output-cap errors (NousResearch#40589)
  fix(desktop): bootstrap falls back to installed agent install.sh on GitHub 404
  feat(dashboard): change UI font from the theme picker, independent of theme (NousResearch#41145)
  fix(cli): return bool (not None) when a destructive-slash confirmation is cancelled (NousResearch#40583)
  fix(desktop): preserve configured base_url on same-provider model switch (NousResearch#41121)
  fix(desktop): stop bare-URL autolinker swallowing trailing emphasis asterisks (NousResearch#41093)
  fix(cron): bound the desktop run-history query to one job (NousResearch#41088)
  fix(desktop): scope in-session /model switch per-session, stop process-env leak (NousResearch#41120)
  chore: map bmoore210 author email for PR NousResearch#40550 salvage
  fix(desktop): scope session list to active profile + longer timeout
  fix: harden gateway startup and turn persistence
  fix(computer_use): honor custom vision routing
  fix(aux): honor model.default_headers on auxiliary client too (NousResearch#40033)
  fix(agent): honor model.default_headers for custom OpenAI-compatible providers (NousResearch#40033)
  docs(i18n): port deep-audit corrections to zh-Hans mirror (NousResearch#41104)
  fix(compression): don't overwrite the -1 post-compression sentinel in preflight seed (NousResearch#36718)
  chore(release): map singhsanidhya741@gmail.com to sanidhyasin (NousResearch#41094)
  ...
changman pushed a commit to changman/hermes-agent that referenced this pull request Jun 10, 2026
…ch#41088)

The cron run-history endpoint (GET /api/cron/jobs/{id}/runs, added in
NousResearch#40684) reused list_sessions_rich's order_by_last_active path with a
leading-wildcard id_query. That routes through the recursive
compression-chain CTE, which seeds from EVERY source='cron' row in the DB
and runs per-row preview/last_active subqueries before filtering to one
job and applying LIMIT. Work scaled with the total cron history, so a
large pile made the run-history load time out before eventually
populating.

Cron runs are flat, never-compressed sessions with ids of the form
cron_{job_id}_{ts}, so the chain machinery is pure overhead and the
job binding is a true prefix, not a substring.

- New SessionDB.list_cron_job_runs(): bounded [prefix, hi) id-range scan
  on source='cron', ordered by started_at DESC, with the same
  preview/last_active enrichment. No CTE, no leading-wildcard LIKE.
- Add idx_sessions_source(source, id) so the range is an index scan;
  bump SCHEMA_VERSION 14 -> 15 (index reconciles onto existing DBs via
  CREATE INDEX IF NOT EXISTS on startup).
- Point the endpoint at the new method.

Measured on a real SessionDB with 30k cron rows: 5ms vs 85ms for the old
path (16x), and the new path stays flat as the pile grows while the old
one scaled with it. Verified the query plan uses idx_sessions_source_id
(range scan, no full table scan), runs are correctly scoped (substring
collisions like cron_xalpha_ excluded), newest-first, and paged.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

comp/cli CLI entry point, hermes_cli/, setup wizard comp/cron Cron scheduler and job management P3 Low — cosmetic, nice to have type/perf Performance improvement or optimization

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants