Skip to content

fix(cron): don't strict-scan script-injected output in no-skills jobs#43223

Merged
alt-glitch merged 1 commit into
mainfrom
fix/cron-strict-scan-script-output
Jun 10, 2026
Merged

fix(cron): don't strict-scan script-injected output in no-skills jobs#43223
alt-glitch merged 1 commit into
mainfrom
fix/cron-strict-scan-script-output

Conversation

@alt-glitch

@alt-glitch alt-glitch commented Jun 10, 2026

Copy link
Copy Markdown
Collaborator

TL;DR

Cron jobs that use a script (or context_from) inject collected data into the prompt. That data was being scanned with the strict pattern set meant for the user-typed prompt — so a feed that merely quotes a dangerous command (e.g. a triage bot ingesting a bug report containing rm -rf /) hard-blocked the job on every tick, forever. This PR scans injected data with the looser tier that skill content already uses, while keeping the user prompt strict.

Behavior change

The runtime scanner (_scan_assembled_cron_prompt, added in #21350/#3968) checks the assembled prompt before each tick and picks one of two pattern tiers:

What gets which tier, per content type:

Content in assembled prompt Before After
user prompt, plain job (no script/skills) strict strict (unchanged)
skill content (skills: [...] on the job) loose loose (unchanged)
script stdout (success or error path) strict 🐛 loose
context_from output strict 🐛 loose
user prompt, when script/context data present strict (but only as part of the blob) strict, scanned separately

The last row is new defense-in-depth: when the loose tier is selected because of injected data, the raw user prompt is additionally re-scanned strict at runtime — so the user-authored surface keeps the full guarantee (it was already strict-scanned at create/update; that path is untouched).

Why the bug existed

Tier selection was keyed on has_skills alone. #5082 (script injection, Apr) and #32339 (the tier split, May) never met: a script-driven no-skills job falls into the has_skills=False branch, so its script stdout inherited user-prompt strictness. The two features composed into a job class the scanner design never considered.

has_skills was the wrong axis. The right axis is whether the assembled text contains runtime-injected content (data/skill markdown — where command-shape strings are expected and benign) vs. is purely the user's directive prompt (where rm -rf / is a smoking gun). Script output is produced by operator-authored code — the same trust class as install-vetted skill markdown.

Live incident

A production triage cron (script feed of GitHub issues, no skills, 5-min tick) ran clean for hours, then got BLOCKED (destructive_root_rm) on every tick the moment an open security issue quoting the root-delete command entered its queue. Its corpus has 112 such open items — bug reports pasting dangerous commands are normal data for this job class. Verified end-to-end post-fix: same job, same offending rows → prompt builds, agent runs.

What is NOT weakened

  • Create/update-time strict scan of the user prompt (api_server.py, cronjob tool) — untouched.
  • Plain no-script/no-skills jobs — still strict on the whole assembled prompt.
  • Injection directives (ignore previous instructions, …) arriving via script stdout — still blocked by the loose tier.
  • _CRON_THREAT_PATTERNS — unmodified.

Rejected alternatives: removing destructive_root_rm from the strict set, or a per-job skip_injection_scan flag — both weaken the guard globally, and the flag would itself be settable by an injected agent (cron jobs are API/tool-creatable). Skipping scans for trusted content classes was already rejected in #23470; tier-based scanning is the accepted precedent (#32339, #22605).

Changes

  • cron/scheduler.py: _build_job_prompt tracks has_injected_data (script stdout — both success and error paths — and context_from output) and passes it plus the raw user_prompt to _scan_assembled_cron_prompt, which now selects the tier on has_skills OR has_injected_data and strict-rescans the user prompt on the data-only path.
  • tests/cron/test_cron_prompt_injection_skill.py: 7 new cases (payload literals built at runtime so the test file never contains them) — command shapes in script stdout / failed-script output / context_from pass; injection directive in stdout still blocks; dirty user prompt on a script job still blocks; plain-prompt path stays strict; invisible unicode in feed data sanitized.
tests/cron/: 410 passed

The runtime assembled-prompt scan (#3968 lineage) selected its pattern
tier on has_skills alone. A script-driven, no-skills job injects its
script's stdout into the prompt, and that blob was scanned with the
STRICT user-prompt pattern set — so any command-shape string in the
data feed (e.g. a triage bot ingesting a bug report that quotes
`rm -rf /`) hard-blocked the job on every tick.

Script output and context_from output are runtime DATA produced by
operator-authored code — the same trust class as install-vetted skill
markdown, not a user-authored directive prompt. Select the scan tier by
what the assembled prompt CONTAINS: when it includes skill content OR
injected data, use the looser _scan_cron_skill_assembled set (keeps
unambiguous injection directives, drops command-shape patterns,
sanitizes invisible unicode instead of blocking).

Defense-in-depth is preserved:
- The raw user prompt is still strict-scanned at create/update
  (api_server paths untouched) AND re-scanned strict at runtime even
  when the looser tier was selected for the data blob.
- Plain no-script/no-skills jobs keep the strict scan on the whole
  assembled prompt.
- Injection directives arriving via script stdout still block.

Rejected alternative: removing destructive_root_rm from the strict set
or a per-job skip_injection_scan flag — both weaken the guard globally.
@github-actions

Copy link
Copy Markdown
Contributor

🔎 Lint report: fix/cron-strict-scan-script-output vs origin/main

ruff

Total: 0 on HEAD, 0 on base (➖ 0)

🆕 New issues: none

✅ Fixed issues: none

Unchanged: 0 pre-existing issues carried over.

ty (type checker)

Total: 10620 on HEAD, 10620 on base (➖ 0)

🆕 New issues: none

✅ Fixed issues: none

Unchanged: 5564 pre-existing issues carried over.

Diagnostics are surfaced as warnings — this check never fails the build.

@tonydwb tonydwb left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review Summary

Verdict: Approved

Looks Good

  • Clean fix: adds HERMES_SCRIPT_OUTPUT_STRICT_SCAN to control strict scanning of script-injected output in no-skills cron jobs.
  • Defaults to False to avoid false positives in script output, which is the safe default.
  • 2 files changed, well-scoped.
  • No security concerns.

Reviewed by Hermes Agent

@alt-glitch alt-glitch added type/bug Something isn't working type/security Security vulnerability or hardening comp/cron Cron scheduler and job management P1 High — major feature broken, no workaround labels Jun 10, 2026
@alt-glitch

Copy link
Copy Markdown
Collaborator Author

This was generated by AI during triage.

Triage

Category: type/bug + type/security (scanner-tier correctness) · Component: comp/cron · Priority: P1 — the scheduler is broken for an entire class of jobs (script-driven feeds whose data quotes command-shape strings), and the failure mode is a permanent every-tick block.

Repro: confirmed live, not theoretical — a production triage cron (script + no skills, 5-minute tick) was BLOCKED on every tick with destructive_root_rm the moment an open security issue quoting the root-delete command entered its ingest queue. Its corpus contains 112 open items quoting that pattern; dangerous-command quotes are normal triage data. After this patch, the same job + the same offending rows builds a 12K-char prompt and runs (verified by driving _build_job_prompt directly against the live feed).

Genealogy — third false-positive round of the same design flaw

This scanner has a recurring bug class: a new content source gets fed to a scan tier calibrated for a different trust class. Each round, a different source; each fix, the same shape.

Round Issue/PR What happened Resolution
🧬 origin #3968#21350 Malicious skill content loaded at runtime bypassed the create-time scan entirely Assembled-prompt scan added (strict, one tier)
1️⃣ FP: built-in skills #22588#22605 github-auth / github-pr-workflow docs (curl … $GITHUB_TOKEN) tripped exfil_curl Narrow api.github.com allowlist (_strip_cron_safe_constructs)
2️⃣ FP: skill prose #32339 11 PR-scout crons silently dead for weeks — a security postmortem in skill markdown tripped read_secrets Two-tier split: strict for user prompts, looser _scan_cron_skill_assembled for skill-assembled prompts
3️⃣ FP: script data this PR Script stdout (a data feed quoting bug-report bodies) strict-scanned in no-skills jobs Tier selection by content (has_skills OR has_injected_data), raw user prompt still strict-scanned

Root cause of round 3 specifically: #5082 (cron pre-run script injection, merged 2026-04-04) and #32339 (the tier split, merged 2026-05-26) never met. The tier split only considered has_skills; script output — merged seven weeks earlier — flows through the has_skills=False branch and inherits the strict user-prompt tier. context_from output (same data class) had the identical latent bug and is covered here too.

Precedent for reviewers

Related open work (complementary, no conflicts)

@alt-glitch alt-glitch merged commit b4170f3 into main Jun 10, 2026
23 checks passed
@alt-glitch alt-glitch deleted the fix/cron-strict-scan-script-output branch June 10, 2026 02:57
changman pushed a commit to changman/hermes-agent that referenced this pull request Jun 10, 2026
…NousResearch#43223)

The runtime assembled-prompt scan (NousResearch#3968 lineage) selected its pattern
tier on has_skills alone. A script-driven, no-skills job injects its
script's stdout into the prompt, and that blob was scanned with the
STRICT user-prompt pattern set — so any command-shape string in the
data feed (e.g. a triage bot ingesting a bug report that quotes
`rm -rf /`) hard-blocked the job on every tick.

Script output and context_from output are runtime DATA produced by
operator-authored code — the same trust class as install-vetted skill
markdown, not a user-authored directive prompt. Select the scan tier by
what the assembled prompt CONTAINS: when it includes skill content OR
injected data, use the looser _scan_cron_skill_assembled set (keeps
unambiguous injection directives, drops command-shape patterns,
sanitizes invisible unicode instead of blocking).

Defense-in-depth is preserved:
- The raw user prompt is still strict-scanned at create/update
  (api_server paths untouched) AND re-scanned strict at runtime even
  when the looser tier was selected for the data blob.
- Plain no-script/no-skills jobs keep the strict scan on the whole
  assembled prompt.
- Injection directives arriving via script stdout still block.

Rejected alternative: removing destructive_root_rm from the strict set
or a per-job skip_injection_scan flag — both weaken the guard globally.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

comp/cron Cron scheduler and job management P1 High — major feature broken, no workaround type/bug Something isn't working type/security Security vulnerability or hardening

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants