Skip to content

fix(plugins): restrict disk-cleanup cron-output classification to output/ subtree (#32164)#32478

Closed
Bartok9 wants to merge 2 commits into
NousResearch:mainfrom
Bartok9:fix/32164-disk-cleanup-cron-jobs-json
Closed

fix(plugins): restrict disk-cleanup cron-output classification to output/ subtree (#32164)#32478
Bartok9 wants to merge 2 commits into
NousResearch:mainfrom
Bartok9:fix/32164-disk-cleanup-cron-jobs-json

Conversation

@Bartok9

@Bartok9 Bartok9 commented May 26, 2026

Copy link
Copy Markdown
Contributor

Summary

Fixes #32164disk-cleanup misclassifies top-level cron state files (e.g. jobs.json, .tick.lock) as cron-output, making them eligible for automatic deletion. Deleting jobs.json silently empties the scheduler registry.

Root Cause

guess_category() in plugins/disk-cleanup/disk_cleanup.py returned "cron-output" for any path under HERMES_HOME/cron/**:

if top == "cron" or top == "cronjobs":
    return "cron-output"

This is too broad. Only run artifacts under cron/output/** are disposable. The top-level control-plane files (jobs.json, .tick.lock) must never be auto-tracked as cleanup candidates.

Fix

Restrict cron-output classification to the output subtree:

if top == "cron" or top == "cronjobs":
    if len(rel.parts) >= 2 and rel.parts[1] == "output":
        return "cron-output"
    return None

Tests

Updated tests/plugins/test_disk_cleanup_plugin.py:

  • Renamed test_cron_subtree_categorisedtest_cron_output_subtree_categorised to test the correct path (cron/output/<job>/run.mdcron-output)
  • Added test_cron_jobs_json_protected_from_cleanup: cron/jobs.jsonNone
  • Added test_cron_tick_lock_protected_from_cleanup: cron/.tick.lockNone
  • Added test_cron_top_level_file_protected: any top-level cron/ file → None

All 41 disk_cleanup tests pass.

Bartok9 added 2 commits May 26, 2026 03:35
… add TELEGRAM_ALLOW_BOTS (NousResearch#32188)

The Telegram adapter never populated source.is_bot when building a
MessageEvent, so _is_user_authorized()'s bot-filter path (which checks
getattr(source, 'is_bot', False)) always saw False for Telegram senders.
A second gateway running on the same machine would therefore accept and
respond to messages sent by the first bot, causing echo/duplicate replies.

Two changes:

1. gateway/platforms/telegram.py — pass is_bot=bool(getattr(user, 'is_bot', False))
   to build_source() inside _build_message_event(). python-telegram-bot exposes
   user.is_bot accurately; the getattr guard handles edge cases where from_user
   is a legacy mock or an extended type without the attribute.

2. gateway/run.py — add Platform.TELEGRAM: 'TELEGRAM_ALLOW_BOTS' to
   platform_allow_bots_map so operators can opt Telegram bots in with
   TELEGRAM_ALLOW_BOTS=all|mentions (same UX as DISCORD_ALLOW_BOTS and
   FEISHU_ALLOW_BOTS).

Tests added (tests/gateway/test_telegram_bot_filter.py):
- SessionSource.is_bot field propagation
- _is_user_authorized blocks bot Telegram source by default
- _is_user_authorized blocks when TELEGRAM_ALLOW_BOTS=none
- _is_user_authorized admits when TELEGRAM_ALLOW_BOTS=all
- _is_user_authorized admits when TELEGRAM_ALLOW_BOTS=mentions
- Case-insensitive env value matching
- TELEGRAM_ALLOW_BOTS present in platform_allow_bots_map

Fixes NousResearch#32188
…put subtree (NousResearch#32164)

guess_category() in disk_cleanup.py classified any path under
HERMES_HOME/cron/** as 'cron-output', making top-level control-plane
files like cron/jobs.json and cron/.tick.lock eligible for automatic
deletion.  Deleting jobs.json silently empties the scheduler registry.

Fix: restrict the cron-output classification to paths whose second
component is 'output' (i.e., cron/output/**).  Top-level cron files
return None so they are never auto-tracked as cleanup candidates.

Tests updated/added:
- test_cron_output_subtree_categorised: cron/output/<job>/run.md -> cron-output
- test_cron_jobs_json_protected_from_cleanup: cron/jobs.json -> None
- test_cron_tick_lock_protected_from_cleanup: cron/.tick.lock -> None
- test_cron_top_level_file_protected: any other cron/ top-level file -> None

All 41 disk_cleanup tests pass.

Fixes NousResearch#32164
@teknium1

Copy link
Copy Markdown
Contributor

Closing in favor of #34840 (merged). It uses @sweetcornna's #33834 for the same disk-cleanup classification fix — clean (disk-cleanup only) and with regression tests. Your PR bundled an unrelated Telegram bot-filter change; if you'd like that landed, please open it as a separate PR. Thanks for catching #32164!
#34840

@teknium1 teknium1 closed this May 29, 2026
@alt-glitch alt-glitch added type/bug Something isn't working P1 High — major feature broken, no workaround comp/plugins Plugin system and bundled plugins comp/cron Cron scheduler and job management labels May 29, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

comp/cron Cron scheduler and job management comp/plugins Plugin system and bundled plugins P1 High — major feature broken, no workaround type/bug Something isn't working

Projects

None yet

Development

Successfully merging this pull request may close these issues.

disk-cleanup misclassifies cron/jobs.json as cron-output and can delete the live cron registry

3 participants