test: add unit tests for 8 modules (batch 2) #62

Merged
teknium1 merged 1 commit into NousResearch:main from 0xbyt4:test/expand-coverage-2 on Feb 27, 2026

0xbyt4 commented Feb 26, 2026

Summary

  • Add 127 new unit tests covering 8 previously untested modules
  • Modules tested: model_tools, toolset_distributions, agent/context_compressor, agent/prompt_caching, tools/cronjob_tools, tools/session_search_tool, tools/process_registry, cron/scheduler
  • Found prompt injection bypass bug in _scan_cron_prompt regex (fix PR to follow)

Test details

| Module | Tests | What's covered |
| --- | --- | --- |
| model_tools | 11 | Function call dispatch, agent loop interception, legacy toolset map, backward-compat wrappers |
| toolset_distributions | 15 | Distribution lookup, listing, validation, probability sampling, structure checks |
| agent/context_compressor | 15 | Threshold checks, preflight, token tracking, truncation fallback, summarization path |
| agent/prompt_caching | 12 | Cache marker application (tool/string/list content), Anthropic cache control strategy, max 4 breakpoints |
| tools/cronjob_tools | 23 | Prompt scanning (injection, exfil, unicode, deception), schedule/list/remove dispatchers |
| tools/session_search_tool | 13 | Timestamp formatting, conversation formatting, truncation around matches, search dispatcher |
| tools/process_registry | 27 | Get/poll, read log, list/filter sessions, active queries, pruning, checkpoint, kill, tool handler |
| cron/scheduler | 5 | Origin resolution (full/missing/empty) |

Bug found

The prompt injection regex in _scan_cron_prompt is: ignore\s+(previous|all|above|prior)\s+instructions. It allows exactly ONE qualifier word between "ignore" and "instructions", so multi-word variants like "Ignore ALL prior instructions" bypass the scanner. Fix PR to follow.
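
The bypass can be reproduced in isolation with a minimal sketch. The pattern below is copied from the description above; the use of `re.IGNORECASE` and the names `ORIGINAL`, `RELAXED`, and `is_flagged` are assumptions made for illustration and are not taken from `_scan_cron_prompt` itself. `RELAXED` is only one possible way to accept several qualifier words, not necessarily the fix that will land in the follow-up PR.

```python
import re

# Pattern as quoted in this PR. Case-insensitive matching is an assumption.
ORIGINAL = re.compile(
    r"ignore\s+(previous|all|above|prior)\s+instructions",
    re.IGNORECASE,
)

# Hypothetical relaxation: allow one or more qualifier words in a row.
RELAXED = re.compile(
    r"ignore\s+(?:(?:previous|all|above|prior)\s+)+instructions",
    re.IGNORECASE,
)


def is_flagged(pattern: re.Pattern, text: str) -> bool:
    """Return True if the pattern finds an injection phrase in the text."""
    return pattern.search(text) is not None


if __name__ == "__main__":
    samples = [
        "ignore previous instructions",
        "Ignore ALL prior instructions",
    ]
    for sample in samples:
        print(sample, is_flagged(ORIGINAL, sample), is_flagged(RELAXED, sample))
```

With the original pattern, the second sample is not flagged because two qualifier words ("ALL prior") sit between "ignore" and "instructions"; the relaxed pattern flags both samples.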

Test plan

  • All 127 new tests pass locally
  • Full suite: 299 passed, 0 failed, 9 deselected (integration)
  • No regressions in existing tests

Cover model_tools, toolset_distributions, context_compressor,
prompt_caching, cronjob_tools, session_search, process_registry,
and cron/scheduler with 127 new test cases.
@teknium1 teknium1 merged commit 3526fa2 into NousResearch:main Feb 27, 2026
sudo-yf pushed a commit to sudo-yf/hermes-agent that referenced this pull request Apr 5, 2026
docs: v0.30.1 release — CLI bridge fixes + README update
angelburgosrosado pushed a commit to angelburgosrosado/hermes-agent that referenced this pull request Apr 27, 2026
test: add unit tests for 8 modules (batch 2)
olympus-terminal pushed a commit to olympus-terminal/hermes-agent that referenced this pull request May 16, 2026
test: add unit tests for 8 modules (batch 2)
Egavasyug pushed a commit to Egavasyug/hermes-agent that referenced this pull request Jun 10, 2026
test: add unit tests for 8 modules (batch 2)
jarvis-stark-ops added a commit to 1Team-Engineering/hermes-agent that referenced this pull request Jun 10, 2026
…sResearch#62, NousResearch#64)

Adds three pre-write-txn gates in `complete_task` mirroring the
existing `_verify_created_cards` / `HallucinatedCardsError` pattern:

- `verify_runtime_floor` (closes hermes-jarvis#64) — per-role floor
  on completed_at - started_at. Build 5min, review 90s, orchestration 0.
- `verify_workspace_diff` (closes hermes-jarvis#62) — non-review
  workers on dir/worktree workspaces must produce a non-empty
  git diff against the tracking base.
- `verify_no_stray_artifacts` (closes hermes-jarvis#28) — rejects
  *evidence*, commit-hash*, triage/*, tmp-*, and untracked
  no-extension/no-shebang files (the agent-dashboard PR #1
  "all prior block evidence files" failure mode).
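
As a rough illustration of the first gate, a per-role runtime floor could look like the sketch below. Only the name `verify_runtime_floor` and the floor values (build 5min, review 90s, orchestration 0) come from the commit message; the signature, the `ROLE_FLOORS` table, and `RuntimeFloorError` are hypothetical and may differ from the actual implementation.

```python
from datetime import datetime, timedelta

# Floors as stated in the commit message; the dict itself is hypothetical.
ROLE_FLOORS = {
    "build": timedelta(minutes=5),
    "review": timedelta(seconds=90),
    "orchestration": timedelta(0),
}


class RuntimeFloorError(Exception):
    """Hypothetical error for a task that completed implausibly fast."""


def verify_runtime_floor(
    role: str, started_at: datetime, completed_at: datetime
) -> timedelta:
    """Raise if completed_at - started_at is below the floor for the role."""
    floor = ROLE_FLOORS[role]
    elapsed = completed_at - started_at
    if elapsed < floor:
        raise RuntimeFloorError(
            f"{role} task ran for {elapsed}, below the floor of {floor}"
        )
    return elapsed
```

Under these assumptions, a review task that finishes in 60 seconds would be rejected, while an orchestration task of any non-negative duration would pass.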

Opt-outs via metadata (x_fast_justified / x_no_code / x_stray_ok)
require ≥20-char string reasons and emit completion_opt_out_used
audit events with verbatim reason. Truthy bools or short strings
rejected with InvalidOptOutError.
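
The opt-out rule can be sketched as follows. `InvalidOptOutError` and the metadata keys are named in the commit message; the helper `validate_opt_out`, its signature, and the constant `MIN_REASON_LENGTH` are hypothetical, and emitting the `completion_opt_out_used` audit event is left out.

```python
from typing import Any, Mapping, Optional

MIN_REASON_LENGTH = 20  # "≥20-char string reasons" per the commit message


class InvalidOptOutError(ValueError):
    """Raised when an opt-out value is not a sufficiently long string."""


def validate_opt_out(metadata: Mapping[str, Any], key: str) -> Optional[str]:
    """Return the verbatim reason for an opt-out key, or None if absent.

    Truthy bools and short strings are rejected rather than accepted.
    """
    if key not in metadata:
        return None
    reason = metadata[key]
    if not isinstance(reason, str) or len(reason) < MIN_REASON_LENGTH:
        raise InvalidOptOutError(
            f"{key} requires a string reason of at least "
            f"{MIN_REASON_LENGTH} characters"
        )
    return reason
```

Requiring a written reason instead of a boolean flag means every use of an opt-out leaves an explanation that can be recorded verbatim in the audit event.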

Context: hermes-jarvis#61 (bootstrap-paradox case study).

41 tests pass; the wider regression run of 258 tests has zero failures.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>