fix: unify gateway session hygiene with agent compression config by teknium1 · Pull Request #739 · NousResearch/hermes-agent

teknium1 · 2026-03-09T03:08:18Z

Problem

The gateway had a separate compression system ('session hygiene') with hardcoded thresholds that were completely disconnected from the model's context length and the user's compression config in config.yaml:

- auto_compress_tokens: 100,000 (hardcoded)
- auto_compress_messages: 200 (hardcoded)

This caused premature auto-compression on Telegram/Discord:

- ~60k tokens: The 200-message threshold was hit (tool-heavy sessions easily reach 200 messages even with low token counts)
- ~220k tokens: The agent's own internal compressor finally triggered at the correct threshold
- Neither matched the expected 85% of claude-opus-4.6's 200k context = 170k tokens
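To make the mismatch concrete, here is a minimal sketch of the old trigger as described above, next to the expected percentage-based threshold. The function and constant names are hypothetical, and the `>=` comparisons are an assumption; the removed gateway code is not shown in this description.

```python
# Reconstruction of the OLD gateway trigger from the description above.
# Names and comparison operators are assumptions, not the removed code.
OLD_AUTO_COMPRESS_TOKENS = 100_000
OLD_AUTO_COMPRESS_MESSAGES = 200


def old_hygiene_triggers(token_count: int, message_count: int) -> bool:
    """Fire when EITHER hardcoded limit is reached, ignoring the model."""
    return (
        token_count >= OLD_AUTO_COMPRESS_TOKENS
        or message_count >= OLD_AUTO_COMPRESS_MESSAGES
    )


def expected_threshold(context_length: int, threshold: float = 0.85) -> int:
    """Token count at which compression is expected to start."""
    return round(context_length * threshold)


# A tool-heavy session: few tokens, many messages.
print(old_hygiene_triggers(60_000, 200))  # True: fires on message count alone
print(expected_threshold(200_000))        # 170000
```

With these assumed semantics, a 200-message session compresses at 60k tokens even though the configured threshold would not be reached until 170k.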

CLI sessions worked correctly because they only use the agent's internal ContextCompressor, which properly reads compression.threshold from config.

Fix

Unified the gateway hygiene to use the exact same config as the agent:

- Reads model name from config.yaml → uses get_model_context_length() for context limit
- Reads compression.threshold from config.yaml (default 0.85)
- Respects compression.enabled and env var overrides (CONTEXT_COMPRESSION_THRESHOLD, CONTEXT_COMPRESSION_ENABLED)
- Removed the message-count-based trigger (redundant, caused false positives)
- Removed the undocumented session_hygiene config section
- Warn threshold is now 95% of model context (was hardcoded 200k)

Result for claude-opus-4.6: Gateway hygiene now triggers at 170k tokens (85% of 200k) instead of the old 100k/200-messages.
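The unified lookup can be sketched roughly as follows. This is an illustration under stated assumptions, not the gateway's actual code: `assumed_context_length` is a hypothetical stand-in for `get_model_context_length()` (whose signature is not shown here), the top-level `model` key and the default of `enabled: True` are guesses about the config layout, and the accepted truthy strings for the env var are assumed.

```python
import os

# Hypothetical stand-in for get_model_context_length(); the real function
# in hermes-agent is not shown in this PR and may behave differently.
_ASSUMED_CONTEXT_LENGTHS = {"claude-opus-4.6": 200_000}


def assumed_context_length(model: str, default: int = 128_000) -> int:
    return _ASSUMED_CONTEXT_LENGTHS.get(model, default)


def hygiene_thresholds(config: dict, env=None) -> dict:
    """Derive gateway thresholds from the same settings the agent uses."""
    env = os.environ if env is None else env
    compression = config.get("compression") or {}

    enabled = bool(compression.get("enabled", True))         # assumed default
    threshold = float(compression.get("threshold", 0.85))    # default per PR

    # Env vars override config.yaml (accepted values are an assumption).
    if "CONTEXT_COMPRESSION_ENABLED" in env:
        value = env["CONTEXT_COMPRESSION_ENABLED"].strip().lower()
        enabled = value in ("1", "true", "yes")
    if "CONTEXT_COMPRESSION_THRESHOLD" in env:
        threshold = float(env["CONTEXT_COMPRESSION_THRESHOLD"])

    context_length = assumed_context_length(config.get("model", ""))
    return {
        "enabled": enabled,
        "compress_at": round(context_length * threshold),
        "warn_at": round(context_length * 0.95),
    }


print(hygiene_thresholds({"model": "claude-opus-4.6"}, env={}))
```

Passing `env` explicitly keeps the sketch easy to exercise without touching the real process environment.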

Test plan

Updated tests/gateway/test_session_hygiene.py — 13 tests covering:
- Model-aware threshold scaling (128k, 200k, 1M models)
- Custom threshold percentages
- Message count alone no longer triggers compression
- Warn threshold at 95% of context
Full suite: 2468 passed, 5 skipped
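The kinds of checks listed above might look like the following pytest-style sketch. It is not the contents of tests/gateway/test_session_hygiene.py; `should_compress` is a hypothetical helper written only for this illustration.

```python
def should_compress(token_count, message_count, context_length, threshold=0.85):
    """Hypothetical helper: message_count is accepted but deliberately ignored."""
    return token_count >= round(context_length * threshold)


def test_threshold_scales_with_model():
    # 85% of 128k, 200k and 1M context windows.
    cases = [(128_000, 108_800), (200_000, 170_000), (1_000_000, 850_000)]
    for context_length, expected in cases:
        assert not should_compress(expected - 1, 10, context_length)
        assert should_compress(expected, 10, context_length)


def test_custom_threshold():
    assert should_compress(100_000, 10, 200_000, threshold=0.5)
    assert not should_compress(99_999, 10, 200_000, threshold=0.5)


def test_message_count_alone_does_not_trigger():
    # 500 messages but only 60k tokens: below the 170k threshold.
    assert not should_compress(60_000, 500, 200_000)
```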

Adds a simple config option to play the terminal bell (\a) when the agent finishes a response. Useful for long-running tasks: switch to another window and your terminal will ding when done. Works over SSH since the bell character propagates through the connection. Most terminal emulators can be configured to flash the taskbar, play a sound, or show a visual indicator on bell.

Config (default: off):

    display:
      bell_on_complete: true

Closes #318
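The bell option in this commit can be illustrated with a short sketch. The function name `ring_bell_if_enabled` is hypothetical and the config is assumed to be an already-parsed dict; only the `display.bell_on_complete` key and the `\a` character come from the commit message.

```python
import io
import sys


def ring_bell_if_enabled(config: dict, stream=None) -> bool:
    """Write the BEL character if display.bell_on_complete is truthy."""
    stream = sys.stdout if stream is None else stream
    display = config.get("display") or {}
    if not display.get("bell_on_complete", False):  # default: off
        return False
    stream.write("\a")
    stream.flush()  # flush so the bell is not held back by buffering
    return True


# Usage: capture the output instead of ringing a real terminal.
buf = io.StringIO()
ring_bell_if_enabled({"display": {"bell_on_complete": True}}, buf)
print(repr(buf.getvalue()))  # '\x07'
```

Because the bell is an ordinary character on the output stream, it travels over SSH like any other output, which matches the behaviour described above.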

The gateway had a SEPARATE compression system ('session hygiene') with hardcoded thresholds (100k tokens / 200 messages) that were completely disconnected from the model's context length and the user's compression config in config.yaml. This caused premature auto-compression on Telegram/Discord, triggering at ~60k tokens (from the 200-message threshold) or inconsistent token counts.

Changes:
- Gateway hygiene now reads model name from config.yaml and uses get_model_context_length() to derive the actual context limit
- Compression threshold comes from compression.threshold in config.yaml (default 0.85), same as the agent's ContextCompressor
- Removed the message-count-based trigger (was redundant and caused false positives in tool-heavy sessions)
- Removed the undocumented session_hygiene config section; the standard compression.* config now controls everything
- Env var overrides (CONTEXT_COMPRESSION_THRESHOLD, CONTEXT_COMPRESSION_ENABLED) are respected
- Warn threshold is now 95% of model context (was hardcoded 200k)
- Updated tests to verify model-aware thresholds, scaling across models, and that message count alone no longer triggers compression

For claude-opus-4.6 (200k context) at 85% threshold: gateway hygiene now triggers at 170k tokens instead of the old 100k.

Major updates to reflect the current OBLITERATUS codebase:
- Change default recommendation from 'informed' (experimental) to 'advanced' (reliable, well-tested multi-direction SVD)
- Add new CLI commands: tourney, recommend, strategies, report, aggregate, abliterate (alias)
- Add --direction-method flag (diff_means, svd, leace)
- Add strategies module (embedding/FFN ablation, head pruning, layer removal)
- Add evaluation module with LM Eval Harness integration
- Expand analysis modules from 15 to 28
- Add Apple Silicon (MLX) support
- Add study presets (quick, jailbreak, knowledge, etc.)
- Add --contribute, --verify-sample-size, --preset flags
- Add complete CLI command reference table
- Fix torch property name: total_mem -> total_memory (caught during live testing)

Tested: Successfully abliterated Qwen2.5-0.5B-Instruct using 'advanced' method (refusal rate 0.4%, coherence 1.0); model responds without refusal to test prompts.

Added pitfalls discovered during live abliteration testing:
- Models < 1B have fragmented refusal, respond poorly (0.5B: 60%→20%)
- Models 3B+ work much better (3B: 75%→0% with advanced defaults)
- aggressive method can backfire on small models (made it worse)
- Spectral certification RED is common even when refusal rate is 0%
- Fixed torch property: total_mem → total_memory

OBLITERATUS skill (PR NousResearch#408 updated):
- 9 CLI methods, 28 analysis modules, 116 model presets
- Default method: advanced (multi-direction SVD, norm-preserving)
- Live-tested: Qwen2.5-3B 75%→0% refusal, Qwen2.5-0.5B 60%→20%
- References, templates, and real-world pitfalls included

Gateway compression fix (PR NousResearch#739):
- Unified session hygiene with agent compression config
- Uses model context length × compression.threshold from config.yaml
- Removed hardcoded 100k/200-msg thresholds

teknium1 added 4 commits March 8, 2026 19:41

teknium1 merged commit c21d77c into main Mar 9, 2026
1 check passed

teknium1 merged 4 commits into main from hermes/hermes-0cbb57e2
