fix: auto-invalidate stale context length cache when defaults change #1852

Closed
Tranquil-Flow wants to merge 1 commit into NousResearch:main from Tranquil-Flow:fix/cache-staleness-invalidation

@Tranquil-Flow commented Mar 18, 2026

Contributor

Summary

  • Adds hash-based invalidation to the persistent context length cache (~/.hermes/context_length_cache.yaml)
  • When a new hermes-agent version changes DEFAULT_CONTEXT_LENGTHS, stale cached values are automatically discarded
  • Legacy cache files (without the hash field) are treated as stale and invalidated on first read

Problem

The context length cache stores discovered model limits across sessions with no expiration or versioning. If a hermes-agent update changes a model's default context length (e.g., a provider upgrades from 128K to 256K, or a new model is added with different defaults), existing cache entries silently override the new default. Users would be stuck on the old limit until they manually delete ~/.hermes/context_length_cache.yaml.

This affects all models and providers that go through the probing system — not just Anthropic models.

Solution

A defaults_hash field (truncated SHA-256 of DEFAULT_CONTEXT_LENGTHS) is stored alongside cached entries. On every cache read, the stored hash is compared against the current hash:

  • Match → cache is valid, entries preserved
  • Mismatch → cache is stale, all entries discarded (forces re-probe)
  • Missing → legacy file, treated as stale

The hash is recomputed from the sorted dict, so it's deterministic and changes only when actual model defaults change.
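
The three cases above can be sketched as follows. This is a minimal illustration, not the code from agent/model_metadata.py: the function names, the 16-character truncation, the `context_lengths` key, and the sample defaults are all assumptions made for the example, and the cache is modeled as an already-parsed dict rather than a YAML file.

```python
import hashlib
import json

# Hypothetical stand-in for the real DEFAULT_CONTEXT_LENGTHS table.
DEFAULT_CONTEXT_LENGTHS = {"model-a": 128_000, "model-b": 200_000}


def compute_defaults_hash(defaults: dict) -> str:
    """Return a truncated SHA-256 of the defaults, independent of dict order."""
    # Sorting the items first makes the serialized form, and so the hash, deterministic.
    canonical = json.dumps(sorted(defaults.items()))
    # The truncation length (16 hex chars) is an assumption for this sketch.
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]


def valid_entries(cache: dict, defaults: dict) -> dict:
    """Return the cached entries only if the stored hash matches the current one."""
    stored = cache.get("defaults_hash")  # None for legacy files without the field
    if stored != compute_defaults_hash(defaults):
        return {}  # mismatch or missing: discard everything, forcing a re-probe
    # "context_lengths" is an assumed key name for the cached entries.
    return dict(cache.get("context_lengths", {}))
```

Because a missing field reads as `None`, which never equals a hex digest, the legacy case needs no separate branch in this sketch.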

Files changed

| File | Change |
| --- | --- |
| agent/model_metadata.py | _compute_defaults_hash(), updated _load_context_cache() and save_context_length() |
| tests/agent/test_model_metadata.py | 7 new tests for hash storage, invalidation, legacy migration, determinism |

Tradeoff

Changing ANY model's default invalidates ALL cached entries, including correctly-probed local models (e.g., Ollama at 32K). This is acceptable because re-probing is automatic, transparent (one API call per model), and only happens once per hermes-agent update that modifies defaults.
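
To make the cost of this tradeoff concrete, the sketch below counts probe calls for a local model whose own limit never changes. Everything here is hypothetical: `fake_probe`, `get_context_length`, the in-memory dict cache, and the model names are illustrations, and the sketch assumes the fresh hash is written back when a stale cache is cleared.

```python
import hashlib
import json

probe_calls = []  # records every model that had to be probed


def defaults_hash(defaults):
    canonical = json.dumps(sorted(defaults.items()))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]


def fake_probe(model):
    """Hypothetical probe; a real one would query the provider."""
    probe_calls.append(model)
    return 32_000


def get_context_length(model, cache, defaults):
    """Return the cached limit, re-probing if the cache is stale or lacks the model."""
    current = defaults_hash(defaults)
    if cache.get("defaults_hash") != current:
        cache.clear()  # stale or legacy cache: drop every entry
        cache["defaults_hash"] = current
    entries = cache.setdefault("context_lengths", {})
    if model not in entries:
        entries[model] = fake_probe(model)
    return entries[model]


cache = {}
old_defaults = {"cloud-model": 128_000}
new_defaults = {"cloud-model": 256_000}  # only the cloud model's default changed

get_context_length("local-model", cache, old_defaults)  # first use: probe
get_context_length("local-model", cache, old_defaults)  # served from cache
get_context_length("local-model", cache, new_defaults)  # defaults changed: re-probe
get_context_length("local-model", cache, new_defaults)  # served from cache again

print(len(probe_calls))  # 2
```

The local model is probed once more after the defaults change, even though its own limit is unaffected, and then stays cached until the next change to the defaults.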

Test plan

  • 112 tests passing (69 model_metadata + 29 context_compressor + 14 context overflow — no regressions)
  • Manual: verify legacy cache file (no defaults_hash) is invalidated on first session
  • Manual: verify cache survives across sessions when defaults haven't changed

The persistent context length cache (~/.hermes/context_length_cache.yaml)
stores discovered model context limits across sessions.  Previously,
cached values lived forever — if a new hermes-agent version changed the
default context length for a model (e.g., upgrading Claude from 200K to
1M for all users), existing cache entries would silently override the
new default, leaving users stuck on the old limit.

This adds a defaults_hash field to the cache file: a truncated SHA-256
of DEFAULT_CONTEXT_LENGTHS, recomputed on every read.  When the hash
doesn't match (new hermes-agent version with updated defaults), all
cached entries are discarded and models re-probe their actual limits.

Legacy cache files without the hash field are treated as stale and
invalidated on first read — a one-time migration with no user action
needed.

Tradeoff: updating ANY model's default invalidates ALL cached entries,
including correctly-probed local models.  This is acceptable because
re-probing is automatic and transparent (one API call per model).

@teknium1

Contributor

Merged via PR #3802. Your cache invalidation implementation was cherry-picked onto current main with authorship preserved. Clean work — thanks!

@teknium1 closed this Mar 30, 2026