test(minimax): assert M3 stale-cache guard contract, not a brittle 1M literal #37220
Merged
… literal test_stale_m3_cache_dropped_and_reresolves_to_1m hardcoded assert ctx == 1_000_000. The test re-resolves M3 through the live models.dev registry (the seeded stale entry is dropped, so nothing short-circuits the lookup), and models.dev now reports MiniMax-M3 at 512,000 — a change-detector failure unrelated to any code change. The guard's actual contract is: a stale <=204,800 catch-all value for an M3 slug must be DROPPED and re-resolved to M3's real (large) context. Both sources satisfy that (hardcoded catalog 1,000,000; models.dev 512,000), so assert the invariant (ctx > 204,800, stale value gone) instead of a literal that external data can move. Renamed accordingly. 47/47 in test_minimax_provider.py pass.
This fix unblocks 14+ open PRs currently failing on |
changman pushed a commit to changman/hermes-agent that referenced this pull request Jun 10, 2026
Summary
Fixes the one red shard blocking #37211 (and any other PR) — a MiniMax-M3 change-detector test that broke when an external registry value moved, unrelated to any code change.
Root cause
test_stale_m3_cache_dropped_and_reresolves_to_1m seeds a stale 204,800 cache entry, the guard correctly drops it, and resolution falls through to the live models.dev registry (nothing is left to short-circuit the lookup). models.dev now reports MiniMax-M3 at 512,000, but the test hardcoded assert ctx == 1_000_000, so assert 512000 == 1000000 fails on main for everyone.
Fix
Assert the guard's actual contract (a stale ≤204,800 catch-all value for an M3 slug is dropped and re-resolved to M3's real, large context) instead of a brittle literal. Both resolution sources satisfy it (hardcoded catalog 1,000,000; models.dev 512,000), so the test now checks ctx > 204_800. Renamed to test_stale_m3_cache_dropped_and_reresolves. Per AGENTS.md: assert the relationship, not a snapshot of data that's expected to change.
Changes
tests/agent/test_minimax_provider.py: one assertion + rename + comment.
Validation
tests/agent/test_minimax_provider.py
Before: assert 512000 == 1000000 (fails)
After: assert ctx > 204_800 (passes regardless of catalog-vs-registry source)
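For readers who want to see the shape of the invariant described under Fix, here is a minimal, self-contained sketch. It does not reproduce the real hermes-agent code: `resolve_context`, `check_guard_contract`, the `"m3" in slug` check, and the injected `lookup` callable are all hypothetical stand-ins for the actual guard and the models.dev lookup. Only the 204,800 threshold and the two source values (1,000,000 and 512,000) come from this PR.

```python
from typing import Callable, Dict

# Threshold taken from the PR description: catch-all values at or below
# this are considered stale for an M3 slug.
STALE_CATCH_ALL_MAX = 204_800


def resolve_context(slug: str, cache: Dict[str, int],
                    lookup: Callable[[str], int]) -> int:
    """Hypothetical resolver: drop a stale M3 cache entry, then re-resolve.

    `lookup` stands in for whichever source answers next (hardcoded
    catalog or live registry); it is an assumption, not the real API.
    """
    cached = cache.get(slug)
    is_m3 = "m3" in slug.lower()  # assumed slug check, for illustration only
    if cached is not None and is_m3 and cached <= STALE_CATCH_ALL_MAX:
        del cache[slug]  # the guard: the stale value must not survive
        cached = None
    if cached is None:
        cached = lookup(slug)
        cache[slug] = cached
    return cached


def check_guard_contract(lookup: Callable[[str], int]) -> int:
    """Assert the relationship, not a snapshot of the source's value."""
    slug = "MiniMax-M3"
    cache = {slug: 204_800}  # seed the stale catch-all entry
    ctx = resolve_context(slug, cache, lookup)
    assert ctx > STALE_CATCH_ALL_MAX   # re-resolved to a large context
    assert cache[slug] != 204_800      # stale value is gone
    return ctx


# Either value named in the PR satisfies the same contract, so the check
# keeps passing if the external source moves again.
for reported in (1_000_000, 512_000):
    check_guard_contract(lambda _slug, value=reported: value)
```

Injecting the lookup keeps the sketch independent of the network; the actual test in this PR resolves through the live registry, which is exactly why a literal was brittle there.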