test(minimax): assert M3 stale-cache guard contract, not a brittle 1M literal #37220

Merged
teknium1 merged 1 commit into main from fix/minimax-m3-stale-cache-test
Jun 2, 2026

@teknium1 teknium1 commented Jun 2, 2026

Contributor

Summary

Fixes the one red shard blocking #37211 (and any other PR) — a MiniMax-M3 change-detector test that broke when an external registry value moved, unrelated to any code change.

Root cause

test_stale_m3_cache_dropped_and_reresolves_to_1m seeds a stale 204,800 cache entry, the guard correctly drops it, and resolution falls through to the live models.dev registry (nothing is left to short-circuit the lookup). The registry now reports MiniMax-M3 at 512,000, but the test hardcoded assert ctx == 1_000_000, so the check evaluates to assert 512000 == 1000000 and fails on main for everyone.
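The fall-through described above can be sketched roughly as follows. This is a minimal illustration, not the project's code: `resolve_context`, `registry_lookup`, `CATALOG`, and the `"m3" in slug` match are all hypothetical names and rules chosen for the example; only the 204,800 ceiling and the 1,000,000 and 512,000 values come from this PR.

```python
from typing import Callable, Dict, Optional

# Ceiling for stale catch-all values, as quoted in the PR.
STALE_CATCH_ALL_MAX = 204_800

# Hypothetical stand-in for the hardcoded catalog (value quoted in the PR).
CATALOG: Dict[str, int] = {"MiniMax-M3": 1_000_000}


def resolve_context(
    slug: str,
    cache: Dict[str, int],
    registry_lookup: Callable[[str], Optional[int]],
) -> Optional[int]:
    """Resolve a context length: cache first, then live registry, then catalog."""
    cached = cache.get(slug)
    if cached is not None:
        # Guard: a small catch-all value cached for an M3 slug is treated as stale.
        if "m3" in slug.lower() and cached <= STALE_CATCH_ALL_MAX:
            del cache[slug]  # drop it so it cannot short-circuit the lookup
        else:
            return cached
    live = registry_lookup(slug)  # external data, free to change over time
    if live is not None:
        return live
    return CATALOG.get(slug)


# Seed a stale entry, then resolve against a fake registry reporting 512,000.
cache = {"MiniMax-M3": 204_800}
ctx = resolve_context("MiniMax-M3", cache, lambda slug: 512_000)
```

In this sketch the result depends on whatever the registry callback returns once the stale entry is gone, which is why a test pinned to a single literal breaks as soon as the external value moves.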

Fix

Assert the guard's actual contract — a stale ≤204,800 catch-all value for an M3 slug is dropped and re-resolved to M3's real (large) context — instead of a brittle literal. Both resolution sources satisfy it (hardcoded catalog 1,000,000; models.dev 512,000), so the test now checks ctx > 204_800. Renamed to test_stale_m3_cache_dropped_and_reresolves. Per AGENTS.md: assert the relationship, not a snapshot of data that's expected to change.
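To make the difference between the two assertion styles concrete, here is a small self-contained comparison. The helper names `brittle_check` and `contract_check` are hypothetical and exist only for this illustration; the numeric values are the ones quoted in this PR.

```python
# Ceiling for stale catch-all values, as quoted in the PR.
STALE_CATCH_ALL = 204_800

# Values the PR cites for MiniMax-M3 from each resolution source.
SOURCES = {"hardcoded catalog": 1_000_000, "models.dev": 512_000}


def brittle_check(ctx: int) -> bool:
    """Snapshot-style check: holds only while the data equals one literal."""
    return ctx == 1_000_000


def contract_check(ctx: int) -> bool:
    """Relationship-style check: the stale catch-all value was replaced."""
    return ctx > STALE_CATCH_ALL


# Map each source to (brittle result, contract result).
results = {
    name: (brittle_check(value), contract_check(value))
    for name, value in SOURCES.items()
}
```

Under these assumptions the snapshot check accepts only the catalog value, while the relationship check accepts both sources and still rejects a leftover 204,800 entry.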

Changes

  • tests/agent/test_minimax_provider.py: one assertion + rename + comment.

Validation

| | Result |
|---|---|
| tests/agent/test_minimax_provider.py | 47/47 pass |
| Failing assertion before | assert 512000 == 1000000 |
| After | assert ctx > 204_800 (passes regardless of catalog-vs-registry source) |

Infographic

minimax-m3-stale-cache-guard-test-fix

… literal

test_stale_m3_cache_dropped_and_reresolves_to_1m hardcoded
assert ctx == 1_000_000. The test re-resolves M3 through the live models.dev
registry (the seeded stale entry is dropped, so nothing short-circuits the
lookup), and models.dev now reports MiniMax-M3 at 512,000 — a change-detector
failure unrelated to any code change.

The guard's actual contract is: a stale <=204,800 catch-all value for an M3
slug must be DROPPED and re-resolved to M3's real (large) context. Both
sources satisfy that (hardcoded catalog 1,000,000; models.dev 512,000), so
assert the invariant (ctx > 204,800, stale value gone) instead of a literal
that external data can move. Renamed accordingly.

47/47 in test_minimax_provider.py pass.

github-actions Bot commented Jun 2, 2026

Contributor

🔎 Lint report: fix/minimax-m3-stale-cache-test vs origin/main

ruff

Total: 0 on HEAD, 0 on base (➖ 0)

🆕 New issues: none

✅ Fixed issues: none

Unchanged: 0 pre-existing issues carried over.

ty (type checker)

Total: 9637 on HEAD, 9637 on base (➖ 0)

🆕 New issues: none

✅ Fixed issues: none

Unchanged: 4991 pre-existing issues carried over.

Diagnostics are surfaced as warnings — this check never fails the build.

@WenhuaXia

This fix unblocks 14+ open PRs currently failing on test (5) because of the same flaky minimax assertion (including mine, #37212). All checks are green — would appreciate a merge so downstream PRs can turn green too. 🙏

@teknium1 teknium1 merged commit 0269eca into main Jun 2, 2026
23 checks passed
@teknium1 teknium1 deleted the fix/minimax-m3-stale-cache-test branch June 2, 2026 06:35
changman pushed a commit to changman/hermes-agent that referenced this pull request Jun 10, 2026
… literal (NousResearch#37220)