Bug Description
There is already a PR addressing this (#3122), but I could not find a corresponding issue.
Minimax's "highspeed" variant is not a small, cheap model like Flash or Haiku. It is the same model, running on better hardware for twice the TPS at twice the price.
https://platform.minimax.io/docs/guides/pricing-paygo
On their developer subscription "Token Plan", it is only guaranteed on the more expensive plans. On other plans it may work off-peak but is not always available, which I suspect is the cause of the 500 errors on #3122.
#3122 has a partial fix for this. I wanted to bring some visibility to this if possible, because people using API plans may be paying more than they expect.
Steps to Reproduce
Configure only the minimax provider, and no others. Do anything that uses auxiliary models. Review your API usage or invoices.
Expected Behavior
MiniMax doesn't have an equivalent flash model, so the auxiliary model should default to MiniMax-M2.7.
Actual Behavior
Auxiliary calls use MiniMax-M2.7-highspeed, which burns tokens at twice the standard billing rate.
Affected Component
Agent Core (conversation loop, context compression, memory)
Messaging Platform (if gateway-related)
No response
Operating System
Debian 6.12.74-2
Python Version
3.11.15
Hermes Version
Hermes Agent v0.5.0 (2026.3.28)
Relevant Logs / Traceback
Root Cause Analysis (optional)
The bug is on lines 59 and 60 of auxiliary_client.py (on current main). #3122 is only a partial fix, and it adds other changes that may not be necessary. I think the root cause is only the model choice, but the model issue does also silently break session summaries, which that PR also tried to improve.
Proposed Fix (optional)
Replace MiniMax-M2.7-highspeed with MiniMax-M2.7 on both lines (for both China and International users).
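A rough sketch of what the change could look like, plus a small regression check. I have not copied this from the repo: the dict name `DEFAULT_AUX_MODELS`, the provider keys `minimax` and `minimax-cn`, and the helper `aux_model_for` are all hypothetical stand-ins for whatever is actually on those two lines.

```python
# Hypothetical stand-in for the provider -> auxiliary model defaults.
# The dict name and provider keys are assumptions, not the real source.
DEFAULT_AUX_MODELS = {
    "minimax": "MiniMax-M2.7",     # International; was "MiniMax-M2.7-highspeed"
    "minimax-cn": "MiniMax-M2.7",  # China; was "MiniMax-M2.7-highspeed"
}


def aux_model_for(provider: str) -> str:
    """Return the default auxiliary model for a provider (hypothetical helper)."""
    return DEFAULT_AUX_MODELS[provider]


def check_minimax_aux_defaults() -> None:
    """Guard against the double-priced variant becoming the default again."""
    for provider in ("minimax", "minimax-cn"):
        model = aux_model_for(provider)
        assert not model.endswith("-highspeed"), f"{provider} defaults to {model}"


check_minimax_aux_defaults()
```

Users who actually want the highspeed variant for auxiliary tasks could still opt in explicitly; the point is only that it should not be the silent default.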
Are you willing to submit a PR for this?