feat(minimax): add MiniMax provider with M3 as default #3165

Closed
octo-patch wants to merge 5 commits into QwenLM:main from octo-patch:feature/add-minimax-provider

Conversation

@octo-patch

@octo-patch octo-patch commented Apr 12, 2026

Summary

This PR adds MiniMax as an OpenAI-compatible provider for Qwen Code, with MiniMax-M3 wired up as the default model.

Changes

  • New provider: MiniMaxOpenAICompatibleProvider extending DefaultOpenAICompatibleProvider
    • Default base URL applied in buildClient() override (matches sibling provider pattern)
    • Supports api.minimaxi.com as an alternative (domestic mirror)
    • URL detection uses new URL(baseUrl).hostname to prevent substring false-positives (fixes CodeQL warnings)
    • Temperature: guards against null per SDK types (number | null | undefined); clamps values above 1.0 down to 1.0 with a debug log
    • response_format: removed via typed destructuring — no any cast or delete
  • Token limits (tokenLimits.ts and vscode-ide-companion/.../tokenLimits.ts):
    • MiniMax-M3: 512K input, 128K output
    • MiniMax-M2.7 / M2.7-highspeed fallback: 200K input, 64K output
  • Modality: MiniMax-M3 enables image input; M2.7 stays text-only
  • Documentation: MiniMax section in docs/users/configuration/model-providers.md lists M3 as the default plus M2.7 / M2.7-highspeed for users who want the previous generation
  • Tests: 15 unit tests including null-temperature, hostname-spoofing, M3 input/output limits, and the M3 image-modality default

Supported Models

| Model ID | Description |
| --- | --- |
| MiniMax-M3 (default) | Latest model: 512K context, up to 128K output, supports image input |
| MiniMax-M2.7 | Peak Performance, Ultimate Value (previous generation) |
| MiniMax-M2.7-highspeed | Same performance, faster and more agile |

API References

- Add MiniMaxOpenAICompatibleProvider extending DefaultOpenAICompatibleProvider
- Use OpenAI-compatible interface with https://api.minimax.io/v1 as default base URL
- Enforce temperature in (0.0, 1.0] range; remove unsupported response_format
- Register provider detection in determineProvider() factory
- Add token limits for MiniMax-M2.7 (200K input, 64K output)
- Add unit tests for all provider behaviors
Comment thread packages/core/src/core/openaiContentGenerator/provider/minimax.ts Fixed
Comment thread packages/core/src/core/openaiContentGenerator/provider/minimax.ts Fixed
*/
override buildRequest(
request: OpenAI.Chat.ChatCompletionCreateParams,
userPromptId: string,

Collaborator

[Suggestion] buildRequest uses any type and delete instead of typed destructuring.

The method casts to any and uses delete to remove response_format. This disables type checking and conflicts with the project's strict/noImplicitAny conventions.

Suggested change
userPromptId: string,
const { response_format: _, ...rest } = baseRequest;
const temperature = (rest.temperature === 0 || rest.temperature == null)
? 1.0
: rest.temperature;
return { ...rest, temperature } as OpenAI.Chat.ChatCompletionCreateParams;

— qwen3.6-plus via Qwen Code /review
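
The typed-destructuring approach suggested above can be illustrated in isolation. This is a minimal sketch, not the PR's code: `ChatParams` is a simplified stand-in for `OpenAI.Chat.ChatCompletionCreateParams`, and `stripResponseFormat` is a hypothetical helper name.

```typescript
// Simplified stand-in for OpenAI.Chat.ChatCompletionCreateParams (assumption).
interface ChatParams {
  model: string;
  temperature?: number | null;
  response_format?: { type: string };
}

// Hypothetical helper: returns a copy of the params without response_format.
// Rest destructuring keeps the result fully typed, with no `any` and no `delete`.
function stripResponseFormat(
  params: ChatParams,
): Omit<ChatParams, 'response_format'> {
  const { response_format: _unused, ...rest } = params;
  return rest;
}

const input: ChatParams = {
  model: 'MiniMax-M3',
  temperature: 0.7,
  response_format: { type: 'json_object' },
};
const output = stripResponseFormat(input);
```

Because the function builds a new object, the caller's `input` still carries its `response_format`; only the returned copy omits it.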

contentGeneratorConfig: ContentGeneratorConfig,
cliConfig: Config,
) {
super(contentGeneratorConfig, cliConfig);

Collaborator

[Suggestion] Constructor mutates this.contentGeneratorConfig after the super() call.

Sibling providers (DashScope, DeepSeek) leave the config unchanged and apply defaults in buildClient() via destructuring fallback. If any code holds a reference to the original config object, it won't see the updated baseUrl.

Consider applying the default URL in buildClient() or via a protected getter, matching sibling patterns.

— qwen3.6-plus via Qwen Code /review
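
The "destructuring fallback" pattern mentioned above might look roughly like the following. This is a sketch under stated assumptions: `ProviderConfig`, `ClientOptions`, and `resolveClientOptions` are hypothetical simplifications, not the project's actual `ContentGeneratorConfig` or `buildClient()` signature. The default URL is the one named in this PR.

```typescript
// Hypothetical, simplified config shape (the real config has more fields).
interface ProviderConfig {
  apiKey: string;
  baseUrl?: string;
}

// Hypothetical stand-in for the options a buildClient() override would pass on.
interface ClientOptions {
  apiKey: string;
  baseURL: string;
}

const DEFAULT_MINIMAX_BASE_URL = 'https://api.minimax.io/v1';

function resolveClientOptions(config: ProviderConfig): ClientOptions {
  // The default binds only to the local variable, so the caller's
  // config object is never written to.
  const { apiKey, baseUrl = DEFAULT_MINIMAX_BASE_URL } = config;
  return { apiKey, baseURL: baseUrl };
}

const userConfig: ProviderConfig = { apiKey: 'test-key' };
const options = resolveClientOptions(userConfig);
```

Note that a destructuring default applies only when the property is `undefined`; an empty string would pass through unchanged, so a real implementation may want an extra check.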

const result: any = { ...baseRequest };

// MiniMax does not support temperature = 0; default to 1.0
if (result.temperature === 0 || result.temperature === undefined) {

Collaborator

[Suggestion] Temperature check doesn't guard against null.

The OpenAI SDK types temperature as number | null | undefined. If null is passed, it bypasses both guards and gets sent to the API.

Suggested change
- if (result.temperature === 0 || result.temperature === undefined) {
+ if (result.temperature == null || result.temperature === 0) {

— qwen3.6-plus via Qwen Code /review
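
The difference between the two guards can be checked directly. A small sketch, where the function names are hypothetical and only the conditions come from the thread above:

```typescript
// Mirrors the SDK's `number | null | undefined` typing for temperature.
type Temperature = number | null | undefined;

// Strict check from the original code: null slips through.
function needsDefaultStrict(t: Temperature): boolean {
  return t === 0 || t === undefined;
}

// Check from the suggestion: `== null` matches both null and undefined,
// and nothing else (0 and NaN are not loosely equal to null).
function needsDefaultLoose(t: Temperature): boolean {
  return t == null || t === 0;
}
```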

* Checks if the configuration targets the MiniMax API.
*/
static isMiniMaxProvider(config: ContentGeneratorConfig): boolean {
const baseUrl = config.baseUrl ?? '';

Collaborator

[Nice to have] isMiniMaxProvider uses bare substring matching (baseUrl.includes(...)).

A URL like https://proxy.example.com?target=api.minimax.io.malicious.com would false-positive. Consider using new URL(baseUrl).hostname with a more precise check. Very low probability in practice, but worth noting for correctness.

— qwen3.6-plus via Qwen Code /review
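
To make the false positive concrete, the sketch below compares substring matching with an exact hostname lookup. The two hostnames are the ones named in this PR; the `MINIMAX_HOSTS` set and both function names are illustrative assumptions rather than the PR's actual code.

```typescript
// Hostnames named in this PR; the Set itself is an illustrative assumption.
const MINIMAX_HOSTS = new Set(['api.minimax.io', 'api.minimaxi.com']);

// Naive check the review warns about: matches anywhere in the string.
function isMiniMaxBySubstring(baseUrl: string): boolean {
  return baseUrl.includes('api.minimax.io');
}

// Hostname check: parse first, then compare the exact host.
function isMiniMaxByHostname(baseUrl: string): boolean {
  try {
    return MINIMAX_HOSTS.has(new URL(baseUrl).hostname);
  } catch {
    return false; // not a parseable absolute URL
  }
}

// The example URL from the review comment above.
const spoofed = 'https://proxy.example.com?target=api.minimax.io.malicious.com';
```

For `spoofed`, the substring check matches because the target string appears in the query, while the parsed hostname is `proxy.example.com` and is rejected.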

- Fix URL detection to use hostname comparison (prevents substring
  false positives flagged by CodeQL)
- Move default base URL to buildClient() override — avoids mutating
  contentGeneratorConfig in the constructor (matches sibling pattern)
- Replace any cast + delete with typed destructuring for response_format
- Guard temperature against null (number | null | undefined per SDK types)
- Clarify token-limit comment: regex covers M2.7-highspeed too
- Add MiniMax section to model-providers.md with config examples
  for both MiniMax-M2.7 and MiniMax-M2.7-highspeed
@wenshao

wenshao commented Apr 17, 2026

Collaborator

Thanks for the contribution! Overall the implementation is solid, follows the existing provider conventions, and has good test coverage. The strict hostname matching and typed destructuring for response_format removal are both more robust than the patterns used by sibling providers.

Mergeable, with minor suggestions

1. Redundant input pattern in tokenLimits.ts:139

[/^minimax-m2\.7/i, LIMITS['200k']], // new
[/^minimax-m2\.5/i, LIMITS['192k']],
[/^minimax-/i, LIMITS['200k']],      // fallback already returns 200K

The new m2.7 input entry produces the same result as the fallback, so it doesn't change behavior. The m2.7 entry in OUTPUT_PATTERNS (64K) is necessary, but the input entry can be dropped to reduce noise — or kept as explicit documentation. Either is fine.
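
The redundancy is easy to see with a first-match lookup over the pattern table. This is a sketch that assumes the table is scanned in order and the first matching pattern wins; the real `tokenLimits.ts` may normalize model names or differ in other details. Limits are kept as the `LIMITS` keys shown above rather than numeric values, and `lookup` is a hypothetical helper.

```typescript
type LimitKey = '200k' | '192k';
type PatternTable = ReadonlyArray<readonly [RegExp, LimitKey]>;

// Hypothetical first-match lookup over an ordered pattern table.
function lookup(table: PatternTable, model: string): LimitKey | undefined {
  for (const [pattern, limit] of table) {
    if (pattern.test(model)) return limit;
  }
  return undefined;
}

const withExplicitEntry: PatternTable = [
  [/^minimax-m2\.7/i, '200k'],
  [/^minimax-m2\.5/i, '192k'],
  [/^minimax-/i, '200k'],
];

// Same table without the dedicated m2.7 line.
const fallbackOnly: PatternTable = [
  [/^minimax-m2\.5/i, '192k'],
  [/^minimax-/i, '200k'],
];
```

Under that assumption, both tables resolve `MiniMax-M2.7` and `MiniMax-M2.7-highspeed` to the same key, so dropping the explicit entry does not change behavior.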

2. Silent temperature rewriting — UX and consistency

In minimax.ts:

const temperature =
  rest.temperature == null || rest.temperature === 0
    ? 1.0
    : rest.temperature;

Two small issues:

  • When a user explicitly sets temperature: 0, it gets silently rewritten to 1.0. This can lead to "I set it to 0, why are the outputs still random?" confusion. Consider adding a one-time debug log in the rewrite branch.
  • temperature > 1.0 is not clamped and will fail at the API with a 400. Inconsistent behavior (0 → silent rewrite; 1.5 → API error). Pick one strategy: either clamp both into (0, 1], or let both pass through.

Architecture reference: how OpenCode does it

For reference, the official OpenCode configuration uses MiniMax's Anthropic-compatible endpoint https://api.minimax.io/anthropic/v1 (reusing @ai-sdk/anthropic), which sidesteps the temperature / response_format quirks of the OpenAI-compatible layer entirely.

This path isn't applicable to qwen-code today (all providers extend DefaultOpenAICompatibleProvider — there's no Anthropic branch), so it doesn't block this PR. But it does confirm that the temperature / response_format constraints are specific to MiniMax's OpenAI-compatible surface — the direction this PR takes is correct.

Risk assessment

  • Backward compatibility: existing MiniMax-M2.5 users will now route through the new provider. temperature=0 changes from an API error to silent 1.0, and response_format gets stripped. Both previously failed at the API, so this is effectively fixing a bug rather than a regression.
  • Security: hostname validation is correct, and the spoofing test cases are thorough.

…t pattern

- buildRequest now clamps temperature > 1.0 down to 1.0 (was previously
  passed through to API and would have failed). Adds debug logs via the
  project DebugLogger when an explicit user value is rewritten so the
  adjustment is observable.
- Removed redundant /^minimax-m2.7/ entry in tokenLimits.ts; the existing
  /^minimax-/ fallback already maps M2.7 / M2.7-highspeed to 200K.

Addresses review suggestions on QwenLM#3165.
@octo-patch

Author

Thanks for the detailed review @wenshao! Addressed both points in 9417be7:

1. Redundant input pattern in tokenLimits.ts

Removed the explicit [/^minimax-m2\.7/i, LIMITS['200k']] entry. The existing [/^minimax-/i, LIMITS['200k']] fallback already covers MiniMax-M2.7 and MiniMax-M2.7-highspeed with the same 200K limit, so the dedicated line was dead code. Comment on the fallback line now mentions which models route through it.

2. Temperature > 1.0 was passed through to the API

Good catch — the previous code only normalized 0/null/undefined and let any positive value through, including 1.5, which the MiniMax API rejects. Now buildRequest clamps anything above 1.0 down to 1.0, and emits a DebugLogger line (channel MINIMAX) whenever the user's explicit value was rewritten (either 0 → 1.0 or > 1.0 → 1.0). The silent null/undefined → 1.0 default-fill is left unlogged to avoid noise.

Added two tests covering the new clamping behavior. All 17 minimax tests + 54 tokenLimits tests pass locally.

@wenshao

wenshao commented Apr 18, 2026

Collaborator

@octo-patch Hi, CI is failing on this PR — all 9 Test jobs (macOS/Ubuntu/Windows × Node 20/22/24) are broken.

Failure:

FAIL  packages/core/src/core/coreToolScheduler.test.ts > CoreToolScheduler validation retry loop detection
TypeError: this.toolRegistry.ensureTool is not a function

This was already fixed on main in #3415 (updated the ToolRegistry mock in coreToolScheduler.test.ts to include the new ensureTool() method). Please rebase/merge the latest main and re-run CI. Thanks!

- Add MiniMax-M3 to provider docs as the default (512K context, up to 128K
  output, supports image input). Keep M2.7 / M2.7-highspeed entries for
  users who want the previous generation.
- Token limits: add /^minimax-m3/i input pattern at 512K and output pattern at
  128K. M2.7 fallback unchanged at 200K input / 64K output.
- Modality: M3 enables image input; everything else under the /^minimax-/
  fallback stays text-only.
- VSCode IDE companion token-limits table: same M3 entries for consistency.
- Drop unused /^minimax-m2\.5/ pattern (no longer surfaced) along with its
  stale "MiniMax-M2.5 context window" comment on the shared 192k LIMITS slot.
- Tests: replace the M2.5 / M2.1 cases with M3 (latest) and M2.7 /
  M2.7-highspeed (fallback) input/output assertions, plus a modality test
  verifying M3 returns image:true while M2.7 stays text-only.
@octo-patch octo-patch changed the title from "feat: add MiniMax provider support" to "feat(minimax): add MiniMax provider with M3 as default" Jun 7, 2026
// MiniMax accepts temperature only in (0.0, 1.0]; rewrite invalid values.
const original = rest.temperature;
let temperature: number;
if (original == null || original === 0) {

Collaborator

[Critical] Temperature guard has a gap on the lower bound: negative values and NaN pass through all three checks (null/0/>1.0) and reach the else branch, which forwards them unchanged. The JSDoc and comment both state the valid range is (0.0, 1.0], but -0.5 and NaN bypass enforcement and will cause a MiniMax API 400 error.

Suggested change
- if (original == null || original === 0) {
+ // MiniMax accepts temperature only in (0.0, 1.0]; rewrite invalid values.
+ const original = rest.temperature;
+ let temperature: number;
+ if (original == null || !(original > 0)) {
+   temperature = 1.0;
+   if (original === 0) {
+     // Only log when the user explicitly set 0; null/undefined is a default fill.
+     debugLogger.debug(
+       `temperature=0 is not supported; using 1.0 instead (request ${userPromptId})`,
+     );
+   }
+ } else if (original > 1.0) {
+   temperature = 1.0;
+   debugLogger.debug(
+     `temperature=${original} exceeds the 1.0 max; clamping to 1.0 (request ${userPromptId})`,
+   );
+ } else {
+   temperature = original;
+ }

The !(original > 0) check catches both negative numbers and NaN in a single comparison (since NaN > 0 is false).

— qwen3.7-max via Qwen Code /review
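
The guard discussed in this thread can be pulled out into a standalone function to check each edge case. This is a sketch, not the PR's `buildRequest`: `normalizeTemperature` and its injected `debug` callback are hypothetical, and unlike the suggestion above it logs every explicit rewrite (including negatives and NaN), which is a deliberate variation.

```typescript
// Hypothetical standalone version of the (0.0, 1.0] temperature guard.
function normalizeTemperature(
  original: number | null | undefined,
  debug: (message: string) => void = () => {},
): number {
  // `!(original > 0)` is true for 0, negatives, and NaN (NaN > 0 is false).
  if (original == null || !(original > 0)) {
    // Log only explicit values; null/undefined is treated as a default fill.
    if (original != null) {
      debug(`temperature=${original} is not supported; using 1.0`);
    }
    return 1.0;
  }
  if (original > 1.0) {
    debug(`temperature=${original} exceeds the 1.0 max; clamping to 1.0`);
    return 1.0;
  }
  return original;
}

const rewrites: string[] = [];
const clamped = normalizeTemperature(Number.NaN, (m) => rewrites.push(m));
```

Writing the lower bound as `original <= 0` instead would let NaN through, since every comparison with NaN is false; negating the positive check is what closes the gap.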

Author

Fixed in 92278c4. Replaced original == null || original === 0 with original == null || !(original > 0) per your suggestion, so negative values and NaN are also rewritten to 1.0. Added tests for temperature: -0.5 and temperature: Number.NaN that pin the behavior.

// MiniMax
[/^minimax-m2\.5/i, LIMITS['64k']],
[/^minimax-m3/i, LIMITS['128k']], // MiniMax-M3: 128K output
[/^minimax-m2\.7/i, LIMITS['64k']], // MiniMax-M2.7 / M2.7-highspeed: 64K output

Collaborator

[Suggestion] OUTPUT_PATTERNS has no MiniMax fallback entry, while PATTERNS (input) retains [/^minimax-/i, LIMITS['200k']] as a fallback. This creates two issues:

  1. M2.5 regression: the old [/^minimax-m2\.5/i, LIMITS['64k']] was removed; existing M2.5 users now fall through to the default 32K output.
  2. Future variants: any new MiniMax model (M4, M3-turbo, etc.) added to PATTERNS and MODALITY_PATTERNS but not OUTPUT_PATTERNS will silently get 32K instead of a reasonable default.

Additionally, hasExplicitOutputLimit() returns false for unmatched variants, which changes applyOutputTokenLimit behavior — user-configured max_tokens passes through uncapped, risking 400 errors when input + max_output > contextWindowSize.

Suggested change
- [/^minimax-m2\.7/i, LIMITS['64k']], // MiniMax-M2.7 / M2.7-highspeed: 64K output
+ // MiniMax
+ [/^minimax-m3/i, LIMITS['128k']], // MiniMax-M3: 128K output
+ [/^minimax-m2\.7/i, LIMITS['64k']], // MiniMax-M2.7 / M2.7-highspeed: 64K output
+ [/^minimax-/i, LIMITS['64k']], // MiniMax fallback: 64K output

Apply the same fallback to packages/vscode-ide-companion/src/utils/tokenLimits.ts as well.

— qwen3.7-max via Qwen Code /review
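
The effect of the missing fallback can be sketched with a small lookup that falls back to a global default. The 32K default and the pattern entries come from the comment above; the function signatures here are hypothetical simplifications (the project's `hasExplicitOutputLimit` takes different arguments), and limits are kept as labels rather than numeric values.

```typescript
type OutputLimit = '32k' | '64k' | '128k';
type OutputTable = ReadonlyArray<readonly [RegExp, OutputLimit]>;

const DEFAULT_OUTPUT_LIMIT: OutputLimit = '32k';

// Hypothetical: true only when some pattern in the table matches the model.
function hasExplicitLimit(table: OutputTable, model: string): boolean {
  return table.some(([pattern]) => pattern.test(model));
}

// Hypothetical: first matching pattern wins, else the global default.
function outputLimit(table: OutputTable, model: string): OutputLimit {
  const hit = table.find(([pattern]) => pattern.test(model));
  return hit ? hit[1] : DEFAULT_OUTPUT_LIMIT;
}

const withoutFallback: OutputTable = [
  [/^minimax-m3/i, '128k'],
  [/^minimax-m2\.7/i, '64k'],
];

const withFallback: OutputTable = [
  ...withoutFallback,
  [/^minimax-/i, '64k'], // MiniMax fallback: 64K output
];
```

With `withoutFallback`, a model such as `MiniMax-M2.5` matches nothing and drops to the 32K default; with `withFallback` it resolves to 64K and counts as having an explicit limit.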

Author

Fixed in 92278c4. Added [/^minimax-m2\.5/i, 64K] and a [/^minimax-/i, 64K] fallback to both packages/core/src/core/tokenLimits.ts and packages/vscode-ide-companion/src/utils/tokenLimits.ts, so M2.5 keeps its 64K output limit and future MiniMax variants get a sensible default. New test covers the fallback path for unknown MiniMax variants.

return false;
}
}

Collaborator

[Suggestion] The catch {} block silently swallows URL parse errors. If a user configures a malformed MiniMax base URL (e.g., api.minimax.io/v1 without a scheme), isMiniMaxProvider returns false and the request falls through to DefaultOpenAICompatibleProvider, which sends parameters MiniMax doesn't accept (like temperature: 0 or response_format). The resulting API error gives no indication that provider mis-detection occurred.

Suggested change
} catch (e) {
debugLogger.debug(`isMiniMaxProvider: failed to parse baseUrl "${baseUrl}": ${e}`);
return false;
}

A single debug log at the detection boundary would make misconfiguration immediately visible.

— qwen3.7-max via Qwen Code /review
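
The no-scheme failure mode described above can be reproduced with the standard `URL` constructor. In this sketch, `safeHostname` and the injected `debug` callback are hypothetical stand-ins for the provider's detection code and the project's debug logger.

```typescript
type DebugFn = (message: string) => void;

// Hypothetical helper: returns the hostname, or undefined (with a debug
// message) when the input is not a parseable absolute URL.
function safeHostname(baseUrl: string, debug: DebugFn): string | undefined {
  try {
    return new URL(baseUrl).hostname;
  } catch (e) {
    debug(`failed to parse baseUrl "${baseUrl}": ${e}`);
    return undefined;
  }
}

const messages: string[] = [];
// Without a scheme, `new URL` throws, so detection would otherwise fail silently.
const host = safeHostname('api.minimax.io/v1', (m) => messages.push(m));
```

The logged message includes the offending string, which is usually enough to spot a missing `https://` prefix.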

Author

Fixed in 92278c4. The catch now logs the parse failure via debugLogger.debug instead of swallowing silently, so misconfigured base URLs like api.minimax.io/v1 (no scheme) are visible at the detection boundary. Added a test that asserts the no-scheme case returns false.

…ug log)

- Temperature guard: replace `original == null || original === 0` with
  `original == null || !(original > 0)` so negative values and NaN are
  also rewritten to 1.0 instead of being forwarded as-is. NaN > 0 is
  false, so the single comparison catches both. New tests cover
  temperature=-0.5 and temperature=NaN.
- OUTPUT_PATTERNS: add `[/^minimax-m2\.5/i, 64K]` and a `[/^minimax-/i, 64K]`
  fallback in both packages/core and packages/vscode-ide-companion so
  M2.5 keeps its 64K output limit and any future MiniMax variant gets a
  sensible 64K default instead of the global 32K. Test covers the
  fallback path.
- isMiniMaxProvider: log the parse failure inside `catch` instead of
  swallowing silently, so misconfigured base URLs (e.g.
  `api.minimax.io/v1` without a scheme) are visible at the detection
  boundary. New test covers the no-scheme case.

Co-Authored-By: Octopus <liyuan851277048@icloud.com>
@octo-patch

Author

@wenshao Pushed 92278c4 addressing the three latest review points (temperature lower-bound gap, OUTPUT_PATTERNS fallback, debug log for malformed base URLs). All targeted tests pass locally.

On the merge-conflict / older CI failure: main now ships its own minimax.ts (the basic MiniMaxOpenAICompatibleProvider), which conflicts add/add with this PR’s richer version. I held off on a full rebase here because squashing the two implementations is a design call rather than a mechanical fix — happy to follow up either by (a) rebasing and consolidating into a single provider class on top of main, or (b) restructuring this PR as enhancements on top of main’s provider, whichever you prefer. The previously-flagged coreToolScheduler.test.ts ensureTool failure is gone in the current branch (npx vitest run packages/core/src/core/coreToolScheduler.test.ts — 60/60 pass).

@DragonnZhang DragonnZhang left a comment

Collaborator

LGTM. The MiniMax provider integration is well-implemented and follows the established patterns from sibling providers (DashScope, DeepSeek).

Key observations:

  • Hostname-based detection via new URL().hostname + Set lookup correctly addresses the CodeQL substring-matching warnings.
  • Typed destructuring for response_format removal avoids any casts and delete.
  • Temperature guard comprehensively handles null, undefined, 0, negative, NaN, and values > 1.0.
  • Token limits are consistent between packages/core and packages/vscode-ide-companion, with proper fallback patterns for future MiniMax models.
  • Test coverage is thorough (15 unit tests) including edge cases for hostname spoofing, negative temperature, and NaN.
  • All previously raised review comments have been addressed in the latest commit.

— qwen-code via Qwen Code /review

@tanzhenxin

Collaborator

Thanks for the thorough work here, @octo-patch — the provider implementation and the write-up (URL hostname matching, temperature clamping, typed response_format removal) were all clean.

Closing this one as superseded: MiniMax-M3 support has since landed on main through other changes — there's now a dedicated MiniMax provider, token limits and modality defaults for the M-series, and a searchable MiniMax-M3 setup flow (#4668, #3677). With that in place, plus the merge conflicts that have accumulated here, a separate provider PR is no longer needed.

If anything you covered isn't reflected in the shipped support — e.g. the api.minimaxi.com domestic mirror or the M2.7 fallback limits — please open a focused follow-up and we'll take it. Really appreciate the contribution. 🙏

@tanzhenxin tanzhenxin closed this Jun 11, 2026

