bedrock: Model streamlining and cleanup #49287
Conversation
Remove legacy models and consolidate to:

- Anthropic Claude 4+ (11 variants with thinking modes)
- Meta Llama 4 Scout/Maverick
- Google Gemma 3 (4B, 12B, 27B)
- Mistral (Magistral Small, Large 3, Pixtral Large)
- Qwen3 VL 235B
- Amazon Nova (Lite, Pro, Premier, Nova 2 Lite)
- OpenAI GPT OSS (20B, 120B)
- MiniMax M2
- Moonshot Kimi K2 Thinking
- DeepSeek R1/V3

Update cross-region inference for all regional prefixes: us, eu, apac, au, jp, ca, us-gov, global

Co-authored-by: Ona <no-reply@ona.com>
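The cross-region inference update above amounts to prepending a geography prefix to a foundation model ID. A minimal sketch, assuming a hypothetical `inference_profile_id` helper (the function name, validation, and example model ID are assumptions; the prefix list comes from the commit message):

```rust
/// Hypothetical helper: Bedrock cross-region inference profiles are addressed
/// by prepending a geography prefix to the foundation model ID. The prefixes
/// below are the ones listed in the commit message.
fn inference_profile_id(prefix: &str, model_id: &str) -> Option<String> {
    const PREFIXES: [&str; 8] = ["us", "eu", "apac", "au", "jp", "ca", "us-gov", "global"];
    if PREFIXES.contains(&prefix) {
        Some(format!("{prefix}.{model_id}"))
    } else {
        // Unknown prefix: refuse rather than emit an invalid profile ID.
        None
    }
}

fn main() {
    // "anthropic.example-model-v1:0" is purely illustrative, not a real model ID.
    println!("{:?}", inference_profile_id("us", "anthropic.example-model-v1:0"));
    println!("{:?}", inference_profile_id("mars", "anthropic.example-model-v1:0"));
}
```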
- Replace anyhow round-trip with typed BedrockError variants (Validation, RateLimited, ServiceUnavailable, AccessDenied, InternalServer)
- Map SDK ConverseStreamError in the crate, not the provider
- Split send vs recv errors: outer Result<..., BedrockError> for API errors, inner stream Result<..., anyhow::Error> for transport errors
- Gate tool configs on supports_tool_use
- Add emitted_tool_use flag to override stop reason for tool-calling models
- Remove dead ClientError/ExtensionError variants
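As a rough illustration of the typed-error approach described above: the variant set comes from the change description, but the `map_error_code` helper and the specific SDK error-code strings are assumptions, not the crate's actual code.

```rust
use std::fmt;

/// Typed error variants matching the set named in the change description.
#[derive(Debug, PartialEq)]
enum BedrockError {
    Validation(String),
    RateLimited(String),
    ServiceUnavailable(String),
    AccessDenied(String),
    InternalServer(String),
}

impl fmt::Display for BedrockError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            BedrockError::Validation(m) => write!(f, "validation error: {m}"),
            BedrockError::RateLimited(m) => write!(f, "rate limited: {m}"),
            BedrockError::ServiceUnavailable(m) => write!(f, "service unavailable: {m}"),
            BedrockError::AccessDenied(m) => write!(f, "access denied: {m}"),
            BedrockError::InternalServer(m) => write!(f, "internal server error: {m}"),
        }
    }
}

/// Hypothetical mapping from SDK-style error codes to typed variants, done
/// inside the crate so the provider reports clean messages instead of raw
/// API responses.
fn map_error_code(code: &str, message: &str) -> BedrockError {
    let m = message.to_string();
    match code {
        "ValidationException" => BedrockError::Validation(m),
        "ThrottlingException" => BedrockError::RateLimited(m),
        "ServiceUnavailableException" => BedrockError::ServiceUnavailable(m),
        "AccessDeniedException" => BedrockError::AccessDenied(m),
        _ => BedrockError::InternalServer(m),
    }
}

fn main() {
    let err = map_error_code("ThrottlingException", "Too many requests");
    println!("{err}");
}
```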
```rust
    GptOss120B,

    // MiniMax models
    #[serde(rename = "minimax-m2")]
```
`aws bedrock list-foundation-models` lists `minimax.minimax-m2.1`, but the docs don't.
I expect MiniMax M2.5 will be available soon too.
agreed -- will keep an eye open for the new models.
```rust
    #[serde(rename = "qwen3-coder-30b")]
    Qwen3Coder30B,
    #[serde(rename = "qwen3-coder-next")]
    Qwen3CoderNext,
```
This one doesn't show up in `list-foundation-models` or the docs, but it is mentioned in a recent post announcing new models, alongside DeepSeek V3.2, MiniMax M2.1, GLM 4.7, GLM 4.7 Flash, and Kimi K2.5.
It does show up in the OpenAI-compatible (Bedrock Mantle) model list endpoint.
Should the docs be updated to point people at using that to access other models that aren't explicitly handled in this provider?
`list-foundation-models` responses actually vary per region -- not to criticize too aggressively, but the docs suck right now; we have several conversations already underway.
These are aggregated across us-east-1 and us-west-2.
```rust
    DeepSeekV3_1,
    #[serde(rename = "deepseek-v3-2")]
    DeepSeekV3_2,
```
Thoughts on adding GLM models? Another case of them showing up in the `list-foundation-models` output, but not in the docs.
+100, I personally love the GLM models -- feel free to add them through a PR, or I'll get to it eventually.
Release Notes:

- Improved Bedrock error messages: region-locked models ask the user to try a different region, and rate limits and access errors are reported cleanly instead of as raw API responses
- Streamlined the Bedrock model list to 39 curated models
- Fixed API errors when using non-tool models in agent threads

---------

Co-authored-by: Ona <no-reply@ona.com>