
Models


Kiro gives you access to frontier and open weight AI models. Each model has different strengths, so you can pick the right one for the job or let Kiro decide for you by selecting Auto.

Quick comparison

| Model | Context window | Cost | Region | Free | Pro | Pro+ | Power | Auth |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Claude Opus 4.7 | 1M | 2.2x | us-east-1, eu-central-1 |  | ✓ | ✓ | ✓ | All |
| Claude Opus 4.6 | 1M | 2.2x | us-east-1, eu-central-1 |  | ✓ | ✓ | ✓ | All |
| Claude Opus 4.5 | 200K | 2.2x | us-east-1, eu-central-1 |  | ✓ | ✓ | ✓ | All |
| Claude Sonnet 4.6 | 1M | 1.3x | us-east-1, eu-central-1 |  | ✓ | ✓ | ✓ | All |
| Claude Sonnet 4.5 | 200K | 1.3x | us-east-1, eu-central-1 | ✓ | ✓ | ✓ | ✓ | All |
| Claude Sonnet 4.0 | 200K | 1.3x | us-east-1, eu-central-1 | ✓ | ✓ | ✓ | ✓ | All |
| Auto |  | 1.0x | us-east-1, eu-central-1 | ✓ | ✓ | ✓ | ✓ | All |
| Claude Haiku 4.5 | 200K | 0.4x | us-east-1, eu-central-1 |  | ✓ | ✓ | ✓ | All |
| DeepSeek 3.2 | 128K | 0.25x | us-east-1 | ✓ | ✓ | ✓ | ✓ | All |
| MiniMax M2.5 | 200K | 0.25x | us-east-1, eu-central-1 | ✓ | ✓ | ✓ | ✓ | All |
| GLM-5 | 200K | 0.5x | us-east-1 | ✓ | ✓ | ✓ | ✓ | All |
| MiniMax M2.1 | 200K | 0.15x | us-east-1, eu-central-1 | ✓ | ✓ | ✓ | ✓ | All |
| Qwen3 Coder Next | 256K | 0.05x | us-east-1, eu-central-1 | ✓ | ✓ | ✓ | ✓ | All |

Cost is relative to Auto (1.0x baseline). For example, a task that costs 10 credits on Auto would cost 22 credits on Opus, 4 credits on Haiku, or 0.5 credits on Qwen3 Coder Next.
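The multiplier math above can be sketched in a few lines of Python. This is an illustrative helper, not part of Kiro; the `MULTIPLIERS` values are copied from the comparison table.

```python
# Illustrative credit math: cost scales linearly with a model's
# multiplier relative to Auto's 1.0x baseline.

MULTIPLIERS = {
    "Auto": 1.0,
    "Claude Opus 4.7": 2.2,
    "Claude Haiku 4.5": 0.4,
    "Qwen3 Coder Next": 0.05,
}

def credit_cost(auto_credits: float, model: str) -> float:
    """Estimate credits for a task that would cost `auto_credits` on Auto."""
    return round(auto_credits * MULTIPLIERS[model], 2)

print(credit_cost(10, "Claude Opus 4.7"))   # 22.0
print(credit_cost(10, "Claude Haiku 4.5"))  # 4.0
print(credit_cost(10, "Qwen3 Coder Next"))  # 0.5
```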

How to switch models

Use the model dropdown in the chat interface to switch models. Your selection applies to all subsequent messages in the conversation.

Which model should you use?

| Use case | Model | Why |
| --- | --- | --- |
| General development | Auto | Routes to the optimal model per task, balances quality and cost automatically |
| Predictable baseline | Sonnet 4.0 | Consistent behavior, no routing layers, same model every time |
| Strong agentic coding | Sonnet 4.5 | Sonnet 4 capabilities plus improved planning, extended autonomous operation, and better tool usage |
| Efficient high intelligence | Sonnet 4.6 | Sonnet 4.5 capabilities with near-Opus intelligence, improved consistency and instruction following, more token efficient |
| Large codebases or specs | Opus 4.7 | Direct upgrade from Opus 4.6 with stronger agentic coding, precise instruction following, self-verification, and 3x higher resolution vision |
| Large codebases or specs | Opus 4.6 | Plans ahead across large codebases, catches its own mistakes in code review and debugging |
| Complex multi-system problems | Opus 4.5 | Strong planning and self-correction, handles ambiguity and tradeoffs across multiple systems, strong single-shot accuracy |
| Speed or credit savings | Haiku 4.5 | Near-frontier intelligence at a fraction of the cost, well suited for quick iterations and sub-agent orchestration |
| Minimal cost coding | DeepSeek 3.2 | Agentic workflows and multi-step reasoning at low cost |
| Frontier coding at low cost | MiniMax M2.5 | Near Opus-level coding results at a fraction of the cost, strong across the full development lifecycle |
| Repo-scale agentic work | GLM-5 | 200K context optimized for long-horizon agentic workflows across large codebases |
| Multilingual programming | MiniMax M2.1 | Strong across Rust, Go, C++, Kotlin, TypeScript and UI generation |
| Long coding sessions | Qwen3 Coder Next | 256K context with strong error recovery, the most cost-effective option available |
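One way to think about the tradeoff this table describes is context window versus credit multiplier. The sketch below picks the cheapest model that fits a given context size; it is a hypothetical illustration using a subset of the comparison table, not how Auto's router actually works.

```python
# Hypothetical helper: choose the cheapest model whose context window
# covers the tokens you need. Illustrates the cost/context tradeoff;
# this is NOT Auto's routing logic.

MODELS = [
    # (name, context window in tokens, credit multiplier)
    ("Claude Opus 4.7", 1_000_000, 2.2),
    ("Claude Sonnet 4.6", 1_000_000, 1.3),
    ("Claude Haiku 4.5", 200_000, 0.4),
    ("MiniMax M2.5", 200_000, 0.25),
    ("Qwen3 Coder Next", 256_000, 0.05),
]

def cheapest_with_context(tokens_needed: int) -> str:
    candidates = [m for m in MODELS if m[1] >= tokens_needed]
    if not candidates:
        raise ValueError("no model fits; split the task or trim context")
    return min(candidates, key=lambda m: m[2])[0]

print(cheapest_with_context(150_000))  # Qwen3 Coder Next
print(cheapest_with_context(500_000))  # Claude Sonnet 4.6
```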

Model availability

Model availability can vary by country or region. Kiro's model offerings align with each model provider's usage and geographic requirements for third-party services and interfaces. For more information, see each model provider's supported countries and regions, if relevant: Anthropic, Qwen, DeepSeek, Kimi, MiniMax, GLM.

Model details

Auto (recommended)

Kiro's model router. Auto combines multiple frontier models with optimization techniques to deliver the best quality-to-cost ratio. It automatically chooses the optimal model for each task, using best-in-class models such as Claude Sonnet 4, and maintains a high quality bar so results compare to or exceed those of the individual models available to you.

Claude Opus 4.7

Anthropic's latest and most capable coding model, a direct upgrade from Opus 4.6 with notable gains on the hardest software engineering tasks. Introduces adaptive thinking: the model automatically scales its internal reasoning based on task complexity. Simple questions get fast responses; complex architectural problems get deeper analysis. You'll notice it takes more time and produces more thorough answers on hard tasks, without you configuring anything. Beyond adaptive thinking, Opus 4.7 follows instructions more precisely, verifies its own outputs before reporting back, and supports 3x higher resolution vision for working with dense screenshots and diagrams. Available in us-east-1 (N. Virginia) and eu-central-1 (Frankfurt) with cross-region inference. Learn more.

Claude Opus 4.6

Top scores on Terminal-Bench 2.0 and SWE-bench Verified for agentic coding. Stays productive over longer sessions without context drift and handles multi-million-line codebases, planning upfront and adapting as needed. Strong debugging and code review capabilities let it catch its own mistakes through careful planning and self-correction. Learn more.

Claude Opus 4.5

Handles tradeoffs and ambiguity well when working across multiple systems. Strong single-shot accuracy on complex problems where you need a correct answer on the first attempt, without iterative back-and-forth. Well suited for sophisticated software development challenges that span service boundaries. Learn more.

Claude Sonnet 4.6

A full upgrade from Sonnet 4.5 that approaches Opus 4.6 intelligence while being more token efficient. Excels at iterative development workflows and maintains context across long sessions. Handles both lead agent and subagent roles in multi-model pipelines, making it well-suited for teams using Kiro powers and custom subagents. Learn more.

Claude Sonnet 4.5

Strong agentic coding with extended autonomous operation: it can work independently for hours with effective tool usage. Improved planning, system design, and security engineering compared to Sonnet 4.0. Learn more.

Claude Sonnet 4.0

Direct access to Anthropic's Claude Sonnet 4.0 for users who prefer consistent model selection. Same model for all interactions with no routing or optimization layers. Full control and complete transparency, with predictable behavior for workflows that depend on specific model characteristics. Learn more.

Claude Haiku 4.5

Anthropic's fastest model with near-frontier performance. Matches Sonnet 4 performance across reasoning and coding at more than twice the speed. Near-frontier intelligence at one-third the cost, and the first Haiku model with extended thinking capabilities. Learn more.

MiniMax M2.5

Open weight model that matches frontier-class coding performance at a fraction of the cost. Trained with reinforcement learning across hundreds of thousands of real-world environments, delivering strong results across the full development lifecycle from system design to code review. 0.25x credit multiplier with inference running in US East (N. Virginia) and EU (Frankfurt). Learn more.

GLM-5

Open weight sparse mixture-of-experts model with a 200K context window, designed for complex systems engineering and long-horizon agentic tasks. Excels at processing repository-scale context and maintaining coherence during multi-step tool use across large codebases. Well suited for cross-file migrations, full-stack feature development, and legacy refactoring where the model needs to hold the full picture. 0.5x credit multiplier with inference running in US East (N. Virginia). Learn more.

DeepSeek 3.2

Open weight model best suited for agentic workflows and code generation. Handles long tool-calling chains, stateful sessions, and multi-step reasoning well. 0.25x credit multiplier with inference running in US East (N. Virginia). Learn more.

MiniMax M2.1

Open weight model best suited for multilingual programming and UI generation. Delivers strong results across Rust, Go, C++, Kotlin, TypeScript, and others. 0.15x credit multiplier with inference running in US East (N. Virginia) and EU (Frankfurt). Learn more.

Qwen3 Coder Next

Open weight model purpose-built for coding agents with 256K context and strong error recovery. Works especially well for long agentic coding sessions in the CLI. 0.05x credit multiplier, the most cost-effective option available, with inference running in US East (N. Virginia) and EU (Frankfurt). Learn more.

How models behave differently

Not all models work the same way. Understanding these differences helps you pick the right one.

Planning depth: Opus models plan more thoroughly before acting. They consider multi-step approaches, edge cases, and potential issues upfront. Sonnet and Haiku are more direct: they start working sooner and iterate faster.

Adaptive thinking (Opus 4.7 only): Opus 4.7 introduces adaptive thinking, a distinct capability that automatically scales the model's internal reasoning based on task complexity. This is different from the general planning and self-correction that all Opus models do well. Simpler tasks get fast responses; harder problems get deeper analysis without you needing to configure anything.

Self-correction: All Opus models catch their own mistakes through planning and review. Opus 4.7 takes this further by verifying its outputs before reporting back and catching logical faults during the planning phase. If you're seeing bugs in generated code, switching to Opus can help.

Session endurance: For long-running tasks (like working through a spec), Opus models maintain focus better over extended sessions. Haiku and Sonnet are better suited for shorter, focused interactions.

Initiative level: Opus models tend to take more initiative, making broader changes when they see opportunities. Sonnet is more conservative and sticks closer to what you asked for. Choose based on whether you want the model to lead or follow.

Model lifecycle

Models in Kiro go through two stages. Each stage reflects the model's maturity and the level of support you can expect.

| Stage | Description |
| --- | --- |
| Experimental | Available for early testing and may change based on feedback. Marked in the model selector with limited region availability. |
| Active | Fully supported and recommended for production use. Available in all supported regions. |
Info: Inference requests for experimental models may be processed across multiple AWS Regions globally to optimize availability and performance. See data protection for details on cross-region inference.

Launch dates

| Model | Launched | Status |
| --- | --- | --- |
| Claude Opus 4.7 | April 16, 2026 | Experimental |
| GLM-5 | March 31, 2026 | Experimental |
| MiniMax M2.5 | March 18, 2026 | Experimental |
| Claude Sonnet 4.6 | February 17, 2026 | Active |
| DeepSeek 3.2 | February 10, 2026 | Experimental |
| MiniMax M2.1 | February 10, 2026 | Experimental |
| Qwen3 Coder Next | February 10, 2026 | Experimental |
| Claude Opus 4.6 | February 5, 2026 | Active |
| Claude Opus 4.5 | November 24, 2025 | Active |
| Claude Sonnet 4.5 | September 29, 2025 | Active |
| Auto | September 17, 2025 | Active |
| Claude Sonnet 4.0 | September 4, 2025 | Active |
| Claude Haiku 4.5 | September 4, 2025 | Active |

Best practices

  • Start with Auto for most work. It optimizes both quality and cost automatically.
  • Switch to Opus when you hit a wall on a complex problem or need sustained multi-file work.
  • Use Haiku for quick iterations, simple fixes, or when you want to conserve credits.
  • Monitor your usage in your account settings to understand how model choice affects consumption.
  • Factor model cost into your tier: If you primarily use Opus, consider Pro+ or Power for more credits. See plans and billing for details.
Page updated: May 4, 2026