Skip to content

bedrock: adaptive-thinking models truncate at 4096 tokens #4848

@madeleineostoja

Description

@madeleineostoja

What happened?

Assistant turns on amazon-bedrock with adaptive-thinking Claude models (Opus 4.6, Opus 4.7, Sonnet 4.6) are silently truncated at exactly 4096 output tokens with stopReason: "length", regardless of the model's actual maxTokens (128000 in pi-ai's registry) or any user-configured modelOverrides.maxTokens.

Calling the same model directly via the AWS CLI with explicit maxTokens: 6000 returns cleanly with outputTokens: 6000. AWS will produce well past 4096 when the field is set. Cap is on Pi's side.

Steps to reproduce

  1. Run pi against a Bedrock adaptive-thinking model:
pi --provider amazon-bedrock --model us.anthropic.claude-opus-4-7
  1. Send a single prompt that elicits >4096 output tokens of plain text, no tool use
  2. Inspect the session JSONL — assistant turn ends with stopReason: "length", usage.output: 4096.

Expected behavior

Adaptive anthropic models on Bedrock should not terminate at 4096 tokens

Version

0.75.4

Metadata

Metadata

Assignees

Labels

bugSomething isn't workinginprogressIssue is being worked on

Type

No type
No fields configured for issues without a type.

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions