feat(bedrock): full Converse API support for cross-region inference profiles by bbo268 · Pull Request #12979 · NousResearch/hermes-agent

bbo268 · 2026-04-20T11:09:01Z

Summary

This PR adds complete Bedrock Converse API support for cross-region inference profiles (global./us./eu./ap./jp. prefixed model IDs), which are not supported by the AnthropicBedrock SDK.

Problem

The AnthropicBedrock SDK does not recognize cross-region inference profile model IDs (e.g. global.anthropic.claude-opus-4-6-v1, us.anthropic.claude-sonnet-4-6-v1). These profiles are AWS's recommended way to get automatic cross-region routing for higher availability and throughput. Users who configure these model IDs get errors because the SDK rejects the model ID format.

The Converse API path already existed but was incomplete: it lacked prompt caching, hardcoded max_tokens to 4096, double-encoded images, and had no auxiliary client.

Changes

Route cross-region profiles through Converse API (runtime_provider.py)
- Non-prefixed Claude models continue to use AnthropicBedrock SDK
- Cross-region prefixed Claude models (global./us./eu./ap./jp.) now route through Converse API
- Non-Claude models continue to use Converse API
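
The routing rule above can be sketched as a small predicate. This is a minimal illustration, not the code in runtime_provider.py: the function name `select_bedrock_path`, the return labels, and the `"anthropic.claude"` substring check are assumptions made for the example; only the prefix list comes from this PR.

```python
# Cross-region inference profile prefixes listed in this PR.
CROSS_REGION_PREFIXES = ("global.", "us.", "eu.", "ap.", "jp.")


def select_bedrock_path(model_id: str) -> str:
    """Return which client path a Bedrock model ID would take (hypothetical helper)."""
    is_cross_region = model_id.startswith(CROSS_REGION_PREFIXES)
    # Assumption: Claude models are recognizable by this substring.
    is_claude = "anthropic.claude" in model_id
    if is_claude and not is_cross_region:
        # Non-prefixed Claude models stay on the AnthropicBedrock SDK.
        return "anthropic_bedrock_sdk"
    # Prefixed Claude models and all non-Claude models use Converse.
    return "converse"
```
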
Prompt caching for Converse API (bedrock_adapter.py, run_agent.py)
- Implement inject_cache_points() with native Bedrock cachePoint blocks
- Uses system_and_3 strategy (up to 4 breakpoints) matching the Anthropic native path
- Enable caching policy for Bedrock Claude models in _anthropic_prompt_cache_policy()
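
As a rough sketch of what cache point injection might look like: the `cachePoint` block shape follows the Bedrock Converse request format, but the function below is not the PR's inject_cache_points(). It assumes that system_and_3 means one breakpoint after the system prompt plus one on each of the last three messages, and the name `inject_cache_points_sketch` is hypothetical.

```python
import copy

# Native Bedrock Converse cache marker block.
CACHE_POINT = {"cachePoint": {"type": "default"}}


def inject_cache_points_sketch(system, messages, max_message_points=3):
    """Return copies of system/messages with up to 1 + max_message_points cache points.

    Assumed strategy: one breakpoint after the system blocks, then one at the
    end of each of the last `max_message_points` messages.
    """
    system = copy.deepcopy(system)
    messages = copy.deepcopy(messages)
    if system:
        system.append(copy.deepcopy(CACHE_POINT))
    tail = messages[-max_message_points:] if max_message_points > 0 else []
    for msg in tail:
        msg["content"].append(copy.deepcopy(CACHE_POINT))
    return system, messages
```

The inputs are deep-copied so the caller's conversation history is not mutated between turns.
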
Fix image encoding (bedrock_adapter.py)
- Bedrock Converse expects raw bytes in source.bytes (boto3 handles base64 on the wire)
- Previous code passed the base64 string directly, causing double-encoding and "Could not process image" errors
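
A minimal sketch of the corrected conversion, assuming images arrive as base64 data URLs. The helper name `data_url_to_converse_image` and the MIME mapping are illustrative, not taken from bedrock_adapter.py; the output shape (`image.format`, `image.source.bytes`) is the Converse image block.

```python
import base64

# Assumed mapping from MIME type to the Converse image "format" field.
_MIME_TO_FORMAT = {
    "image/png": "png",
    "image/jpeg": "jpeg",
    "image/jpg": "jpeg",
    "image/gif": "gif",
    "image/webp": "webp",
}


def data_url_to_converse_image(data_url: str) -> dict:
    """Decode a base64 data URL into a Converse image block with raw bytes."""
    header, _, payload = data_url.partition(",")
    if not header.startswith("data:") or not header.endswith(";base64"):
        raise ValueError("expected a base64 data URL")
    mime = header[len("data:"):-len(";base64")]
    fmt = _MIME_TO_FORMAT.get(mime)
    if fmt is None:
        raise ValueError(f"unsupported image type: {mime}")
    # Decode here: boto3 base64-encodes bytes on the wire, so passing the
    # base64 string through unchanged would encode the image twice.
    return {"image": {"format": fmt, "source": {"bytes": base64.b64decode(payload)}}}
```
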
Dynamic max_tokens (run_agent.py)
- Replace hardcoded max_tokens=4096 with model-aware output cap from _get_anthropic_max_output()
- Prevents silent truncation of long responses (e.g. Opus 4.7 supports 128K output)
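
One way to picture the change: the Converse `inferenceConfig.maxTokens` value comes from a per-model lookup instead of a constant. The sketch below takes the lookup as a parameter because the real signature of _get_anthropic_max_output() is not shown in this PR; `build_inference_config` and the fallback behavior are assumptions.

```python
from typing import Callable, Optional

# The previously hardcoded value, kept here only as an assumed fallback.
FALLBACK_MAX_TOKENS = 4096


def build_inference_config(
    model_id: str,
    get_max_output: Callable[[str], Optional[int]],
) -> dict:
    """Build a Converse inferenceConfig with a model-aware output cap.

    `get_max_output` stands in for the project's lookup helper.
    """
    cap = get_max_output(model_id)
    return {"maxTokens": cap if cap else FALLBACK_MAX_TOKENS}
```
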
Client timeout and retries (bedrock_adapter.py)
- Increase read_timeout from 60s to 300s (large models with 500K+ context can have a time to first token, TTFT, above 60s)
- Add adaptive retry config (3 attempts)
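
With botocore, settings like these are usually passed through a `Config` object. The sketch below assumes "3 attempts" means three total attempts (hence `total_max_attempts`); the PR does not say which retry key it uses, and `make_bedrock_runtime_client` is a hypothetical helper.

```python
# Keyword arguments for botocore.config.Config, kept as plain data.
BEDROCK_CONFIG_KWARGS = {
    "read_timeout": 300,  # seconds; up from 60
    "retries": {"total_max_attempts": 3, "mode": "adaptive"},
}


def make_bedrock_runtime_client(region_name: str):
    """Create a bedrock-runtime client with the longer timeout and adaptive retries."""
    # Imported lazily so this module loads even where boto3 is not installed.
    import boto3
    from botocore.config import Config

    return boto3.client(
        "bedrock-runtime",
        region_name=region_name,
        config=Config(**BEDROCK_CONFIG_KWARGS),
    )
```
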
Fix usage metrics (usage_pricing.py)
- Include bedrock_converse in Anthropic-style usage parsing
- Previously cache_read/cache_creation metrics were zeroed out for Converse mode
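
The shape of this fix can be sketched as follows. It is a simplified stand-in for normalize_usage(), not the project's code: the output key names and the OpenAI-style field names in the fallback branch are assumptions, while the api_mode values and the two cache field names come from this PR.

```python
# API modes whose usage payloads carry Anthropic-style cache fields.
ANTHROPIC_STYLE_MODES = {"anthropic_messages", "bedrock_converse"}


def normalize_usage_sketch(usage: dict, api_mode: str) -> dict:
    """Normalize a usage payload, keeping cache metrics for Anthropic-style modes."""
    if api_mode in ANTHROPIC_STYLE_MODES:
        return {
            "input_tokens": usage.get("input_tokens", 0),
            "output_tokens": usage.get("output_tokens", 0),
            "cache_read_tokens": usage.get("cache_read_input_tokens", 0),
            "cache_creation_tokens": usage.get("cache_creation_input_tokens", 0),
        }
    # Generic branch: no cache fields are read, so cache metrics stay at zero.
    return {
        "input_tokens": usage.get("prompt_tokens", 0),
        "output_tokens": usage.get("completion_tokens", 0),
        "cache_read_tokens": 0,
        "cache_creation_tokens": 0,
    }
```
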
Bedrock auxiliary client (auxiliary_client.py)
- Add _BedrockCompletionsAdapter and BedrockAuxiliaryClient for auxiliary tasks
- Allows context compression, session search, web extract, vision to use Bedrock-hosted models
- Same AWS credential chain, no separate API key needed
- Add provider aliases: aws/aws-bedrock/amazon-bedrock/amazon → bedrock
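
The alias handling amounts to a lookup before provider dispatch. A minimal sketch, assuming names are matched case-insensitively; `resolve_provider` is a hypothetical helper and only the alias list is taken from this PR.

```python
# Provider aliases added in this PR, all resolving to "bedrock".
PROVIDER_ALIASES = {
    "aws": "bedrock",
    "aws-bedrock": "bedrock",
    "amazon-bedrock": "bedrock",
    "amazon": "bedrock",
}


def resolve_provider(name: str) -> str:
    """Map a user-supplied provider name to its canonical form."""
    key = name.strip().lower()  # assumption: matching ignores case and padding
    return PROVIDER_ALIASES.get(key, key)
```
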

Testing

Tested in production with:

global.anthropic.claude-opus-4-6-v1 (main model)
global.anthropic.claude-sonnet-4-6-v1 (compression)
global.anthropic.claude-haiku-4-5-20251001-v1:0 (auxiliary)

Verified:

✅ Prompt caching active (~75% input cost reduction on multi-turn)
✅ Images processed correctly (vision tool works)
✅ Long outputs not truncated (tested 8K+ token responses)
✅ Cache metrics properly reported in usage stats
✅ Auxiliary tasks (compression, session search) work via Bedrock
✅ Non-cross-region Claude models still route through AnthropicBedrock SDK (backwards compatible)

Breaking Changes

None. This is purely additive — existing configurations using non-prefixed Bedrock model IDs are unaffected.

- Add inject_cache_points() to insert cachePoint blocks for Bedrock Converse API prompt caching (system_and_3 strategy, up to 4 breakpoints)
- Add _model_supports_prompt_caching() allowlist (Claude family only)
- Fix image base64 double-encoding: Bedrock expects raw bytes in source.bytes, not the base64 string (boto3 handles wire encoding)
- Increase bedrock-runtime client timeout from 60s to 300s with adaptive retries (large models with 500K+ context can have TTFT > 60s)

…e path
- Replace hardcoded max_tokens=4096 with model-aware output cap from _get_anthropic_max_output() (e.g. 128K for claude-opus-4-7)
- Pass enable_caching flag to build_converse_kwargs() when prompt caching is active
- Extend _anthropic_prompt_cache_policy() to return (True, True) for Bedrock Converse Claude models (was previously only native Anthropic)

… API
AnthropicBedrock SDK does not support cross-region inference profiles (global./us./eu./ap./jp. prefixed model IDs). Route these models through the Converse API path which handles them natively, while keeping non-prefixed Claude models on the AnthropicBedrock SDK path.

normalize_usage() was only matching api_mode=='anthropic_messages' for cache metrics extraction. Bedrock Converse returns the same cache_read_input_tokens / cache_creation_input_tokens fields but was falling through to the generic OpenAI else branch, zeroing out all cache hit/miss statistics.

Add _BedrockCompletionsAdapter and BedrockAuxiliaryClient so that auxiliary tasks (context compression, session search, web extract, vision) can use Bedrock-hosted models without a separate API key. Uses the same AWS credential chain as the main model. Also add provider aliases: aws, aws-bedrock, amazon-bedrock, amazon → bedrock.

bbo268 added 5 commits April 20, 2026 11:07

alt-glitch added the type/feature, P2, comp/agent, and provider/bedrock labels on Apr 22, 2026

alt-glitch mentioned this pull request May 22, 2026

feat(bedrock): add prompt caching support for Converse API #30284

Open

pgregg88 mentioned this pull request May 24, 2026

fix(pricing): strip Bedrock regional prefixes for cost lookup #19797

Closed

alt-glitch mentioned this pull request May 27, 2026

bedrock_adapter: image data URLs sent as base64 string instead of raw bytes — all image uploads rejected by Bedrock 'Failed to sanitize image' #33317

Open

bbo268 wants to merge 5 commits into NousResearch:main from bbo268:feat/bedrock-converse-full-support
