feat: add AWS Bedrock provider #4346
Adds first-class Bedrock support using the anthropic SDK's native AnthropicBedrock client (anthropic[bedrock] extra), which handles AWS SigV4 signing internally via botocore.
- hermes_cli/auth.py: add "bedrock" to PROVIDER_REGISTRY with auth_type="aws_credentials"; add aliases (aws, aws-bedrock, amazon-bedrock); add auto-detect from AWS_ACCESS_KEY_ID + AWS_SECRET_ACCESS_KEY (lowest priority in chain); add resolve_bedrock_credentials() and get_bedrock_auth_status()
- hermes_cli/models.py: add provider label, aliases, and curated model list (us.* and global.* cross-region IDs)
- hermes_cli/config.py: register AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_SESSION_TOKEN, AWS_REGION as optional env vars with category "provider"
- hermes_cli/runtime_provider.py: resolve Bedrock in both _resolve_explicit_runtime() and resolve_runtime_provider(), returning api_mode="anthropic_messages" with AWS credential fields
- hermes_cli/setup.py: add default Bedrock model list for setup wizard
- agent/anthropic_adapter.py: add build_anthropic_bedrock_client(); add is_bedrock param to normalize_model_name() and build_anthropic_kwargs() so Bedrock model IDs (us.anthropic.*, ARNs) are never transformed
- pyproject.toml: add [bedrock] optional dependency group
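The alias handling and lowest-priority auto-detection described for hermes_cli/auth.py could look roughly like the sketch below. The function names, the alias dict, and the `higher_priority` parameter are illustrative assumptions, not the code in this PR; only the alias strings and the two environment variable names come from the commit message.

```python
from typing import Callable, Mapping, Optional, Sequence, Tuple

# Aliases listed in the commit message; the dict name is hypothetical.
PROVIDER_ALIASES = {
    "aws": "bedrock",
    "aws-bedrock": "bedrock",
    "amazon-bedrock": "bedrock",
}

# A (provider name, detection predicate) pair; hypothetical shape.
Check = Tuple[str, Callable[[Mapping[str, str]], bool]]


def normalize_provider(name: str) -> str:
    """Map an alias to its canonical provider name (case-insensitive)."""
    key = name.strip().lower()
    return PROVIDER_ALIASES.get(key, key)


def has_aws_key_pair(env: Mapping[str, str]) -> bool:
    """True only when both halves of the static key pair are set."""
    return bool(env.get("AWS_ACCESS_KEY_ID")) and bool(env.get("AWS_SECRET_ACCESS_KEY"))


def auto_detect_provider(
    env: Mapping[str, str],
    higher_priority: Sequence[Check] = (),
) -> Optional[str]:
    """Try every other provider first; Bedrock is the last resort."""
    for name, check in higher_priority:
        if check(env):
            return name
    return "bedrock" if has_aws_key_pair(env) else None
```

Keeping Bedrock last means that AWS keys exported for unrelated tooling do not override a provider the user configured explicitly.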
Propagates AWS credentials across all execution paths so Bedrock works in CLI, gateway (Telegram/Discord/Slack/etc.), and subagent delegation.
- run_agent.py: add aws_access_key/secret/session_token/region params to AIAgent.__init__(); set api_mode="anthropic_messages" for bedrock; build AnthropicBedrock client in the anthropic_messages branch; pass is_bedrock to build_anthropic_kwargs() to preserve model IDs
- cli.py: skip api_key/base_url validation for bedrock; store AWS creds from runtime dict; pass them to AIAgent; include AWS fields in credentials_changed detection to handle STS token rotation
- gateway/run.py: propagate AWS fields from _resolve_runtime_agent_kwargs(); pass them through _resolve_turn_agent_config(); bypass api_key guard for bedrock provider in all four check sites
- agent/smart_model_routing.py: include AWS fields in both primary passthrough runtime dicts so smart routing fallbacks keep credentials
- tools/delegate_tool.py: inherit AWS credentials from parent agent so subagents work under Bedrock
- hermes_cli/main.py: add _model_flow_bedrock() setup wizard, which prompts for credentials, saves to .env, and selects from curated model list or free-text ARN/custom ID; add "AWS Bedrock" to provider selection menu
- tests/conftest.py: clear AWS_ACCESS_KEY_ID/SECRET/SESSION_TOKEN/REGION in the autouse isolation fixture to prevent real credentials from interfering with provider auto-detection tests
- agent/anthropic_adapter.py: remove _COMMON_BETAS from AnthropicBedrock client, since Bedrock may not support all beta headers and would return a 4xx on every request; wrap AnthropicBedrock() construction in try/except to give a clear error message for missing boto3 or invalid credentials/region
- run_agent.py: pass is_bedrock=(self.provider == "bedrock") to the three secondary build_anthropic_kwargs call sites (memory flush, iteration-limit summary, retry summary) so Bedrock model IDs are never transformed at those call sites
- cli.py: include aws_access_key/secret/session_token/region in the primary dict passed to resolve_turn_route(), so smart routing fallbacks carry the correct Bedrock credentials
- agent/smart_model_routing.py: include aws_* fields in the smart routing success-path runtime dict, mirroring the two fallback dicts that were already fixed in the previous commit
Replace Bedrock-specific abstractions with generic ones so adding Vertex, Azure AI Foundry, or similar platform-auth providers requires minimal changes to the routing/plumbing layers. Three key generalizations:
1. is_bedrock -> preserve_model_id (anthropic_adapter.py, run_agent.py)
   Platform-auth providers use opaque model IDs that must not be transformed. The flag name no longer references a specific provider. AIAgent derives this from _PLATFORM_AUTH_PROVIDERS at init time.
2. uses_platform_auth on ProviderConfig (auth.py)
   New boolean field replaces hardcoded != "bedrock" guard checks in cli.py and gateway/run.py. Runtime dicts carry the flag so consumers never need to import the registry. Adds is_platform_auth_provider() helper for convenience.
3. platform_credentials envelope (runtime_provider.py, cli.py, gateway/run.py, smart_model_routing.py, delegate_tool.py, run_agent.py)
   Replaces 4 flat aws_* fields with a single opaque dict that flows untouched through all routing layers. Only two endpoints unpack it: resolve_bedrock_credentials() (produces it) and build_anthropic_bedrock_client() (consumes it). Adding a new platform-auth provider now requires zero changes to the routing plumbing.
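A minimal sketch of the envelope idea, assuming a dataclass-based registry: the routing layer copies the runtime dict without looking inside platform_credentials, and the auth guard consults the uses_platform_auth flag instead of comparing provider names. ProviderConfig's fields beyond uses_platform_auth, the "example-key-provider" entry, and the build_runtime/has_usable_auth/route helpers are hypothetical and only illustrate the shape.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass(frozen=True)
class ProviderConfig:
    name: str
    auth_type: str
    uses_platform_auth: bool = False


# Toy registry; "example-key-provider" is a made-up API-key provider.
PROVIDER_REGISTRY: Dict[str, ProviderConfig] = {
    "bedrock": ProviderConfig("bedrock", "aws_credentials", uses_platform_auth=True),
    "example-key-provider": ProviderConfig("example-key-provider", "api_key"),
}


def is_platform_auth_provider(provider: str) -> bool:
    config = PROVIDER_REGISTRY.get(provider)
    return bool(config and config.uses_platform_auth)


def build_runtime(
    provider: str,
    api_key: Optional[str] = None,
    platform_credentials: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Runtime dict carries the flag so consumers need not import the registry."""
    return {
        "provider": provider,
        "api_key": api_key,
        "uses_platform_auth": is_platform_auth_provider(provider),
        "platform_credentials": platform_credentials,
    }


def has_usable_auth(runtime: Dict[str, Any]) -> bool:
    """Generic guard: an API key is required only without platform auth."""
    return bool(runtime.get("uses_platform_auth") or runtime.get("api_key"))


def route(runtime: Dict[str, Any], model: str) -> Dict[str, Any]:
    """Routing layer: shallow copy, never unpacks the credentials envelope."""
    routed = dict(runtime)
    routed["model"] = model
    return routed
```

Because the routing code treats the envelope as opaque, a second platform-auth provider would only need a registry entry plus its own producer and consumer functions.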
Replace the duplicated _PLATFORM_AUTH_PROVIDERS frozenset in run_agent.py with a direct call to is_platform_auth_provider() from hermes_cli/auth.py. No circular import risk since neither auth.py nor runtime_provider.py imports run_agent at the module level.
Nice work on this, especially the CLI integration. FWIW I've been working on a similar adapter in my fork (https://github.com/ptlally/hermes-agent/tree/feature/bedrock-provider) that takes a different approach: it uses the Converse API directly via httpx + botocore SigV4 rather than the AnthropicBedrock client. The main upside is it supports non-Anthropic models on Bedrock (Nova, Llama, Mistral, etc.) and handles IAM instance roles natively with no env vars needed. Your CLI work + my adapter backend could be a nice combo. Would you be interested in collaborating on a combined version? Happy to merge my adapter into your branch if that's easier.
62 tests across two files covering auth registry, credential resolution, provider auto-detection priority, runtime provider resolution, model catalog, AnthropicBedrock client builder, and model name preservation.
@ptlally that's a valid point, this is Anthropic-only as-is. I'm down to merge the two and collaborate on it 👍🏼
Awesome! I'm going to fork your branch and open a PR into it with my Converse API adapter changes. Does that work for you, or would you prefer I do it differently?
@ptlally that works fine 👍🏼
Swap the bedrock provider from anthropic_messages (AnthropicBedrock SDK) to bedrock_converse (Converse API via httpx + SigV4). Supports all Bedrock models (Claude, Nova, Llama, Mistral, DeepSeek, etc.) and handles IAM instance roles natively. Includes tool-not-supported fallback, streaming via ConverseStream, prompt caching for Claude models, and property-based tests.
Add AWS Profile and IAM instance role auth options to the bedrock provider setup wizard. Users on EC2/ECS no longer need explicit access keys. Also adds bedrock to _has_any_provider_configured for IAM role detection.
- Add centralized _strip_region_prefix() helper for handling all known Bedrock region prefixes (us, eu, apac, global, us-gov, jp, au)
- Add base Claude 4 model entries to metadata table for prefix fallback
- Add property tests for all new region prefixes
- Fixes: global.anthropic.* models now correctly resolve metadata instead of falling back to generic 8192 max_output_tokens
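The prefix fallback above can be sketched as follows. The prefix list and the generic 8192 default come from the commit message; the function bodies, the metadata table, and the "vendor.example-model-v1:0" entry with its 32000 value are placeholders, not the real metadata.

```python
from typing import Dict

# Region prefixes listed in the commit message.
REGION_PREFIXES = frozenset({"us", "eu", "apac", "global", "us-gov", "jp", "au"})

GENERIC_MAX_OUTPUT_TOKENS = 8192  # generic fallback mentioned in the commit

# Placeholder table keyed by base (unprefixed) model ID; values are made up.
MODEL_METADATA: Dict[str, Dict[str, int]] = {
    "vendor.example-model-v1:0": {"max_output_tokens": 32000},
}


def strip_region_prefix(model_id: str) -> str:
    """Drop a leading cross-region prefix such as 'us.' or 'global.'."""
    head, sep, rest = model_id.partition(".")
    if sep and rest and head in REGION_PREFIXES:
        return rest
    return model_id


def max_output_tokens(model_id: str) -> int:
    """Look up the exact ID first, then the base ID, then the generic default."""
    entry = MODEL_METADATA.get(model_id) or MODEL_METADATA.get(strip_region_prefix(model_id))
    return entry["max_output_tokens"] if entry else GENERIC_MAX_OUTPUT_TOKENS
```

Splitting on the first dot and checking the whole head against the set avoids a bug where "us" would match the start of "us-gov".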
- Add bedrock_converse branch to _trunc_content extraction in the finish_reason='length' block of run_conversation()
- Extracts text content from the normalized Bedrock response tuple (assistant_message.content) to detect when the model spent all output tokens on reasoning
- Add 3 unit tests: partial text (not exhausted), reasoning-only (exhausted), empty content (exhausted)
- Tests mock _interruptible_api_call directly with pre-normalized tuples, matching the pattern used by existing Nous credential refresh tests
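The three cases named in the unit tests suggest a check along these lines. This is a simplified sketch: it assumes the normalized assistant message exposes a string (or None) `content` attribute, and the helper names and the `reasoning` attribute on the sample objects are hypothetical.

```python
from types import SimpleNamespace
from typing import Any


def visible_text(assistant_message: Any) -> str:
    """Return the user-visible text, or '' when content is missing or not a string."""
    content = getattr(assistant_message, "content", None)
    return content.strip() if isinstance(content, str) else ""


def output_exhausted_on_reasoning(finish_reason: str, assistant_message: Any) -> bool:
    """True when the response hit the length limit without producing visible text."""
    return finish_reason == "length" and not visible_text(assistant_message)


# Sample messages mirroring the three test cases in the commit message.
partial = SimpleNamespace(content="The answer is", reasoning="short plan")
reasoning_only = SimpleNamespace(content="", reasoning="long chain of thought")
empty = SimpleNamespace(content=None, reasoning=None)
```

Partial text counts as a normal truncation, while the reasoning-only and empty cases indicate that the output budget was consumed before any answer appeared.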
…Bedrock models via AWS credentials'
…arsing
- Parse foundation-model and inference-profile ARNs to extract model IDs for metadata lookup
- Pass through opaque application-inference-profile ARNs (fall back to auto-detect)
- Add has_bedrock_model_metadata() helper for setup wizard to check resolvability
- Prompt for context_length when model metadata is unknown (matches custom endpoint flow)
- Extract _parse_token_count() shared helper supporting k/K/m/M suffixes (e.g. 200k, 1.2M)
- Replace inline int() parsing in both custom endpoint and Bedrock setup flows
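The two parsing helpers described above might be sketched like this. Two assumptions are worth flagging: the suffixes are treated as decimal multipliers (k = 1,000, M = 1,000,000), and ARNs are assumed to follow the usual arn:partition:service:region:account:resource layout. The function names and return conventions are illustrative, not the PR's.

```python
import re
from typing import Optional

_TOKEN_COUNT = re.compile(r"^(\d+(?:\.\d+)?)([kKmM]?)$")
_MULTIPLIER = {"": 1, "k": 1_000, "m": 1_000_000}  # decimal multipliers (assumption)


def parse_token_count(text: str) -> Optional[int]:
    """Parse '200000', '200k', or '1.2M' into an int; None when invalid."""
    match = _TOKEN_COUNT.match(text.strip())
    if not match:
        return None
    number, suffix = match.groups()
    value = round(float(number) * _MULTIPLIER[suffix.lower()])
    return value if value > 0 else None


def model_id_from_arn(arn: str) -> Optional[str]:
    """Extract the model ID from foundation-model / inference-profile ARNs.

    Returns None for application-inference-profile ARNs (opaque IDs) and for
    anything that is not a Bedrock ARN, so callers can fall back to auto-detect.
    """
    # maxsplit=5 keeps colons inside the resource part, e.g. the ':0' version.
    parts = arn.split(":", 5)
    if len(parts) != 6 or parts[0] != "arn" or parts[2] != "bedrock":
        return None
    kind, sep, resource_id = parts[5].partition("/")
    if sep and resource_id and kind in ("foundation-model", "inference-profile"):
        return resource_id
    return None
```

Returning None rather than raising lets the setup wizard decide whether to prompt for context_length instead.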
…ResponseStream in addition to vanilla bedrock:InvokeModel during setup flow.
- Remove redundant cross-region duplicates from metadata table (_strip_region_prefix handles these)
- Fix aliases to resolve to base model IDs instead of hardcoding us. prefix
- Add short-name entries for inference profile resolution (e.g. anthropic.claude-sonnet-4-6)
- Expand metadata table with tool-capable models from Bedrock inference profiles
- Fix test_cross_region_prefix_preserved to not depend on prefixed metadata entries
…port
# Conflicts:
# agent/anthropic_adapter.py
# agent/smart_model_routing.py
# cli.py
# gateway/run.py
# hermes_cli/auth.py
# hermes_cli/main.py
# pyproject.toml
# run_agent.py
# tests/conftest.py
# tools/delegate_tool.py
# uv.lock
@ptlally merge conflicts have been resolved. I'm going on vacation, so I'll be out for 1 week; can you take over this? Mainly try it out locally and then see if we can get a maintainer's attention in order to get it merged :D Thank you!
Hey @richin13 — nice work on this. We've got a parallel implementation in #7920 that covers similar ground (Converse API adapter, IAM credential chain, streaming, tool calling). A few things we added on top that might be worth comparing:
107 tests, 4 models verified end-to-end on EC2. Happy to collaborate if the maintainers want to merge parts from both PRs.
Hey @JiaDe-Wu, thanks for the heads up, this is super helpful! I’ve been working with @richin13’s branch and just finished getting it rebased + tested locally (tool calls, multi-turn, etc.), and was planning to validate on AWS next. I haven’t had a chance to dig into #7920 yet, but it looks very aligned with the direction we were heading. I’ll take a closer look and am happy to help converge the approaches so we can get a single clean implementation merged.
Also, I haven’t had a chance to dig in fully yet, but quick question: does the dynamic model discovery have a fallback for environments where ListFoundationModels / ListInferenceProfiles aren’t permitted? I do like the dynamic discovery approach, but we had intentionally leaned toward a static list initially to avoid requiring those permissions, so I'm just curious how that case is handled.
Thanks for this thorough PR, @richin13 and @ptlally — the work here (and the collaboration with @JiaDe-Wu) clearly helped shape what landed on main. This is an automated hermes-sweeper review. AWS Bedrock support has since been implemented on Evidence:
The discussion in this thread (especially @JiaDe-Wu's comment noting the parallel work in #7920) directly informed the salvage path. Thanks again to everyone who contributed.
What does this PR do?
Adds AWS Bedrock as a first-class inference provider using the Bedrock Converse API. Hermes can run Bedrock models from the CLI, gateway, and delegated subagents using AWS access keys, AWS profiles, or IAM instance roles. The implementation also preserves opaque Bedrock model IDs and ARNs, handles cross-region inference profiles, and adds Bedrock-specific streaming/truncation behavior.
Related Issue
Fixes #3863
Type of Change
Changes Made
- api_mode="bedrock_converse"
- uses_platform_auth, platform_credentials, and model-ID preservation
- context_length
- run_agent paths
How to Test
- uv sync --extra dev --extra bedrock
- uv run --extra dev --extra bedrock pytest -o addopts='' tests/test_bedrock_provider.py tests/agent/test_anthropic_adapter_bedrock.py tests/agent/test_bedrock_adapter.py tests/agent/test_bedrock_credentials.py tests/agent/test_bedrock_integration.py tests/agent/test_bedrock_streaming.py tests/run_agent/test_run_agent.py -q
- AWS_PROFILE
- hermes chat --provider bedrock --model us.anthropic.claude-sonnet-4-20250514-v1:0 -q "Hello"
- hermes setup
Checklist
Code
- fix(scope):, feat(scope):, etc.)
- pytest tests/ -q and all tests pass
Documentation & Housekeeping
- docs/, docstrings), or N/A
- cli-config.yaml.example if I added/changed config keys, or N/A
- CONTRIBUTING.md or AGENTS.md if I changed architecture or workflows, or N/A