feat: add AWS Bedrock provider #4346

Closed
richin13 wants to merge 19 commits into NousResearch:main from richin13:feature/bedrock-support

Conversation

richin13 commented Mar 31, 2026


What does this PR do?

Adds AWS Bedrock as a first-class inference provider using the Bedrock Converse API. Hermes can run Bedrock models from the CLI, gateway, and delegated subagents using AWS access keys, AWS profiles, or IAM instance roles. The implementation also preserves opaque Bedrock model IDs and ARNs, handles cross-region inference profiles, and adds Bedrock-specific streaming/truncation behavior.

Related Issue

Fixes #3863

Type of Change

  • ✨ New feature (non-breaking change that adds functionality)

Changes Made

  • Add a dedicated Bedrock Converse adapter for request/response normalization, streaming, and tool-capability fallback
  • Wire Bedrock through auth/config/model selection/runtime resolution with api_mode="bedrock_converse"
  • Generalize runtime plumbing for platform-auth providers using uses_platform_auth, platform_credentials, and model-ID preservation
  • Extend setup to support AWS access keys, AWS profiles, and IAM instance roles
  • Support Bedrock foundation-model and inference-profile ARNs, plus cross-region model IDs and alias cleanup
  • Add setup/runtime handling for unknown model metadata by prompting for context_length
  • Add Bedrock-specific truncation handling for thinking-budget exhaustion
  • Add comprehensive tests for provider resolution, adapter behavior, streaming, integration, and run_agent paths
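
As a rough illustration of the request-normalization bullet above, the sketch below converts OpenAI-style chat messages into the shape the Bedrock Converse API expects: a top-level `system` list plus `messages` whose `content` is a list of blocks. The function name `to_converse_request` is hypothetical and not taken from this PR, and the sketch covers only plain-text turns (no tool calls, images, or streaming).

```python
from typing import Any


def to_converse_request(model_id: str, messages: list[dict[str, Any]]) -> dict[str, Any]:
    """Split OpenAI-style messages into Converse 'system' and 'messages' fields.

    Hypothetical helper for illustration; not the adapter shipped in this PR.
    """
    system: list[dict[str, str]] = []
    converse_messages: list[dict[str, Any]] = []

    for msg in messages:
        text = msg.get("content") or ""
        if msg["role"] == "system":
            # Converse takes system prompts as a separate top-level list.
            system.append({"text": text})
        else:
            # User/assistant content is a list of content blocks.
            converse_messages.append({"role": msg["role"], "content": [{"text": text}]})

    request: dict[str, Any] = {"modelId": model_id, "messages": converse_messages}
    if system:
        request["system"] = system
    return request
```

The resulting dict could then be handed to whatever transport signs and sends the request; that part is outside the scope of this sketch.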

How to Test

  1. Sync the Bedrock + dev extras:
    uv sync --extra dev --extra bedrock
  2. Run targeted validation:
    uv run --extra dev --extra bedrock pytest -o addopts='' tests/test_bedrock_provider.py tests/agent/test_anthropic_adapter_bedrock.py tests/agent/test_bedrock_adapter.py tests/agent/test_bedrock_credentials.py tests/agent/test_bedrock_integration.py tests/agent/test_bedrock_streaming.py tests/run_agent/test_run_agent.py -q
  3. Configure Bedrock using one of:
    • AWS access keys
    • AWS_PROFILE
    • IAM instance role
  4. Run:
    hermes chat --provider bedrock --model us.anthropic.claude-sonnet-4-20250514-v1:0 -q "Hello"
  5. Optionally verify an inference-profile or foundation-model ARN via hermes setup
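
For step 5, it may help to see how a model ID can be pulled out of a Bedrock ARN for metadata lookup. This is a minimal sketch, assuming the standard `arn:aws:bedrock:<region>:<account>:<type>/<id>` layout; `extract_model_id` is a hypothetical name, and returning `None` for application inference profiles mirrors the "pass through opaque ARNs" behavior described in this PR rather than reproducing its actual code.

```python
from typing import Optional


def extract_model_id(model: str) -> Optional[str]:
    """Return the model ID embedded in a Bedrock ARN, or None if it is opaque.

    Plain (non-ARN) model IDs are returned unchanged. Hypothetical helper.
    """
    if not model.startswith("arn:"):
        return model

    # Split at most 5 times so colons inside the model ID (e.g. "-v1:0") survive.
    parts = model.split(":", 5)
    if len(parts) != 6 or parts[2] != "bedrock":
        return None

    resource_type, _, resource_id = parts[5].partition("/")
    if resource_type in ("foundation-model", "inference-profile") and resource_id:
        return resource_id

    # e.g. application-inference-profile/<opaque-id>: nothing to look up.
    return None
```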

Checklist

Code

  • I've read the Contributing Guide
  • My commit messages follow Conventional Commits (fix(scope):, feat(scope):, etc.)
  • I searched for existing PRs to make sure this isn't a duplicate
  • My PR contains only changes related to this fix/feature (no unrelated commits)
  • I've run pytest tests/ -q and all tests pass
  • I've added tests for my changes (required for bug fixes, strongly encouraged for features)
  • I've tested on my platform: Ubuntu 24.04 (Pop!_OS)

Documentation & Housekeeping

  • I've updated relevant documentation (README, docs/, docstrings) — or N/A
  • I've updated cli-config.yaml.example if I added/changed config keys — or N/A
  • I've updated CONTRIBUTING.md or AGENTS.md if I changed architecture or workflows — or N/A
  • I've considered cross-platform impact (Windows, macOS) per the compatibility guide — or N/A
  • I've updated tool descriptions/schemas if I changed tool behavior — or N/A

Adds first-class Bedrock support using the anthropic SDK's native
AnthropicBedrock client (anthropic[bedrock] extra), which handles
AWS SigV4 signing internally via botocore.

- hermes_cli/auth.py: add "bedrock" to PROVIDER_REGISTRY with
  auth_type="aws_credentials"; add aliases (aws, aws-bedrock,
  amazon-bedrock); add auto-detect from AWS_ACCESS_KEY_ID +
  AWS_SECRET_ACCESS_KEY (lowest priority in chain); add
  resolve_bedrock_credentials() and get_bedrock_auth_status()
- hermes_cli/models.py: add provider label, aliases, and curated
  model list (us.* and global.* cross-region IDs)
- hermes_cli/config.py: register AWS_ACCESS_KEY_ID,
  AWS_SECRET_ACCESS_KEY, AWS_SESSION_TOKEN, AWS_REGION as optional
  env vars with category "provider"
- hermes_cli/runtime_provider.py: resolve Bedrock in both
  _resolve_explicit_runtime() and resolve_runtime_provider(),
  returning api_mode="anthropic_messages" with AWS credential fields
- hermes_cli/setup.py: add default Bedrock model list for setup wizard
- agent/anthropic_adapter.py: add build_anthropic_bedrock_client();
  add is_bedrock param to normalize_model_name() and
  build_anthropic_kwargs() so Bedrock model IDs (us.anthropic.*, ARNs)
  are never transformed
- pyproject.toml: add [bedrock] optional dependency group
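
The model-ID preservation described for agent/anthropic_adapter.py can be pictured roughly as follows. This is only a sketch: the `is_bedrock` parameter name comes from the commit message, but the normalization shown for the non-Bedrock path (stripping an `anthropic/` vendor prefix) is an assumption made for illustration, not the adapter's real logic.

```python
def normalize_model_name(model: str, is_bedrock: bool = False) -> str:
    """Normalize a model name, leaving Bedrock IDs and ARNs untouched.

    The non-Bedrock branch is a hypothetical stand-in for the real rules.
    """
    if is_bedrock:
        # Bedrock IDs (us.anthropic.*, ARNs) are opaque: return them verbatim.
        return model

    name = model.strip()
    prefix = "anthropic/"
    if name.lower().startswith(prefix):
        name = name[len(prefix):]
    return name
```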
Propagates AWS credentials across all execution paths so Bedrock works
in CLI, gateway (Telegram/Discord/Slack/etc.), and subagent delegation.

- run_agent.py: add aws_access_key/secret/session_token/region params
  to AIAgent.__init__(); set api_mode="anthropic_messages" for bedrock;
  build AnthropicBedrock client in the anthropic_messages branch; pass
  is_bedrock to build_anthropic_kwargs() to preserve model IDs
- cli.py: skip api_key/base_url validation for bedrock; store AWS creds
  from runtime dict; pass them to AIAgent; include AWS fields in
  credentials_changed detection to handle STS token rotation
- gateway/run.py: propagate AWS fields from _resolve_runtime_agent_kwargs();
  pass them through _resolve_turn_agent_config(); bypass api_key guard
  for bedrock provider in all four check sites
- agent/smart_model_routing.py: include AWS fields in both primary
  passthrough runtime dicts so smart routing fallbacks keep credentials
- tools/delegate_tool.py: inherit AWS credentials from parent agent so
  subagents work under Bedrock
- hermes_cli/main.py: add _model_flow_bedrock() setup wizard — prompts
  for credentials, saves to .env, selects from curated model list or
  free-text ARN/custom ID; add "AWS Bedrock" to provider selection menu
- tests/conftest.py: clear AWS_ACCESS_KEY_ID/SECRET/SESSION_TOKEN/REGION
  in the autouse isolation fixture to prevent real credentials from
  interfering with provider auto-detection tests

@richin13 changed the title from "feat: add AWS Bedrock provider — core infrastructure" to "feat: add AWS Bedrock provider" on Mar 31, 2026

- agent/anthropic_adapter.py: remove _COMMON_BETAS from AnthropicBedrock
  client — Bedrock may not support all beta headers and would return a
  4xx on every request; wrap AnthropicBedrock() construction in
  try/except to give a clear error message for missing boto3 or invalid
  credentials/region
- run_agent.py: pass is_bedrock=(self.provider == "bedrock") to the
  three secondary build_anthropic_kwargs call sites (memory flush,
  iteration-limit summary, retry summary) so Bedrock model IDs are
  never transformed at those call sites
- cli.py: include aws_access_key/secret/session_token/region in the
  primary dict passed to resolve_turn_route(), so smart routing
  fallbacks carry the correct Bedrock credentials
- agent/smart_model_routing.py: include aws_* fields in the smart
  routing success-path runtime dict, mirroring the two fallback dicts
  that were already fixed in the previous commit

Replace Bedrock-specific abstractions with generic ones so adding
Vertex, Azure AI Foundry, or similar platform-auth providers requires
minimal changes to the routing/plumbing layers.

Three key generalizations:

1. is_bedrock -> preserve_model_id (anthropic_adapter.py, run_agent.py)
   Platform-auth providers use opaque model IDs that must not be
   transformed. The flag name no longer references a specific provider.
   AIAgent derives this from _PLATFORM_AUTH_PROVIDERS at init time.

2. uses_platform_auth on ProviderConfig (auth.py)
   New boolean field replaces hardcoded != "bedrock" guard checks in
   cli.py and gateway/run.py. Runtime dicts carry the flag so consumers
   never need to import the registry. Adds is_platform_auth_provider()
   helper for convenience.

3. platform_credentials envelope (runtime_provider.py, cli.py,
   gateway/run.py, smart_model_routing.py, delegate_tool.py, run_agent.py)
   Replaces 4 flat aws_* fields with a single opaque dict that flows
   untouched through all routing layers. Only two endpoints unpack it:
   resolve_bedrock_credentials() (produces it) and
   build_anthropic_bedrock_client() (consumes it). Adding a new
   platform-auth provider now requires zero changes to the routing
   plumbing.

Replace the duplicated _PLATFORM_AUTH_PROVIDERS frozenset in
run_agent.py with a direct call to is_platform_auth_provider() from
hermes_cli/auth.py. No circular import risk since neither auth.py nor
runtime_provider.py imports run_agent at the module level.
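
The three generalizations above can be sketched in a few lines. The names `uses_platform_auth`, `platform_credentials`, `is_platform_auth_provider()`, `PROVIDER_REGISTRY`, and `auth_type="aws_credentials"` come from the commit messages; everything else (the other `ProviderConfig` fields, the `openrouter` entry, and `validate_runtime`) is assumed for illustration and may not match the real code.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class ProviderConfig:
    name: str
    auth_type: str
    # True for providers that authenticate via the platform (AWS, etc.)
    # rather than a bearer API key.
    uses_platform_auth: bool = False


# Hypothetical registry contents, for illustration only.
PROVIDER_REGISTRY = {
    "bedrock": ProviderConfig("bedrock", "aws_credentials", uses_platform_auth=True),
    "openrouter": ProviderConfig("openrouter", "api_key"),
}


def is_platform_auth_provider(provider: str) -> bool:
    config = PROVIDER_REGISTRY.get(provider)
    return bool(config and config.uses_platform_auth)


def validate_runtime(runtime: dict[str, Any]) -> None:
    """Hypothetical guard: only API-key providers must carry an api_key."""
    if runtime.get("uses_platform_auth"):
        # platform_credentials is an opaque dict; routing layers never unpack it.
        return
    if not runtime.get("api_key"):
        raise ValueError(f"provider {runtime.get('provider')!r} requires an api_key")
```

Because the runtime dict carries the flag itself, a consumer such as the guard above does not need to import the registry, which is the property the commit message calls out.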
@ptlally

ptlally commented Apr 1, 2026


Nice work on this — especially the CLI integration. FWIW I've been working on a similar adapter in my fork (https://github.com/ptlally/hermes-agent/tree/feature/bedrock-provider) that takes a different approach — uses the Converse API directly via httpx + botocore SigV4 rather than the AnthropicBedrock client. The main upside is it supports non-Anthropic models on Bedrock (Nova, Llama, Mistral, etc.) and handles IAM instance roles natively with no env vars needed. Your CLI work + my adapter backend could be a nice combo — would you be interested in collaborating on a combined version? Happy to merge my adapter into your branch if that's easier.

62 tests across two files covering auth registry, credential resolution,
provider auto-detection priority, runtime provider resolution, model
catalog, AnthropicBedrock client builder, and model name preservation.
@richin13

richin13 commented Apr 1, 2026

Author

> Nice work on this — especially the CLI integration. FWIW I've been working on a similar adapter in my fork (ptlally/hermes-agent@feature/bedrock-provider) that takes a different approach — uses the Converse API directly via httpx + botocore SigV4 rather than the AnthropicBedrock client. The main upside is it supports non-Anthropic models on Bedrock (Nova, Llama, Mistral, etc.) and handles IAM instance roles natively with no env vars needed. Your CLI work + my adapter backend could be a nice combo — would you be interested in collaborating on a combined version? Happy to merge my adapter into your branch if that's easier.

@ptlally that's a valid point, this is Anthropic-only as-is. The env vars are mostly for automatic model detection and the setup wizard; when they are not set, the current implementation falls back to the full AWS credential chain to resolve credentials.

i'm down to merge the two and collaborate on it 👍🏼

@ptlally

ptlally commented Apr 1, 2026


Awesome! I'm going to fork your branch and open a PR into it with my Converse API adapter changes. Does that work for you, or would you prefer I do it differently?

@richin13

richin13 commented Apr 1, 2026

Author

@ptlally that works fine 👍🏼

ptlally and others added 10 commits April 2, 2026 01:06
Swap the bedrock provider from anthropic_messages (AnthropicBedrock SDK)
to bedrock_converse (Converse API via httpx + SigV4). Supports all
Bedrock models (Claude, Nova, Llama, Mistral, DeepSeek, etc.) and
handles IAM instance roles natively.

Includes tool-not-supported fallback, streaming via ConverseStream,
prompt caching for Claude models, and property-based tests.

Add AWS Profile and IAM instance role auth options to the bedrock
provider setup wizard. Users on EC2/ECS no longer need explicit
access keys. Also adds bedrock to _has_any_provider_configured
for IAM role detection.

- Add centralized _strip_region_prefix() helper for handling all known
  Bedrock region prefixes (us, eu, apac, global, us-gov, jp, au)
- Add base Claude 4 model entries to metadata table for prefix fallback
- Add property tests for all new region prefixes
- Fixes: global.anthropic.* models now correctly resolve metadata
  instead of falling back to generic 8192 max_output_tokens
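
The prefix fallback described in this commit might look something like the sketch below. The prefix list is taken from the commit message; the function body is an assumption about how such a helper could work, not the code in the branch.

```python
# Region prefixes listed in the commit message.
_REGION_PREFIXES = frozenset({"us", "eu", "apac", "global", "us-gov", "jp", "au"})


def strip_region_prefix(model_id: str) -> str:
    """Drop a leading cross-region prefix so metadata can be looked up by base ID.

    Illustrative sketch; IDs without a known prefix are returned unchanged.
    """
    head, sep, rest = model_id.partition(".")
    if sep and rest and head in _REGION_PREFIXES:
        return rest
    return model_id
```

With a helper of this kind, `global.anthropic.*` and `us.anthropic.*` IDs both resolve against a single base entry in the metadata table, which is why the prefixed duplicates can be dropped later in the branch.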
- Add bedrock_converse branch to _trunc_content extraction in the finish_reason='length' block of run_conversation()
- Extracts text content from the normalized Bedrock response tuple (assistant_message.content) to detect when the model spent all output tokens on reasoning
- Add 3 unit tests: partial text (not exhausted), reasoning-only (exhausted), empty content (exhausted)
- Tests mock _interruptible_api_call directly with pre-normalized tuples, matching the pattern used by existing Nous credential refresh tests
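
The exhaustion check implied by the three unit tests can be summarized in a small predicate. This is a sketch under the assumption that the normalized assistant content is a string or `None`; `reasoning_exhausted` is a hypothetical name and the real logic lives inline in run_conversation().

```python
from typing import Optional


def reasoning_exhausted(finish_reason: str, content: Optional[str]) -> bool:
    """True when a length-truncated response carries no visible text.

    Hypothetical helper: treats whitespace-only or missing content as
    "the model spent its whole output budget on reasoning".
    """
    if finish_reason != "length":
        return False
    return not (content or "").strip()
```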
…arsing

- Parse foundation-model and inference-profile ARNs to extract model IDs for metadata lookup
- Pass through opaque application-inference-profile ARNs (fall back to auto-detect)
- Add has_bedrock_model_metadata() helper for setup wizard to check resolvability
- Prompt for context_length when model metadata is unknown (matches custom endpoint flow)
- Extract _parse_token_count() shared helper supporting k/K/m/M suffixes (e.g. 200k, 1.2M)
- Replace inline int() parsing in both custom endpoint and Bedrock setup flows
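
A suffix-aware parser along the lines of _parse_token_count() could be written as below. This is a sketch rather than the helper from the branch: it assumes decimal multipliers (k = 1,000, m = 1,000,000) and rejects non-positive values, neither of which is stated in the commit message.

```python
def parse_token_count(value: str) -> int:
    """Parse inputs such as '8192', '200k', or '1.2M' into a token count.

    Assumes decimal multipliers; raises ValueError on malformed input.
    """
    text = value.strip().lower()
    multiplier = 1
    if text.endswith("k"):
        multiplier, text = 1_000, text[:-1]
    elif text.endswith("m"):
        multiplier, text = 1_000_000, text[:-1]

    # float() raises ValueError for empty or non-numeric strings.
    count = round(float(text) * multiplier)
    if count <= 0:
        raise ValueError(f"token count must be positive: {value!r}")
    return count
```

Sharing one parser between the custom-endpoint and Bedrock setup flows keeps the accepted input formats identical in both prompts.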
…ResponseStream in addition to vanilla bedrock:InvokeModel during setup flow.
- Remove redundant cross-region duplicates from metadata table (_strip_region_prefix handles these)
- Fix aliases to resolve to base model IDs instead of hardcoding us. prefix
- Add short-name entries for inference profile resolution (e.g. anthropic.claude-sonnet-4-6)
- Expand metadata table with tool-capable models from Bedrock inference profiles
- Fix test_cross_region_prefix_preserved to not depend on prefixed metadata entries
…port

# Conflicts:
#	agent/anthropic_adapter.py
#	agent/smart_model_routing.py
#	cli.py
#	gateway/run.py
#	hermes_cli/auth.py
#	hermes_cli/main.py
#	pyproject.toml
#	run_agent.py
#	tests/conftest.py
#	tools/delegate_tool.py
#	uv.lock
@richin13 richin13 marked this pull request as ready for review April 10, 2026 03:03
@richin13

Author

@ptlally the merge conflicts have been resolved. I'm going on vacation and will be out for 1 week; can you take over this? Mainly try it out locally and then see if we can get a maintainer's attention in order to get it merged :D

Thank you!

@JiaDe-Wu

Contributor

Hey @richin13 — nice work on this. We've got a parallel implementation in #7920 that covers similar ground (Converse API adapter, IAM credential chain, streaming, tool calling). A few things we added on top that might be worth comparing:

  • Dynamic model discovery (ListFoundationModels + ListInferenceProfiles) with smart filtering — excludes embedding/image/voice models, deduplicates inference profiles vs foundation IDs
  • Bedrock API Key support (bedrock-mantle endpoint) as an alternative auth path
  • hermes doctor Bedrock diagnostics + hermes auth AWS identity display
  • Error classification for ThrottlingException, ModelNotReadyException, context overflow
  • Bedrock model pricing in /usage
  • IMDS credential detection for EC2/ECS/Lambda (boto3 fallback when no env vars are set)
  • Guardrails config support via config.yaml
  • One-click CloudFormation deployment template: sample-hermes-agent-on-aws-with-bedrock

107 tests, 4 models verified end-to-end on EC2. Happy to collaborate if the maintainers want to merge parts from both PRs.

@ptlally

ptlally commented Apr 12, 2026


Hey @JiaDe-Wu, thanks for the heads up, this is super helpful!

I’ve been working with @richin13’s branch and just finished getting it rebased + tested locally (tool calls, multi-turn, etc.), and was planning to validate on AWS next.

I haven’t had a chance to dig into #7920 yet, but it looks very aligned with the direction we were heading. I’ll take a closer look, and I’m happy to help converge the approaches so we can get a single clean implementation merged.

@ptlally

ptlally commented Apr 12, 2026


Also — I haven’t had a chance to dig in fully yet, but quick question: does the dynamic model discovery have a fallback for environments where ListFoundationModels / ListInferenceProfiles aren’t permitted?

I do like the dynamic discovery approach but we had intentionally leaned toward a static list initially to avoid requiring those permissions, so just curious how that case is handled.

@teknium1

Contributor

Thanks for this thorough PR, @richin13 and @ptlally — the work here (and the collaboration with @JiaDe-Wu) clearly helped shape what landed on main.

This is an automated hermes-sweeper review.

AWS Bedrock support has since been implemented on main and shipped in v2026.4.16, so this PR is superseded.

Evidence:

  • agent/bedrock_adapter.py: native Converse API adapter, added by commit 0cb8c51fa ('feat: native AWS Bedrock provider via Converse API'), primarily salvaged from the parallel PR #7920 by @JiaDe-Wu
  • agent/transports/bedrock.py: BedrockTransport wired into the pluggable transport layer
  • hermes_cli/doctor.py: hermes doctor Bedrock diagnostics (IAM, ListFoundationModels)
  • Dynamic model discovery, IAM credential chain, streaming, guardrails, and hermes auth AWS identity display are all present
  • Confirmed contained in release tag v2026.4.16 via git tag --contains 0cb8c51fa

The discussion in this thread (especially @JiaDe-Wu's comment noting the parallel work in #7920) directly informed the salvage path. Thanks again to everyone who contributed.

@teknium1 teknium1 closed this Apr 28, 2026
Development

Successfully merging this pull request may close these issues.

[Feature]: Native AWS Bedrock provider support

4 participants