feat: add Amazon Bedrock as inference provider #8832

Closed
naelmohammad wants to merge 1 commit into NousResearch:main from naelmohammad:bedrock-integration

@naelmohammad naelmohammad commented Apr 13, 2026

Add Amazon Bedrock as a first-class inference provider using the Converse API via boto3. Supports three authentication methods: Bedrock API keys (Bearer tokens), AWS profiles (~/.aws/credentials), and AWS access key pairs.
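The three authentication methods could be resolved roughly as follows. This is a minimal sketch, not the PR's code: `resolve_bedrock_auth` is a hypothetical helper, the precedence order and the `us-east-1` fallback are assumptions, and the returned `session_kwargs` are meant to be passed to `boto3.Session(...)`. For the API-key case it assumes a botocore release recent enough to read `AWS_BEARER_TOKEN_BEDROCK` from the environment on its own.

```python
import os
from typing import Mapping, Optional


def resolve_bedrock_auth(env: Optional[Mapping[str, str]] = None,
                         profile: Optional[str] = None) -> dict:
    """Pick an auth method and build keyword arguments for boto3.Session.

    Hypothetical helper. Precedence (API key, then profile, then access
    keys) is an assumption and may differ from the PR.
    """
    env = os.environ if env is None else env
    # Fallback region is an assumption, not taken from the PR.
    region = env.get("AWS_BEDROCK_REGION", "us-east-1")

    if env.get("AWS_BEARER_TOKEN_BEDROCK"):
        # Bedrock API key: the token stays in the environment; only the
        # region is passed to the session.
        return {"method": "api_key", "session_kwargs": {"region_name": region}}

    if profile:
        # Named profile from ~/.aws/credentials.
        return {"method": "profile",
                "session_kwargs": {"profile_name": profile, "region_name": region}}

    access_key = env.get("AWS_ACCESS_KEY_ID")
    secret_key = env.get("AWS_SECRET_ACCESS_KEY")
    if access_key and secret_key:
        # Explicit access key pair.
        return {"method": "access_keys",
                "session_kwargs": {"aws_access_key_id": access_key,
                                   "aws_secret_access_key": secret_key,
                                   "region_name": region}}

    raise ValueError("no Bedrock credentials found")
```

A caller would then do something like `boto3.Session(**resolve_bedrock_auth()["session_kwargs"]).client("bedrock-runtime")`.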

  • agent/bedrock_adapter.py — Converse API adapter (message conversion, tool schema translation, response normalization)

  • hermes_cli/auth.py — ProviderConfig, _resolve_bedrock_base_url(), region-aware endpoint construction, provider aliases

  • hermes_cli/providers.py — HermesOverlay with bedrock_converse transport, aliases, labels

  • hermes_cli/models.py — Curated model list with global/eu/us inference profile IDs and bare model IDs

  • hermes_cli/config.py — AWS_BEARER_TOKEN_BEDROCK and AWS_BEDROCK_REGION env vars

  • hermes_cli/runtime_provider.py — bedrock_converse in valid API modes

  • agent/models_dev.py — PROVIDER_TO_MODELS_DEV mapping

  • New api_mode 'bedrock_converse' with full lifecycle support: init, switch_model, _build_api_kwargs, _interruptible_api_call, response validation, finish_reason extraction, response normalization

  • Safe vars() call in error logging (handles dict responses from boto3)

  • Dedicated _model_flow_bedrock() with interactive setup:

    • Auth picker: existing API key, new API key, AWS profile, access keys
    • Region selector with common regions + custom input
    • Model selection from curated list
  • Persists api_mode=bedrock_converse in config.yaml

  • Added boto3 to pyproject.toml
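To illustrate what the adapter's message conversion, tool schema translation, and response normalization involve, here is a minimal sketch. It assumes OpenAI-style chat messages and tool definitions as input, which the PR does not confirm; the function names and the `finish_reason` mapping are hypothetical. It covers text-only system, user, and assistant messages and does not handle tool results or images.

```python
import json
from typing import Any

# Assumed mapping from Converse stopReason to OpenAI-style finish_reason.
_FINISH_REASONS = {
    "end_turn": "stop",
    "stop_sequence": "stop",
    "max_tokens": "length",
    "tool_use": "tool_calls",
}


def to_converse_messages(messages: list[dict[str, Any]]) -> tuple[list[dict], list[dict]]:
    """Split chat messages into Converse `system` blocks and `messages`."""
    system, converse = [], []
    for msg in messages:
        role, text = msg["role"], msg.get("content") or ""
        if role == "system":
            system.append({"text": text})
        elif role in ("user", "assistant"):
            converse.append({"role": role, "content": [{"text": text}]})
        else:
            raise ValueError(f"role not handled in this sketch: {role}")
    return system, converse


def to_converse_tools(tools: list[dict[str, Any]]) -> dict:
    """Translate function-style tool definitions into a Converse toolConfig."""
    specs = []
    for tool in tools:
        fn = tool["function"]
        schema = fn.get("parameters", {"type": "object", "properties": {}})
        specs.append({"toolSpec": {"name": fn["name"],
                                   "description": fn.get("description", ""),
                                   "inputSchema": {"json": schema}}})
    return {"tools": specs}


def normalize_response(resp: dict[str, Any]) -> dict:
    """Flatten a Converse response into text, tool calls, and finish_reason."""
    blocks = resp["output"]["message"]["content"]
    text = "".join(b["text"] for b in blocks if "text" in b)
    tool_calls = [{"id": b["toolUse"]["toolUseId"],
                   "name": b["toolUse"]["name"],
                   "arguments": json.dumps(b["toolUse"]["input"])}
                  for b in blocks if "toolUse" in b]
    finish = _FINISH_REASONS.get(resp.get("stopReason"), "stop")
    return {"content": text, "tool_calls": tool_calls, "finish_reason": finish}
```

The outputs of the first two functions would be passed as the `system`, `messages`, and `toolConfig` arguments of `bedrock-runtime`'s `converse()` call.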

Bedrock models use cross-region inference profile prefixes:

  • global. — works from any region
  • eu. — EU cross-region routing
  • us. — US cross-region routing
  • bare — single-region direct access
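As a small illustration of the prefix scheme listed above, the routing scope can be read off the model ID. `split_inference_profile` is a hypothetical helper, and it only knows the three prefixes named in this PR; Bedrock may define others.

```python
# Prefixes listed in this PR; other geographies are not covered here.
KNOWN_PREFIXES = ("global", "eu", "us")


def split_inference_profile(model_id: str) -> tuple[str, str]:
    """Return (routing scope, bare model ID) for a Bedrock model ID."""
    for prefix in KNOWN_PREFIXES:
        if model_id.startswith(prefix + "."):
            return prefix, model_id[len(prefix) + 1:]
    # No known prefix: single-region direct access.
    return "bare", model_id
```

For example, `split_inference_profile("global.anthropic.claude-sonnet-4-6")` yields `("global", "anthropic.claude-sonnet-4-6")`, while an ID without a prefix comes back unchanged with the scope `"bare"`.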

What does this PR do?

Related Issue

Fixes #
Adds Amazon Bedrock support as an inference provider using the boto3 client.

Type of Change

Feature enhancement that adds an additional provider.

  • 🐛 Bug fix (non-breaking change that fixes an issue)
  • ✨ New feature (non-breaking change that adds functionality)
  • 🔒 Security fix
  • 📝 Documentation update
  • ✅ Tests (adding or improving test coverage)
  • ♻️ Refactor (no behavior change)
  • 🎯 New skill (bundled or hub)

Changes Made

  • agent/bedrock_adapter.py — Converse API adapter (message conversion, tool schema translation, response normalization)

  • hermes_cli/auth.py — ProviderConfig, _resolve_bedrock_base_url(), region-aware endpoint construction, provider aliases

  • hermes_cli/providers.py — HermesOverlay with bedrock_converse transport, aliases, labels

  • hermes_cli/models.py — Curated model list with global/eu/us inference profile IDs and bare model IDs

  • hermes_cli/config.py — AWS_BEARER_TOKEN_BEDROCK and AWS_BEDROCK_REGION env vars

  • hermes_cli/runtime_provider.py — bedrock_converse in valid API modes

  • agent/models_dev.py — PROVIDER_TO_MODELS_DEV mapping
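The region-aware endpoint construction mentioned for hermes_cli/auth.py might look like the sketch below. This is not the PR's `_resolve_bedrock_base_url()`; `bedrock_runtime_endpoint` is a hypothetical stand-in that uses the standard `bedrock-runtime.<region>.amazonaws.com` host pattern and ignores partitions with a different domain suffix.

```python
import re

# Loose shape check for region names such as us-east-1 or ap-southeast-2.
_REGION_RE = re.compile(r"[a-z]{2}(-[a-z]+)+-\d+")


def bedrock_runtime_endpoint(region: str) -> str:
    """Build the Bedrock runtime base URL for a region (hypothetical helper)."""
    region = region.strip().lower()
    if not _REGION_RE.fullmatch(region):
        raise ValueError(f"not a valid AWS region name: {region!r}")
    return f"https://bedrock-runtime.{region}.amazonaws.com"
```

Validating the region before interpolating it keeps a mistyped custom region from the setup wizard from turning into an unreachable host name.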

How to Test

  1. Launch hermes.
  2. Select one of the Anthropic models from Amazon Bedrock:
  3. /model
    Current: global.anthropic.claude-opus-4-6-v1 on Amazon Bedrock
    Amazon Bedrock [--provider bedrock] (current):
    global.anthropic.claude-opus-4-6-v1, global.anthropic.claude-sonnet-4-6, global.anthropic.claude-haiku-4-5-20251001-v1:0, anthropic.claude-opus-4-6-v1, anthropic.claude-sonnet-4-6, anthropic.claude-haiku-4-5-20251001-v1:0 (+4 more)

Checklist

Code

  • I've read the Contributing Guide
  • My commit messages follow Conventional Commits (fix(scope):, feat(scope):, etc.)
  • I searched for existing PRs to make sure this isn't a duplicate
  • My PR contains only changes related to this fix/feature (no unrelated commits)
  • I've run pytest tests/ -q and all tests pass
  • I've added tests for my changes (required for bug fixes, strongly encouraged for features)
  • I've tested on my platform:

Documentation & Housekeeping

  • I've updated relevant documentation (README, docs/, docstrings) — or N/A
  • I've updated cli-config.yaml.example if I added/changed config keys — or N/A
  • I've updated CONTRIBUTING.md or AGENTS.md if I changed architecture or workflows — or N/A
  • I've considered cross-platform impact (Windows, macOS) per the compatibility guide — or N/A
  • I've updated tool descriptions/schemas if I changed tool behavior — or N/A

For New Skills

  • This skill is broadly useful to most users (if bundled) — see Contributing Guide
  • SKILL.md follows the standard format (frontmatter, trigger conditions, steps, pitfalls)
  • No external dependencies that aren't already available (prefer stdlib, curl, existing Hermes tools)
  • I've tested the skill end-to-end: hermes --toolsets skills -q "Use the X skill to do Y"

Screenshots / Logs

renlon commented Apr 14, 2026

Thanks for working on Bedrock support — this is a much-needed feature. I've been working on an alternative implementation and wanted to share some observations that might be useful.

Core concern: Converse API vs Anthropic SDK

This PR uses the boto3 Converse API (bedrock-runtime.converse()), which is AWS's generic multi-model API. While this works for basic chat, it loses several Claude-specific features that the Anthropic SDK's AnthropicBedrock class preserves:

| Feature | Converse API (this PR) | AnthropicBedrock SDK |
| --- | --- | --- |
| Streaming | ❌ Synchronous only (blocks until full response) | ✅ Full streaming support |
| Prompt caching | ❌ Not supported | ✅ Same cache_control as native Anthropic |
| Reasoning/thinking | ❌ Not exposed by Converse | ✅ Full thinking block support |
| Adaptive thinking | ❌ No | ✅ Budget controls (low/medium/high/max) |
| Fast mode | ❌ No | ✅ Anthropic beta supported |

For an agent framework where Claude is the primary model, losing streaming alone is a significant UX regression — users see nothing until the entire response is generated.

Alternative approach

The Anthropic Python SDK already ships with anthropic.AnthropicBedrock, which handles SigV4 signing internally and exposes the exact same messages.create() / messages.stream() interface as anthropic.Anthropic. This means the existing anthropic_messages API mode works as-is — no new message conversion, tool schema translation, or response normalization needed.

For bearer token auth (Bedrock API keys), AnthropicBedrock(api_key=bearer_token, aws_region=region) works directly.

This reduces the implementation from ~1,078 lines / 13 files to ~190 lines / 8 files (no new adapter file needed), with no new api_mode and no boto3 as a core dependency (only botocore at runtime, pulled in by the anthropic SDK's bedrock module).
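A minimal sketch of the client construction path this alternative implies, under stated assumptions: `anthropic_bedrock_kwargs` is a hypothetical helper, the `api_key` argument is taken from the comment above rather than verified against a specific SDK release, and the other keyword names should be checked against the installed `anthropic` version. The helper only builds keyword arguments, so it can be exercised without the SDK installed.

```python
from typing import Optional


def anthropic_bedrock_kwargs(region: str, *,
                             bearer_token: Optional[str] = None,
                             profile: Optional[str] = None,
                             access_key: Optional[str] = None,
                             secret_key: Optional[str] = None) -> dict:
    """Build keyword arguments for anthropic.AnthropicBedrock (hypothetical helper)."""
    kwargs = {"aws_region": region}
    if bearer_token:
        # Bedrock API key, passed as described in the comment above (unverified).
        kwargs["api_key"] = bearer_token
    elif profile:
        kwargs["aws_profile"] = profile
    elif access_key and secret_key:
        kwargs["aws_access_key"] = access_key
        kwargs["aws_secret_key"] = secret_key
    # With none of these set, the SDK is left to its default credential lookup.
    return kwargs


# Intended use, assuming the anthropic package with Bedrock support is installed:
#   client = anthropic.AnthropicBedrock(**anthropic_bedrock_kwargs("us-east-1", profile="dev"))
#   with client.messages.stream(model=..., max_tokens=..., messages=[...]) as stream:
#       for text in stream.text_stream:
#           print(text, end="")
```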

Specific items

  1. boto3 as a core dependency (pyproject.toml) — adds ~15MB (boto3 + botocore + s3transfer) to every install, even for users who don't use Bedrock. If sticking with Converse, this should be an optional extra ([bedrock]).

  2. No streaming — bedrock_client.converse() is synchronous. The converse_stream() variant exists but isn't used here, and even that returns a different event structure than what _run_agent_loop() expects.

  3. No prompt caching — The Converse API doesn't support Anthropic's cache_control breakpoints. For multi-turn agent conversations, this means ~4x higher input token costs on cached prefixes.

  4. Missing CLAUDE_CODE_USE_BEDROCK support — Claude Code users expect this env var to activate Bedrock. The PR uses AWS_BEDROCK_REGION instead of AWS_REGION and doesn't support ANTHROPIC_MODEL / ANTHROPIC_SMALL_FAST_MODEL / DISABLE_PROMPT_CACHING.
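On item 2, the event structure returned by `converse_stream()` is a sequence of single-key dicts (`messageStart`, `contentBlockDelta`, `messageStop`, `metadata`, and so on), so some glue is needed before an agent loop can consume it. A minimal sketch of that glue, assuming text-only output; `iter_text_deltas` is a hypothetical name and what `_run_agent_loop()` actually expects is not shown in this thread.

```python
from typing import Any, Iterable, Iterator


def iter_text_deltas(events: Iterable[dict[str, Any]]) -> Iterator[str]:
    """Yield text chunks from Converse stream events, ignoring everything else.

    Tool-use deltas, stop reasons, and usage metadata are skipped in this sketch.
    """
    for event in events:
        delta = event.get("contentBlockDelta", {}).get("delta", {})
        if "text" in delta:
            yield delta["text"]


# Intended use with a boto3 bedrock-runtime client:
#   response = client.converse_stream(modelId=model_id, messages=messages)
#   for chunk in iter_text_deltas(response["stream"]):
#       print(chunk, end="", flush=True)
```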

Suggestion

Consider switching the underlying transport from boto3.converse() to anthropic.AnthropicBedrock. The interactive setup wizard and curated model list from this PR are valuable and could be kept — the change is primarily in agent/bedrock_adapter.py (which could be removed entirely) and the client construction path.

Happy to discuss further or share code from the alternative implementation.

@alt-glitch alt-glitch added type/feature New feature or request P3 Low — cosmetic, nice to have provider/bedrock AWS Bedrock (boto3, IAM) comp/cli CLI entry point, hermes_cli/, setup wizard comp/agent Core agent loop, run_agent.py, prompt builder area/config Config system, migrations, profiles labels Apr 28, 2026
@alt-glitch

Collaborator

Related to #12979 (open, triaged) which implements full Bedrock Converse API support with cross-region inference profiles. Also related to closed #6845 and #7920 which attempted the same feature.

@teknium1

Contributor

Closing as superseded by #13814.

Triage notes (high confidence):
Bedrock is already a first-class inference provider on main: agent/bedrock_adapter.py, agent/transports/bedrock.py, plugins/model-providers/bedrock/, tests/agent/test_bedrock_*. Merged via #13814 'feat: add BedrockTransport + wire all Bedrock transport paths' (2026-04-22).

Thanks for the contribution — the underlying problem this PR addresses has been resolved by the linked PR on current main. If you believe this was closed in error, please comment and we'll reopen.

(Bulk-closed during a CLI PR triage sweep.)

@teknium1 teknium1 closed this May 24, 2026