feat: native AWS Bedrock provider via Converse API#7920

Closed
JiaDe-Wu wants to merge 4 commits into NousResearch:main from JiaDe-Wu:feat/native-aws-bedrock-provider

@JiaDe-Wu JiaDe-Wu commented Apr 11, 2026

Contributor

feat: native AWS Bedrock provider via Converse API

Summary

Adds Amazon Bedrock as a first-class provider using the native Converse API (not the OpenAI-compatible endpoint). This gives Hermes users direct access to the full Bedrock ecosystem: IAM authentication, Guardrails, cross-region inference profiles, dynamic model discovery, and all foundation models — with zero API key management on AWS compute.

Reference implementation: OpenClaw's extensions/amazon-bedrock/ plugin, which implements the same Converse API integration in TypeScript via @aws-sdk/client-bedrock.

Motivation

Bedrock is the primary inference platform for AWS-native teams. The existing workaround (OpenAI-compatible endpoint via bedrock-mantle) requires managing Bedrock API keys and misses ecosystem features like Guardrails, inference profiles, and IAM role authentication. A native integration removes these friction points and serves the large AWS user base properly.

This PR was developed with reference to OpenClaw PR #62673, which addressed the same class of cloud credential passthrough issues for Bedrock on EC2/ECS/Lambda.

What's included

Core adapter (agent/bedrock_adapter.py — 1,020 lines)

  • Converse API integration: build_converse_kwargs(), call_converse(), call_converse_stream() with full message/tool format conversion between OpenAI and Converse formats
  • Real-time streaming: stream_converse_with_callbacks() with on_text_delta, on_tool_start, on_reasoning_delta, and on_interrupt_check callbacks, matching the Anthropic/chat_completions streaming UX
  • AWS credential detection: env vars (ACCESS_KEY, PROFILE, BEARER_TOKEN, CONTAINER_CREDENTIALS, WEB_IDENTITY) + boto3 IMDS fallback for EC2 instance roles, ECS task roles, Lambda execution roles
  • Dynamic model discovery: discover_bedrock_models() via ListFoundationModels + ListInferenceProfiles with 1-hour cache, provider filtering, and deduplication
  • Error classification: classify_bedrock_error() for context overflow, throttling, and overload patterns
  • Context length table: BEDROCK_CONTEXT_LENGTHS with substring matching for versioned model IDs and inference profiles
  • Guardrails support: guardrail_config parameter threaded through the entire call chain
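
The OpenAI-to-Converse conversion described above can be sketched roughly as follows. This is a minimal illustration, not the adapter's actual code: the function names mirror the test class names in this PR but their bodies are assumptions, message content is assumed to be a plain string, and the single-space placeholder for empty content is a guess at how the "empty content placeholders" case might be handled.

```python
import json
from typing import Any


def convert_tools_to_converse(tools: list[dict[str, Any]]) -> dict[str, Any]:
    """Map OpenAI-style function tools onto a Converse toolConfig."""
    specs = []
    for tool in tools:
        fn = tool["function"]
        specs.append({
            "toolSpec": {
                "name": fn["name"],
                # Assumption: fall back to the name when no description is given.
                "description": fn.get("description") or fn["name"],
                "inputSchema": {
                    "json": fn.get("parameters") or {"type": "object", "properties": {}}
                },
            }
        })
    return {"tools": specs}


def convert_messages_to_converse(messages: list[dict[str, Any]]):
    """Return (system_blocks, converse_messages) from OpenAI-style messages."""
    system: list[dict[str, Any]] = []
    out: list[dict[str, Any]] = []
    for msg in messages:
        role = msg["role"]
        if role == "system":
            # Converse takes the system prompt as a separate top-level field.
            system.append({"text": msg["content"]})
            continue
        blocks: list[dict[str, Any]] = []
        if role == "tool":
            role = "user"  # tool results travel inside a user turn
            blocks.append({"toolResult": {
                "toolUseId": msg["tool_call_id"],
                "content": [{"text": str(msg.get("content") or "")}],
            }})
        else:
            if msg.get("content"):
                blocks.append({"text": msg["content"]})
            for call in msg.get("tool_calls") or []:
                blocks.append({"toolUse": {
                    "toolUseId": call["id"],
                    "name": call["function"]["name"],
                    "input": json.loads(call["function"]["arguments"] or "{}"),
                }})
        if not blocks:
            blocks.append({"text": " "})  # hypothetical placeholder for empty content
        if out and out[-1]["role"] == role:
            out[-1]["content"].extend(blocks)  # merge so roles keep alternating
        else:
            out.append({"role": role, "content": blocks})
    return system, out
```

Merging consecutive same-role turns matters because a tool result (sent as a user turn) is often followed directly by a real user message, and Converse expects user and assistant turns to alternate.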

Provider registration (6 files)

  • hermes_cli/auth.py: ProviderConfig with auth_type="aws_sdk", aliases (aws, aws-bedrock, amazon-bedrock, amazon), auto-detection via has_aws_credentials() in the resolve_provider("auto") path
  • hermes_cli/models.py: _PROVIDER_MODELS["bedrock"] static fallback list, _PROVIDER_LABELS, _PROVIDER_ALIASES, Bedrock-aware model validation in validate_requested_model()
  • hermes_cli/providers.py: ALIASES, TRANSPORT_TO_API_MODE["bedrock_converse"], _LABEL_OVERRIDES, URL heuristic in determine_api_mode()
  • hermes_cli/runtime_provider.py: Bedrock runtime resolution with region from config.yaml, guardrail config passthrough, explicit-vs-auto credential check distinction
  • hermes_cli/config.py: DEFAULT_CONFIG["bedrock"] section (region, discovery, guardrail), OPTIONAL_ENV_VARS for AWS_REGION/AWS_PROFILE, bedrock in fallback_model provider comments
  • hermes_cli/main.py: _model_flow_bedrock() with smart model filtering (excludes embedding/image/voice models), deduplication (prefers inference profiles over bare foundation IDs), recommended model sorting, region persistence to config.yaml
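
The filtering and deduplication step in the model picker could look something like the sketch below. It is only an illustration under stated assumptions: the record shape (`id`, `output_modalities`), the geo prefix list, and the rule that a profile maps to a foundation model by stripping that prefix are all hypothetical simplifications, and real profile IDs do not always line up with foundation IDs this neatly.

```python
from typing import Any

# Assumed set of cross-region inference profile prefixes.
GEO_PREFIXES = ("us.", "eu.", "apac.", "global.")


def base_model_id(model_id: str) -> str:
    """Strip a leading geo prefix, if present."""
    for prefix in GEO_PREFIXES:
        if model_id.startswith(prefix):
            return model_id[len(prefix):]
    return model_id


def is_inference_profile(model_id: str) -> bool:
    return base_model_id(model_id) != model_id


def filter_and_dedupe(models: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Keep text-generating models, preferring profiles over bare foundation IDs."""
    chosen: dict[str, dict[str, Any]] = {}
    for model in models:
        if "TEXT" not in model.get("output_modalities", []):
            continue  # drops embedding, image, and voice-only models
        key = base_model_id(model["id"])
        current = chosen.get(key)
        if current is None or (
            is_inference_profile(model["id"]) and not is_inference_profile(current["id"])
        ):
            chosen[key] = model
    return list(chosen.values())
```

Preferring the profile entry reflects the observation later in this thread that many models reject bare foundation model IDs for on-demand invocation.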

Agent integration (run_agent.py)

  • api_mode whitelist updated to include "bedrock_converse"
  • Auto-detection from provider="bedrock" or bedrock-runtime in base URL
  • _build_api_kwargs() — Bedrock branch builds Converse-format kwargs directly
  • _interruptible_api_call() — Bedrock branch calls client.converse() via boto3
  • _interruptible_streaming_api_call() — Full streaming with converse_stream() + delta callbacks in a background thread with interrupt support
  • run_conversation() — Length truncation handling extended to bedrock_converse api_mode
  • Guardrail config loaded from config.yaml at init time
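
As a rough picture of the streaming branch, the loop below consumes Converse stream events and fires delta callbacks. It is a sketch, not the code in run_agent.py: the function name is hypothetical, the background thread is omitted, and the "interrupted" stop reason is an invented marker. The event keys (contentBlockStart, contentBlockDelta, messageStop) follow the ConverseStream response shape.

```python
from typing import Any, Callable, Iterable, Optional


def consume_converse_stream(
    events: Iterable[dict[str, Any]],
    on_text_delta: Optional[Callable[[str], None]] = None,
    on_tool_start: Optional[Callable[[str], None]] = None,
    on_reasoning_delta: Optional[Callable[[str], None]] = None,
    on_interrupt_check: Optional[Callable[[], bool]] = None,
) -> dict[str, Any]:
    """Accumulate text and tool calls from stream events, firing callbacks."""
    text_parts: list[str] = []
    tool_calls: dict[int, dict[str, str]] = {}  # keyed by contentBlockIndex
    stop_reason: Optional[str] = None

    for event in events:
        if on_interrupt_check and on_interrupt_check():
            stop_reason = "interrupted"  # hypothetical marker, not a Bedrock value
            break
        if "contentBlockStart" in event:
            start = event["contentBlockStart"]
            tool = start.get("start", {}).get("toolUse")
            if tool:
                tool_calls[start["contentBlockIndex"]] = {
                    "id": tool["toolUseId"], "name": tool["name"], "arguments": "",
                }
                if on_tool_start:
                    on_tool_start(tool["name"])
        elif "contentBlockDelta" in event:
            block = event["contentBlockDelta"]
            delta = block["delta"]
            if "text" in delta:
                text_parts.append(delta["text"])
                if on_text_delta:
                    on_text_delta(delta["text"])
            elif "toolUse" in delta:
                # Tool input arrives as partial JSON strings; no text callback fires.
                call = tool_calls.get(block["contentBlockIndex"])
                if call is not None:
                    call["arguments"] += delta["toolUse"].get("input", "")
            elif "reasoningContent" in delta:
                chunk = delta["reasoningContent"].get("text", "")
                if chunk and on_reasoning_delta:
                    on_reasoning_delta(chunk)
        elif "messageStop" in event:
            stop_reason = event["messageStop"].get("stopReason")

    return {
        "text": "".join(text_parts),
        "tool_calls": [tool_calls[i] for i in sorted(tool_calls)],
        "stop_reason": stop_reason,
    }
```

With boto3, the events iterable would come from the "stream" key of a bedrock-runtime client's converse_stream() response; here it is any iterable of dicts, which keeps the loop testable without AWS access.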

Ecosystem integration (5 files)

  • agent/error_classifier.py — Bedrock-specific patterns for ThrottlingException, ServiceQuotaExceededException, ModelNotReadyException, ModelTimeoutException, context overflow ValidationException
  • agent/model_metadata.py — Bedrock context length resolution step (after Anthropic, before models.dev)
  • agent/usage_pricing.py — Bedrock model pricing (Claude Opus/Sonnet/Haiku, Nova Pro/Lite/Micro) with substring matching for inference profile IDs
  • hermes_cli/doctor.py — Bedrock diagnostics: AWS credential detection, boto3 installation check, ListFoundationModels API health check with model count
  • hermes_cli/auth_commands.py: hermes auth displays AWS credential source, region, and IAM identity via sts:GetCallerIdentity
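
For the error patterns listed above, a classifier might be shaped like this sketch. The exception names come from the list; everything else is an assumption made for illustration, including the category labels, which exceptions count as "overloaded", and the message phrases treated as context overflow.

```python
import re

# Assumed phrases; the real patterns in the PR are not shown in this thread.
_CONTEXT_OVERFLOW = re.compile(
    r"input is too long|too many tokens|context (length|window)", re.IGNORECASE
)

_RATE_LIMIT_CODES = {"ThrottlingException", "ServiceQuotaExceededException"}
_OVERLOADED_CODES = {"ModelNotReadyException", "ModelTimeoutException"}


def classify_bedrock_error(code: str, message: str) -> str:
    """Return one of: rate_limit, overloaded, context_overflow, unknown."""
    if code in _RATE_LIMIT_CODES:
        return "rate_limit"
    if code in _OVERLOADED_CODES:
        return "overloaded"
    if code == "ValidationException" and _CONTEXT_OVERFLOW.search(message):
        return "context_overflow"
    return "unknown"
```

When the error is a botocore ClientError, the code and message are available as `exc.response["Error"]["Code"]` and `exc.response["Error"]["Message"]`, so the classifier itself can stay free of any boto3 import.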

Documentation

  • website/docs/guides/aws-bedrock.md — Full guide: prerequisites, quick start, config.yaml reference, Guardrails, model discovery, available models, /model switching, diagnostics, gateway, troubleshooting
  • website/docs/getting-started/quickstart.md — Bedrock added to provider table
  • BEDROCK_TESTING_GUIDE.md — Manual testing guide for reviewers

Packaging

  • pyproject.toml: bedrock = ["boto3>=1.35.0,<2"] optional dependency, added to [all] extra

Testing

Automated tests — 107 tests, all passing

tests/agent/test_bedrock_adapter.py (79 tests):

  • TestResolveAwsAuthEnvVar (8) — credential detection priority, IMDS fallback, whitespace handling
  • TestHasAwsCredentials (2) — positive/negative detection
  • TestResolveBedrocRegion (3) — region resolution priority
  • TestConvertToolsToConverse (4) — OpenAI → Converse tool format
  • TestConvertMessagesToConverse (10) — system prompt extraction, tool calls, tool results, role alternation merging, image content, empty content placeholders
  • TestNormalizeConverseResponse (6) — text, tool_use, multiple tools, stop reason mapping, empty content, tool_calls finish_reason override
  • TestNormalizeConverseStreamEvents (4) — text stream, tool stream, mixed, empty
  • TestStreamConverseWithCallbacks (5) — text delta callbacks, tool_use suppression, tool_start callback, interrupt, reasoning delta
  • TestBuildConverseKwargs (6) — basic, tools, temperature/top_p, guardrail, no-system, no-tools
  • TestDiscoverBedrockModels (8) — foundation models, inactive filter, non-streaming filter, provider filter, caching, inference profiles, global sort, API error handling
  • TestGuardrailConfig (3) — included, none, empty dict
  • TestBedrockErrorClassification (8) — context overflow (3 patterns), rate limit (2), overloaded (2), unknown
  • TestBedrockContextLength (7) — Claude, Nova, unknown default, inference profile, longest prefix
  • TestExtractProviderFromArn (3) + TestClientCache (1)
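
The context-length cases above (inference profile IDs, longest match, unknown default) suggest a lookup along these lines. This is a sketch only: the table entries and the default are illustrative placeholders, not the contents of BEDROCK_CONTEXT_LENGTHS, and the function name is hypothetical.

```python
# Illustrative values only; not the PR's actual table.
EXAMPLE_CONTEXT_LENGTHS = {
    "anthropic.claude": 200_000,
    "amazon.nova": 300_000,
    "amazon.nova-micro": 128_000,
}
DEFAULT_CONTEXT_LENGTH = 128_000  # assumed fallback


def get_context_length(model_id: str) -> int:
    """Return the value of the longest table key contained in model_id."""
    matches = [key for key in EXAMPLE_CONTEXT_LENGTHS if key in model_id]
    if not matches:
        return DEFAULT_CONTEXT_LENGTH
    return EXAMPLE_CONTEXT_LENGTHS[max(matches, key=len)]
```

Substring matching lets a versioned ID or a prefixed profile such as us.amazon.nova-micro-v1:0 resolve without its own entry, and picking the longest key keeps a specific entry from being shadowed by a shorter family-level one.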

tests/agent/test_bedrock_integration.py (28 tests):

  • TestProviderRegistry (4) — registration, auth_type, env vars, base_url_env_var
  • TestProviderAliases (4) — aws, aws-bedrock, amazon-bedrock, amazon
  • TestProviderLabels (1) + TestModelCatalog (3)
  • TestResolveProvider (4) — explicit, aliases, auto-detect with AWS credentials
  • TestRuntimeProvider (4) — resolution, default region, no-credentials-on-auto-detect raises, explicit-skips-check
  • TestProvidersModule (4) — aliases, transport mapping, URL heuristic, label override
  • TestErrorClassifierBedrock (2) — throttling, context overflow patterns
  • TestPackaging (2) — bedrock extra exists, in [all] extra

EC2 integration testing

All tests run on EC2 t3.medium (Amazon Linux 2023, Python 3.11.14, boto3 1.42.88) with IAM role hermes-bedrock-manual-test (AmazonBedrockFullAccess) in us-east-2.

Multi-model end-to-end via AIAgent.chat():

| Model | ID | Response | Time |
|---|---|---|---|
| Claude Sonnet 4.6 | us.anthropic.claude-sonnet-4-6 | "Four" | 1.3s |
| Amazon Nova Pro | us.amazon.nova-pro-v1:0 | "4" | 0.8s |
| DeepSeek V3.2 | deepseek.v3.2 | "4" | 0.7s |
| Llama 4 Scout 17B | us.meta.llama4-scout-17b-instruct-v1:0 | "Four." | 0.5s |

Streaming verification:

  • Claude Sonnet 4.6: 25 chunks, 21→80 tokens, 2.1s — text deltas fired correctly

Tool calling verification:

  • Claude Sonnet 4.6: get_weather({"city": "Tokyo"}) — tool_call ID, name, arguments all correct, finish_reason=tool_calls
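
The tool-call result above implies a normalization step from the Converse response shape back to an OpenAI-style one. A minimal sketch, assuming a flat result dict and a hypothetical function name; the stop-reason map covers only the common Converse values, and forcing finish_reason to tool_calls whenever tool calls are present mirrors the "tool_calls finish_reason override" test case.

```python
import json
from typing import Any

# Converse stopReason -> OpenAI-style finish_reason (partial map, assumed).
STOP_REASON_MAP = {
    "end_turn": "stop",
    "stop_sequence": "stop",
    "max_tokens": "length",
    "tool_use": "tool_calls",
}


def normalize_converse_response(response: dict[str, Any]) -> dict[str, Any]:
    """Flatten a Converse response into text, tool calls, and finish reason."""
    blocks = response["output"]["message"]["content"]
    text = "".join(block["text"] for block in blocks if "text" in block)
    tool_calls = [
        {
            "id": block["toolUse"]["toolUseId"],
            "type": "function",
            "function": {
                "name": block["toolUse"]["name"],
                # OpenAI-style arguments are a JSON string, not a dict.
                "arguments": json.dumps(block["toolUse"]["input"]),
            },
        }
        for block in blocks
        if "toolUse" in block
    ]
    finish_reason = STOP_REASON_MAP.get(response.get("stopReason", ""), "stop")
    if tool_calls:
        finish_reason = "tool_calls"
    usage = response.get("usage", {})
    return {
        "content": text,
        "tool_calls": tool_calls,
        "finish_reason": finish_reason,
        "prompt_tokens": usage.get("inputTokens", 0),
        "completion_tokens": usage.get("outputTokens", 0),
    }
```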

Guardrails verification:

  • Created guardrail zi4q4hqa4n4k (SEXUAL+HATE content filter) — requests with guardrailConfig succeed, guardrail parameters correctly injected into Converse API calls
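
To make the guardrail injection concrete, here is a sketch of how Converse keyword arguments might be assembled. The function name echoes build_converse_kwargs() from this PR, but the signature and defaults are assumptions; the request keys (modelId, inferenceConfig, toolConfig, guardrailConfig) are the Converse API's own.

```python
from typing import Any, Optional


def build_converse_kwargs(
    model_id: str,
    messages: list[dict[str, Any]],
    system: Optional[list[dict[str, Any]]] = None,
    tool_config: Optional[dict[str, Any]] = None,
    max_tokens: int = 4096,  # assumed default
    temperature: Optional[float] = None,
    top_p: Optional[float] = None,
    guardrail_config: Optional[dict[str, Any]] = None,
) -> dict[str, Any]:
    """Build kwargs for a bedrock-runtime converse() call."""
    inference: dict[str, Any] = {"maxTokens": max_tokens}
    if temperature is not None:
        inference["temperature"] = temperature
    if top_p is not None:
        inference["topP"] = top_p

    kwargs: dict[str, Any] = {
        "modelId": model_id,
        "messages": messages,
        "inferenceConfig": inference,
    }
    if system:
        kwargs["system"] = system
    if tool_config and tool_config.get("tools"):
        kwargs["toolConfig"] = tool_config
    if guardrail_config:  # None and {} both leave the key out
        kwargs["guardrailConfig"] = guardrail_config
    return kwargs
```

A call with the guardrail from the test run might pass `guardrail_config={"guardrailIdentifier": "zi4q4hqa4n4k", "guardrailVersion": "DRAFT"}`, where the version value is an assumption since the thread does not state which version was used.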

CLI verification:

  • hermes model → Bedrock selection → region picker → smart model list (filtered, deduplicated, recommended first)
  • hermes chat → interactive conversation with streaming output
  • /model us.amazon.nova-pro-v1:0 → mid-session model switch, validated via Bedrock discovery
  • /usage → token count and cost display
  • hermes doctor → ✓ AWS Bedrock (iam-role, us-east-2, 89 models)
  • hermes auth → displays IAM role ARN, region, identity

Gateway verification:

  • Feishu (Lark) gateway configured and tested — Bedrock provider works through the messaging gateway path with the same config.yaml configuration

IMDS credential detection:

  • EC2 instance role detected via boto3 botocore.session.get_session().get_credentials() fallback — no environment variables needed. This addresses the same class of issue as OpenClaw PR #62673.
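
The detection order described above (environment variables first, then the SDK's default chain) can be sketched as below. The environment variable names are the standard AWS ones, but the priority order, the function names, and the "sdk-default-chain" label are assumptions for illustration; the probe uses the botocore.session.get_session().get_credentials() call mentioned in the bullet.

```python
import os
from typing import Callable, Mapping, Optional

# Assumed priority order; the PR's actual order is not stated in this thread.
_ENV_PRIORITY = (
    "AWS_BEARER_TOKEN_BEDROCK",
    "AWS_ACCESS_KEY_ID",
    "AWS_PROFILE",
    "AWS_CONTAINER_CREDENTIALS_RELATIVE_URI",
    "AWS_WEB_IDENTITY_TOKEN_FILE",
)


def _botocore_has_credentials() -> bool:
    """Ask botocore's default chain (which includes IMDS) for credentials."""
    try:
        import botocore.session
    except ImportError:
        return False
    try:
        return botocore.session.get_session().get_credentials() is not None
    except Exception:
        return False


def resolve_aws_auth_source(
    environ: Optional[Mapping[str, str]] = None,
    sdk_probe: Callable[[], bool] = _botocore_has_credentials,
) -> Optional[str]:
    """Return the name of the first usable credential source, or None."""
    env = os.environ if environ is None else environ
    for name in _ENV_PRIORITY:
        if not env.get(name, "").strip():  # whitespace-only counts as unset
            continue
        if name == "AWS_ACCESS_KEY_ID" and not env.get("AWS_SECRET_ACCESS_KEY", "").strip():
            continue  # an access key without its secret is not usable
        return name
    return "sdk-default-chain" if sdk_probe() else None
```

Passing the environment and the probe as parameters is a testing convenience in this sketch; it lets the priority logic be checked without touching real credentials or the instance metadata service.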

Files changed

New files (3)

| File | Lines | Description |
|---|---|---|
| agent/bedrock_adapter.py | 1,020 | Core Bedrock Converse API adapter |
| tests/agent/test_bedrock_adapter.py | 1,100 | Unit tests (79 tests) |
| tests/agent/test_bedrock_integration.py | 280 | Integration tests (28 tests) |

Modified files (13)

| File | Description |
|---|---|
| run_agent.py | api_mode whitelist, init, _build_api_kwargs, streaming, non-streaming, length truncation |
| hermes_cli/auth.py | PROVIDER_REGISTRY entry, aliases, auto-detect |
| hermes_cli/models.py | Labels, aliases, model catalog, validate_requested_model |
| hermes_cli/providers.py | ALIASES, TRANSPORT_TO_API_MODE, URL heuristic, label |
| hermes_cli/config.py | DEFAULT_CONFIG bedrock section, OPTIONAL_ENV_VARS, fallback comments |
| hermes_cli/main.py | _model_flow_bedrock, provider picker, provider labels |
| hermes_cli/runtime_provider.py | Bedrock runtime resolution with guardrails |
| hermes_cli/doctor.py | Bedrock diagnostics |
| hermes_cli/auth_commands.py | AWS credential status display |
| agent/error_classifier.py | Bedrock error patterns |
| agent/model_metadata.py | Bedrock context length resolution |
| agent/usage_pricing.py | Bedrock model pricing + substring matching |
| pyproject.toml | [bedrock] optional dependency |

Documentation (2)

| File | Description |
|---|---|
| website/docs/guides/aws-bedrock.md | Full Bedrock provider guide |
| website/docs/getting-started/quickstart.md | Bedrock in provider table |

JiaDe-Wu added a commit to JiaDe-Wu/sample-hermes-agent-on-aws-with-bedrock that referenced this pull request Apr 11, 2026
…loyment

Deploy Hermes Agent on AWS using native Bedrock Converse API.
- CloudFormation template with VPC, IAM, SSM, optional VPC Endpoints
- Supports Claude, Nova, DeepSeek, Llama via cross-region inference profiles
- Zero API key management — IAM role authentication
- Tested end-to-end: 4 models, streaming, tool calling, gateway

Companion to hermes-agent PR #7920:
NousResearch/hermes-agent#7920
@JiaDe-Wu

Contributor Author

AWS Deployment Template & Related Resources

This PR now has a companion one-click AWS deployment project:

sample-hermes-agent-on-aws-with-bedrock — CloudFormation template that deploys Hermes Agent on EC2 with native Bedrock integration. Same approach as OpenClaw on AWS with Bedrock (which was accepted into aws-samples).

Resolves #3863 (Native AWS Bedrock provider support)

End-to-end CloudFormation test results

CloudFormation stack deployed in us-east-2 (~5 min). After SSM connect:

| Test | Result |
|---|---|
| hermes --version | Hermes Agent v0.8.0, Python 3.12.3 |
| hermes doctor | AWS Bedrock (iam-role, us-east-2, 126 models) |
| Claude Sonnet 4.6 | streaming response, 1.1s |
| Amazon Nova Pro | 0.4s |
| DeepSeek V3.2 | 0.7s |
| Llama 4 Scout 17B | 0.2s |

The deployment template currently pins to this PR branch. Once merged, it switches to official PyPI.

@JiaDe-Wu JiaDe-Wu force-pushed the feat/native-aws-bedrock-provider branch from 88a7020 to 4c260a8 Compare April 12, 2026 14:21
Add Amazon Bedrock as a first-class provider using the native Converse API
(not the OpenAI-compatible endpoint). This gives Hermes users direct access
to the full Bedrock ecosystem: IAM authentication, Guardrails, cross-region
inference profiles, dynamic model discovery, and all foundation models.

Core changes:
- agent/bedrock_adapter.py: Converse API adapter with streaming, tool calling,
  error classification, model discovery, context lengths, and Guardrails
- Provider registration across auth.py, models.py, providers.py, config.py
- Agent integration in run_agent.py (init, kwargs, streaming, non-streaming)
- hermes doctor diagnostics, hermes auth AWS credential display
- usage_pricing.py Bedrock model pricing
- error_classifier.py Bedrock-specific error patterns
- model_metadata.py Bedrock context length resolution
- Documentation: guides/aws-bedrock.md, quickstart provider table
- pyproject.toml: [bedrock] optional dependency (boto3)

Tested on EC2 (Python 3.11.14, boto3 1.42.88, IAM role) with:
- 107 automated tests (all passing)
- 4 models end-to-end (Claude Sonnet 4.6, Nova Pro, DeepSeek V3.2, Llama 4)
- Streaming, tool calling, Guardrails, /model switch, hermes doctor
- Feishu gateway integration verified

Reference: OpenClaw extensions/amazon-bedrock/ plugin

Users without AWS accounts can now use a Bedrock API Key from their admin.
hermes model -> Bedrock -> choose "Bedrock API Key" -> enter key -> chat.

Uses the OpenAI-compatible bedrock-mantle endpoint, no boto3 needed.
Existing IAM credential chain (Converse API) is unchanged.

Foundation model IDs (anthropic.claude-opus-4-6-20250514-v1:0) fail with
ValidationException on Bedrock — most models now require inference profile
IDs (us.anthropic.claude-opus-4-6-v1) for on-demand invocation.
JiaDe-Wu added a commit to JiaDe-Wu/sample-host-hermesagent-on-amazon-bedrock-agentcore that referenced this pull request Apr 14, 2026
… provider

Replace the anthropic.Anthropic -> AnthropicBedrock monkey-patch with
Hermes Agent native Bedrock provider (provider="bedrock"), which uses
the Converse API directly via boto3.

Benefits:
- Supports ALL Bedrock models (Claude, Nova, DeepSeek, Llama, Mistral)
  not just Claude (Anthropic SDK limitation)
- No monkey-patching — clean provider="bedrock" configuration
- Native streaming, tool calling, error classification, Guardrails
- Dynamic model discovery via ListFoundationModels

Changes:
- app/hermes/main.py: Delete monkey-patch, use provider="bedrock"
- bridge/contract.py: Default provider anthropic -> bedrock
- bridge/bedrock_provider.py: Simplified (Converse API handles routing)
- bridge/Dockerfile: Add [bedrock] extra for boto3
- cdk.json: Use inference profile ID as default model

Depends on: NousResearch/hermes-agent#7920 (native Bedrock provider PR)
Ref: https://github.com/JiaDe-Wu/sample-hermes-agent-on-aws-with-bedrock
@ptlally

ptlally commented Apr 14, 2026

Great work here guys! Quick note from testing: models that don’t support tool/function calling can pass discovery but currently cause a validation error -> retry loop -> failure (e.g. DeepSeek R1 via Bedrock).

Since Hermes is fundamentally tool-driven, this ends up being a pretty rough failure mode in practice.

In @richin13’s PR (#4346) we added a simple workaround where we catch that case, warn the user, strip tools, and continue. Not ideal long-term, but it avoids the hard failure.

Given that tool support doesn’t seem reliably exposed via the Bedrock APIs, this probably needs to be handled at runtime rather than during discovery.

Happy to port something similar here, or switch to failing fast with a clearer error, if that’s useful.
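
A runtime fallback of the kind described here (catch, warn, strip tools, continue) might be sketched as follows. This is not the code from #4346 or from this PR: the function name is hypothetical, and the predicate that decides whether an error means "tools unsupported" is left to the caller because the exact Bedrock error text is not given in this thread.

```python
import logging
from typing import Any, Callable

logger = logging.getLogger(__name__)


def call_with_tool_fallback(
    converse: Callable[..., dict[str, Any]],
    kwargs: dict[str, Any],
    is_tool_unsupported: Callable[[Exception], bool],
) -> dict[str, Any]:
    """Call converse(); on a tools-unsupported error, retry once without tools."""
    try:
        return converse(**kwargs)
    except Exception as exc:
        if "toolConfig" not in kwargs or not is_tool_unsupported(exc):
            raise
        logger.warning(
            "Model %s rejected tool use; retrying without tools.",
            kwargs.get("modelId"),
        )
        retry_kwargs = {k: v for k, v in kwargs.items() if k != "toolConfig"}
        return converse(**retry_kwargs)
```

One caveat with this approach: if the message history already contains toolUse or toolResult blocks, dropping toolConfig alone may not be enough, so a fuller version would likely need to strip those blocks as well.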

@renlon

renlon commented Apr 14, 2026

This is a thorough and well-tested implementation — the 107 tests, EC2 integration testing, documentation, Guardrails support, dynamic model discovery, and diagnostics are impressive. A few observations comparing the Converse API approach to using the Anthropic SDK's AnthropicBedrock class directly:

Streaming: addressed ✅

Unlike PR #8832, this PR implements converse_stream() with delta callbacks, which solves the biggest UX issue. The streaming architecture with on_text_delta/on_tool_start/on_interrupt_check looks solid.

Prompt caching: still missing

The Converse API doesn't support Anthropic's cache_control breakpoints. For multi-turn agent conversations with large system prompts + tool schemas, this means paying full input token cost on every turn instead of ~10% on cache hits. On a 30-turn conversation with a 10K token system prompt, that's roughly 270K wasted input tokens.

AnthropicBedrock inherits the full messages.create() interface including prompt caching — no adapter needed, it just works.

Reasoning/thinking: partially addressed

The PR surfaces reasoningContent from Converse stream events, which is good. However, the Converse API doesn't support Anthropic's thinking configuration parameters (thinking.type, thinking.budget_tokens). With AnthropicBedrock, you get full control over adaptive thinking budgets (low/medium/high/max) via the same thinking parameter as the native API.

Code size comparison

| | This PR (Converse) | AnthropicBedrock approach |
|---|---|---|
| New adapter file | 1,020 lines | 0 (reuses anthropic_adapter.py) |
| Total additions | 3,312 lines | ~190 lines |
| New api_mode | bedrock_converse | Reuses anthropic_messages |
| Message conversion | Full OpenAI→Converse translation | None needed |
| Tool schema translation | Full translation layer | None needed |
| Response normalization | Full Converse→OpenAI mapping | None needed |

The Converse API adapter (message conversion, tool translation, response normalization, streaming event processing) accounts for ~1,000 lines. With AnthropicBedrock, all of this is handled by the Anthropic SDK — the adapter file isn't needed at all because AnthropicBedrock.messages.create() accepts the exact same parameters as Anthropic.messages.create().

Missing Claude Code compatibility

Neither this PR nor #8832 supports the Claude Code env var conventions that many Bedrock users already have configured:

  • CLAUDE_CODE_USE_BEDROCK=1 — activation flag
  • ANTHROPIC_MODEL — primary model override
  • ANTHROPIC_SMALL_FAST_MODEL — auxiliary model
  • DISABLE_PROMPT_CACHING — cache toggle

Suggestion

A hybrid approach might be ideal: use AnthropicBedrock as the default transport for Claude models on Bedrock (preserving streaming, prompt caching, thinking budgets) and fall back to the Converse API only for non-Anthropic models (Nova, Llama, DeepSeek). This would get the best of both worlds — Claude features preserved, multi-model support retained. The model discovery, Guardrails, diagnostics, and testing infrastructure from this PR would carry over directly.

@kabo

kabo commented Apr 15, 2026

As this is something the community is wanting and is sitting and waiting for, could we get something, an MVP, out sooner rather than later, and do all the little improvements separately? Iterative approach, faster feedback from actual users, ensure we're addressing the important issues and not theoretical what-ifs. Beta launch if we're not sure? Just my 2 cents :)

@renlon

renlon commented Apr 15, 2026

I am running tests now and will get an MVP PR out shortly.

…AnthropicBedrock SDK (prompt caching, thinking budgets)
Non-Claude models -> Converse API (Nova, DeepSeek, Llama, Mistral)

Also fixes:
- Non-tool-calling models (DeepSeek R1) no longer crash (ptlally feedback)
- Empty text blocks filtered (issue NousResearch#9486)
- 130 tests (was 107)
@JiaDe-Wu

Contributor Author

Updated PR with dual-path architecture + bug fixes:

  1. Claude on Bedrock -> AnthropicBedrock SDK (prompt caching, thinking budgets, adaptive thinking). Non-Claude -> Converse API. Automatic routing by model ID.
  2. Non-tool-calling models (DeepSeek R1) no longer crash - tools stripped with warning (ptlally feedback).
  3. Empty text blocks filtered (issue 9486).

130 tests, all passing on EC2.
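
The "automatic routing by model ID" in item 1 could be as small as the sketch below. The matching rule (a substring check for "anthropic.claude") and the function name are assumptions; the two api_mode strings are the ones mentioned elsewhere in this thread, but whether the merged code returns exactly these values is not confirmed here.

```python
def pick_bedrock_api_mode(model_id: str) -> str:
    """Route Claude models to the Anthropic path, everything else to Converse."""
    if "anthropic.claude" in model_id.lower():
        return "anthropic_messages"  # served through the AnthropicBedrock SDK
    return "bedrock_converse"  # served through boto3 Converse
```

A substring check rather than a prefix check keeps region-prefixed inference profiles such as us.anthropic.claude-sonnet-4-6 on the Claude path.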

teknium1 pushed a commit that referenced this pull request Apr 15, 2026
Salvaged from PR #7920 by JiaDe-Wu — cherry-picked Bedrock-specific
additions onto current main, skipping stale-branch reverts (293 commits
behind).

Dual-path architecture:
  - Claude models → AnthropicBedrock SDK (prompt caching, thinking budgets)
  - Non-Claude models → Converse API via boto3 (Nova, DeepSeek, Llama, Mistral)

Includes:
  - Core adapter (agent/bedrock_adapter.py, 1098 lines)
  - Full provider registration (auth, models, providers, config, runtime, main)
  - IAM credential chain + Bedrock API Key auth modes
  - Dynamic model discovery via ListFoundationModels + ListInferenceProfiles
  - Streaming with delta callbacks, error classification, guardrails
  - hermes doctor + hermes auth integration
  - /usage pricing for 7 Bedrock models
  - 130 automated tests (79 unit + 28 integration + follow-up fixes)
  - Documentation (website/docs/guides/aws-bedrock.md)
  - boto3 optional dependency (pip install hermes-agent[bedrock])

Co-authored-by: JiaDe WU <40445668+JiaDe-Wu@users.noreply.github.com>
@teknium1

Contributor

Merged via PR #10549. Your Bedrock implementation was cherry-picked onto current main (your branch was 293 commits behind) with your authorship preserved in git log. The dual-path architecture, Converse API adapter, model discovery, guardrails, streaming, tests, and documentation all made it in. Two test assertions were adjusted to match our canonical provider naming convention (bedrock instead of amazon-bedrock). Excellent work — thank you @JiaDe-Wu!

Also thanks to @hheydaroff for the earlier PR #6845 which helped inform the direction.

@teknium1 teknium1 closed this Apr 15, 2026
kagura-agent pushed a commit to kagura-agent/hermes-agent that referenced this pull request Apr 16, 2026
@JiaDe-Wu

Contributor Author

@teknium1 Thank you for cherry-picking this and preserving the authorship — really appreciate it.

This was a fun one to build. Hermes is the first open-source agent I have seen that genuinely learns and improves across sessions, and getting it to work natively on AWS infrastructure felt like the right thing to do for the community. The Converse API adapter, the model discovery, the streaming callbacks — every piece was built because real AWS users need it.

Already submitted a small follow-up PR (#11093) fixing a few docs issues from the cherry-pick. Happy to keep contributing — there is more to do on the Bedrock side (dual-path AnthropicBedrock for Claude-specific features, cross-region failover, etc.).

Thanks to @ptlally, @renlon, @kabo, @hheydaroff, and everyone who tested and gave feedback. Open source at its best.

@ptlally

ptlally commented Apr 16, 2026

Indeed, @JiaDe-Wu, this was great work by you and all others involved. I'm stoked to see it moving across the finish line.

I also submitted a small follow-up PR (#10772) that fixes a minor issue where Anthropic models on Bedrock were having their IDs modified by normalization (dots to dashes) when they should have been passed through unchanged. Another small one that hopefully can be merged as-is.

Separately, I'm looking into an issue where inference profiles appear to have their input/output modalities defaulted to text only. I'm working on a fix that tries to detect them from the foundation model information already retrieved in discovery, but it is still a work in progress.

@ptlally

ptlally commented Apr 16, 2026

Ah, I actually might be mistaken, @JiaDe-Wu. Maybe you could confirm: are the input/output modalities in discover_bedrock_models() only used for model filtering? I may have misinterpreted that as something that tells the loop what kind of requests can or cannot be processed through a certain model.

JiaDe-Wu added a commit to JiaDe-Wu/hermes-agent that referenced this pull request Apr 16, 2026
… profiles

Inference profiles (us.*, global.*) were hardcoded to TEXT-only input/output
modalities. Claude profiles support IMAGE input but this was not reflected
in discovery results, potentially causing vision features to be excluded.

Now inherits modalities from the underlying foundation model via ARN
lookup, matching OpenClaw resolveInferenceProfiles() pattern.

Ref: PR NousResearch#7920 feedback from @ptlally
132 tests passing.
@JiaDe-Wu

Contributor Author

@ptlally Good catch, and you're right to question it. The modalities from discover_bedrock_models() are currently used for filtering in _model_flow_bedrock() (the /model command's model picker) — so it's not breaking inference itself, but it does affect which models show up when filtering by capability. For example, if someone filters for vision-capable models, Claude inference profiles would be excluded even though they support IMAGE input.

I just opened #11132 to fix this — it builds a foundation model lookup map and inherits modalities via ARN resolution when processing inference profiles. Small change, 2 new tests.
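
The lookup-map idea can be sketched as follows. This is an illustration rather than the code in #11132: the function name and output shape are hypothetical, and matching on the model ID at the end of each ARN (instead of the full ARN) is an assumption made here because a cross-region profile can reference the same model in several regions. The input field names follow the ListFoundationModels and ListInferenceProfiles response summaries.

```python
from typing import Any


def inherit_profile_modalities(
    foundation_models: list[dict[str, Any]],
    profiles: list[dict[str, Any]],
) -> list[dict[str, Any]]:
    """Give each inference profile the modalities of its underlying models."""
    by_id = {model["modelId"]: model for model in foundation_models}
    resolved = []
    for profile in profiles:
        inputs: list[str] = []
        outputs: list[str] = []
        for ref in profile.get("models", []):
            # The ARN ends in ".../foundation-model/<modelId>".
            model_id = ref.get("modelArn", "").rsplit("/", 1)[-1]
            base = by_id.get(model_id)
            if base is None:
                continue
            for modality in base.get("inputModalities", []):
                if modality not in inputs:
                    inputs.append(modality)
            for modality in base.get("outputModalities", []):
                if modality not in outputs:
                    outputs.append(modality)
        resolved.append({
            "id": profile["inferenceProfileId"],
            # Fall back to TEXT-only when no underlying model is found.
            "input_modalities": inputs or ["TEXT"],
            "output_modalities": outputs or ["TEXT"],
        })
    return resolved
```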

Also saw your #10772 for the ID normalization fix — nice find, that's a subtle one.

JiaDe-Wu pushed a commit to JiaDe-Wu/hermes-agent that referenced this pull request Apr 17, 2026
list_authenticated_providers() only checked API keys and auth stores,
missing Bedrock's aws_sdk auth type entirely. The /model command (no args)
would show 'No authenticated providers found' even with valid IAM
credentials (instance role, SSO, env vars).

Add has_aws_credentials() check in the CANONICAL_PROVIDERS loop so
Bedrock appears in the provider picker when AWS credentials are available.

2 new tests for detection/non-detection.

Ref: PR NousResearch#7920 colleague feedback
JiaDe-Wu pushed a commit to JiaDe-Wu/hermes-agent that referenced this pull request Apr 17, 2026
Bedrock's aws_sdk auth type was not handled by resolve_provider_client(),
causing 'No auxiliary LLM provider configured' warning on startup. Context
compression would fall back to dropping middle turns without summaries.

Changes:
- auxiliary_client.py: Add aws_sdk auth branch using AnthropicBedrock SDK
- auxiliary_client.py: Add preserve_dots param to _AnthropicCompletionsAdapter
  and AnthropicAuxiliaryClient (Bedrock model IDs use dots as separators)
- auxiliary_client.py: Register bedrock default aux model (Claude Haiku 4.5)
- 3 new tests for resolution, preserve_dots, and graceful failure

Ref: PR NousResearch#7920 colleague feedback
yetisnowman added a commit to yetisnowman/hermes-agent-fork that referenced this pull request Apr 17, 2026
…edrock model names

The bedrock provider was added in NousResearch#7920 but two edge cases were missed:

1. auxiliary_client.py resolve_provider_client() had no bedrock branch,
   so configuring auxiliary LLM with provider: bedrock produced a warning
   and no LLM-based compression/summarization.

2. _anthropic_preserve_dots() whitelist did not include 'bedrock'.
   Bedrock model IDs contain dots (e.g. us.anthropic.claude-opus-4-6-v1)
   which normalize_model_name() was converting to hyphens, producing
   invalid model identifiers and HTTP 400 errors.

Fixes both by adding a bedrock branch to resolve_provider_client() that
reuses build_anthropic_bedrock_client() with IAM auth, and adding
'bedrock' to the preserve_dots provider set.
ulasbilgen pushed a commit to ulasbilgen/hermes-adhd-agent that referenced this pull request May 1, 2026
aj-nt pushed a commit to aj-nt/hermes-agent that referenced this pull request May 1, 2026
02356abc pushed a commit to 02356abc/hermes-agent that referenced this pull request May 14, 2026
@JiaDe-Wu

Contributor Author

The inference-profile modality fix is now in a fresh PR rebased on main: #34359

gweeteve pushed a commit to gweeteve/hermes-agent that referenced this pull request Jun 2, 2026
Egavasyug pushed a commit to Egavasyug/hermes-agent that referenced this pull request Jun 10, 2026