Skip to content

feat(azure-foundry): add Azure AI Foundry provider with auto-detection#15845

Merged
teknium1 merged 9 commits into
mainfrom
hermes/hermes-ee57731c
Apr 26, 2026
Merged

feat(azure-foundry): add Azure AI Foundry provider with auto-detection#15845
teknium1 merged 9 commits into
mainfrom
hermes/hermes-ee57731c

Conversation

@teknium1

Copy link
Copy Markdown
Contributor

Summary

Azure AI Foundry (OpenAI-style and Anthropic-style endpoints) now works end-to-end. The setup wizard probes your endpoint, auto-detects the transport, prefills a picker with deployed model IDs, and resolves the context length automatically.

Salvages 4 stale Azure PRs onto current main with contributor authorship preserved. Closes: #9029 (TechPrototyper), #4599 (HangGlidersRule), #10086 (akhater), #8766 (pein892).

Changes

  • hermes_cli/azure_detect.py (new) — path sniff + /models probe + Anthropic Messages fallback, wraps existing get_model_context_length for context resolution.
  • hermes_cli/main.py::_model_flow_azure_foundry — rewritten wizard: URL → key → auto-probe → model picker.
  • hermes_cli/runtime_provider.py — new _resolve_azure_foundry_runtime helper reused from both explicit-args and top-level paths, fixes /v1 strip regression when only config.yaml is set.
  • run_agent.py_is_azure_openai_url() helper; gpt-5.x on openai.azure.com stays on chat_completions (no Responses API), uses max_completion_tokens, also applied to fallback-provider api_mode picker.
  • agent/anthropic_adapter.py — Azure endpoints: api-version=2025-04-15 passed via default_query (prevents malformed /anthropic?api-version=.../v1/messages). Also fixed a NameError from fix: Azure AI Foundry / Azure Anthropic endpoint compatibility #4599 where _is_azure_endpoint was referenced but never defined.
  • agent/auxiliary_client.py, run_agent.py_extract_url_query_params() so SDK URL-joining doesn't drop ?api-version=... from custom base URLs (benefits all custom endpoints, not just Azure).
  • Provider registry: azure-foundry added to PROVIDER_REGISTRY, CANONICAL_PROVIDERS, HERMES_OVERLAYS, _PROVIDER_MODELS, OPTIONAL_ENV_VARS (AZURE_FOUNDRY_API_KEY, AZURE_FOUNDRY_BASE_URL).
  • Docs: new website/docs/guides/azure-foundry.md, env var reference, CLI --provider choices, auxiliary provider list, sidebar entry.
  • scripts/release.py AUTHOR_MAP entries for TechPrototyper, HangGlidersRule (noreply form), pein892.

Auto-detection

Trigger Result
URL path ends in /anthropic anthropic_messages, no HTTP call
GET <base>/models → 200 + OpenAI shape chat_completions, picker prefilled with deployment IDs
Anthropic Messages probe succeeds anthropic_messages
Every probe fails Manual API-mode picker + free-text deployment name

Azure has no pure-API-key deployment-listing endpoint (Microsoft confirms: enumeration requires ARM management auth). Azure OpenAI v1 endpoints do return the resource's available-model catalog via GET /models, which is good enough for picker prefill in the common case.

Validation

Suite Result
tests/hermes_cli/test_azure_detect.py (new) 19/19 pass
tests/hermes_cli/test_runtime_provider_resolution.py::TestAzureFoundryResolution (new) 4/4 pass
tests/run_agent/test_run_agent.py::TestAzureOpenAIRouting (new) 3/3 pass
tests/run_agent/test_run_agent.py::TestMaxTokensParam 5/5 pass (1 new)
tests/hermes_cli/test_runtime_provider_resolution.py (full) 78/78 pass
Anthropic adapter suite (-k anthropic) 245/245 pass
Auxiliary client suite (-k auxiliary) 172/172 pass

Live E2E against a local HTTP server returning an OpenAI-shaped /models response: azure_detect.detect() correctly identified chat_completions, extracted 3 deployment IDs, and returned None on bad-key (wizard falls back to manual). Runtime resolver stripped /v1 from Anthropic-style URLs end-to-end with a real config.yaml + env var.

Contributors salvaged

Authorship preserved per-commit. Will merge with --rebase so each contributor's commit shows their name in git history.

What this does NOT include

#6616 ([SYSTEM:[IMPORTANT: to dodge Azure content filter) is intentionally not salvaged. That's an Azure content-filter config issue for the user to resolve, not something to work around globally by neutering a prompt tag every other user relies on.

TechPrototyper and others added 9 commits April 25, 2026 18:42
…tion

Add support for Azure Foundry as a new inference provider. Azure Foundry
endpoints can use either OpenAI-style (/v1/chat/completions) or
Anthropic-style (/v1/messages) API formats.

Changes:
- Add azure-foundry to PROVIDER_REGISTRY (auth.py)
- Add azure-foundry overlay in HERMES_OVERLAYS (providers.py)
- Add empty model list for azure-foundry (models.py)
- Add _model_flow_azure_foundry() interactive setup (main.py)
- Add azure-foundry runtime resolution with api_mode support (runtime_provider.py)
- Add AZURE_FOUNDRY_API_KEY and AZURE_FOUNDRY_BASE_URL env vars (config.py)

Usage:
  hermes model -> More providers -> Azure Foundry

The setup wizard prompts for:
- Endpoint URL
- API format (OpenAI or Anthropic-style)
- API key
- Model name

Configuration is saved to config.yaml (model.provider, model.base_url,
model.api_mode, model.default) and ~/.hermes/.env (AZURE_FOUNDRY_API_KEY).
…ss custom runtime when provider=anthropic + azure.com URL
…s ~/.claude/.credentials.json from overwriting Azure key mid-session
…as producing malformed URLs like /anthropic?api-version=.../v1/messages
Azure OpenAI requires an `api-version` query parameter on every request.
When users include it in the base_url (e.g. `?api-version=2025-04-01-preview`),
the OpenAI SDK silently drops it during URL construction, causing 404 errors.

Extract query params from base_url and pass them via `default_query` so the
SDK appends them to every request. This is a generic solution that works for
any custom endpoint requiring query parameters, not just Azure.

No-op for URLs without query params — fully backward compatible.
Azure OpenAI exposes an OpenAI-compatible endpoint at
`{resource}.openai.azure.com/openai/v1` that accepts the standard
`openai` Python client. Two issues prevented gpt-5.x models from working:

1. `_max_tokens_param()` only sent `max_completion_tokens` for
   `api.openai.com` URLs. Azure also requires `max_completion_tokens`
   for gpt-5.x models.

2. The `codex_responses` upgrade gate unconditionally upgraded gpt-5.x
   to Responses API. Azure does NOT support the Responses API — it serves
   gpt-5.x on the regular `/chat/completions` path, causing a 404.

Fix: add `_is_azure_openai_url()` that matches `openai.azure.com` URLs.
- `_max_tokens_param()` now returns `max_completion_tokens` for Azure.
- The `codex_responses` upgrade gate skips Azure so gpt-5.x stays on
  `chat_completions` where Azure actually serves it.
- The fallback-provider api_mode picker also recognises Azure and stays
  on chat_completions.
- Tests cover max_tokens routing, api_mode behaviour, and URL detection.

gpt-4.x models on Azure are unaffected (already used chat_completions +
max_tokens, which Azure accepts for those models).

Salvage of PR #10086 — rewritten against current main where the
codex_responses upgrade gate gained copilot-acp / explicit-api_mode
exclusions.
The azure-foundry wizard now probes the endpoint before asking the user
to pick anything by hand:

  1. URL path sniff — endpoints ending in /anthropic are Azure Foundry
     Claude routes and skip to anthropic_messages.
  2. GET <base>/models probe — if the endpoint returns an OpenAI-shaped
     model list, we switch to chat_completions and prefill the picker
     with the returned deployment/model IDs.
  3. Anthropic Messages probe — fallback for endpoints that don't expose
     /models but do speak the Anthropic Messages shape.
  4. Manual fallback — private endpoints / custom routes still work;
     the user picks API mode + types a deployment name.

Context length for the selected model is resolved through the existing
agent.model_metadata.get_model_context_length chain (models.dev,
provider metadata, hardcoded family fallbacks) and stored in
model.context_length when a non-default value is found.

Also refactors runtime_provider so Azure Foundry resolution is reused
between the explicit-credentials path and the default top-level path —
previously the /v1 strip for Anthropic-style Azure only ran when the
caller passed explicit_* args, which meant config-driven sessions
hit a double-/v1 URL.

New module hermes_cli/azure_detect.py with 19 unit tests covering:
- path sniff, model ID extraction, probe fallbacks
- HTTP error handling (URLError, HTTPError)
- context-length lookup passthrough
- DEFAULT_FALLBACK_CONTEXT rejection

New runtime tests cover:
- OpenAI-style Azure Foundry
- Anthropic-style Azure Foundry with /v1 stripping
- Missing base_url / API key raising AuthError

Rationale: Microsoft confirms there's no pure-API-key endpoint to list
Azure deployments (that requires ARM management auth).  The v1 Azure
OpenAI endpoint does expose /models with the resource's available
model catalog, which is good enough for picker prefill in the common
case.  Users on private/gated endpoints fall through to manual entry.
- New website/docs/guides/azure-foundry.md covering both OpenAI-style
  and Anthropic-style endpoints, auto-detection behaviour, gpt-5.x
  routing, /v1 stripping, api-version query forwarding, and the
  provider: anthropic + Azure URL alternative setup.
- environment-variables.md picks up AZURE_FOUNDRY_API_KEY,
  AZURE_FOUNDRY_BASE_URL, AZURE_ANTHROPIC_KEY.
- cli-commands.md includes azure-foundry in the provider choices list.
- configuration.md lists azure-foundry among auxiliary-task providers.
- sidebars.ts wires the new guide into the Guides section.
- scripts/release.py AUTHOR_MAP entries for TechPrototyper,
  HangGlidersRule (noreply), and pein892 so the contributor-attribution
  CI check does not reject the salvage.
@teknium1 teknium1 merged commit 7c50ed7 into main Apr 26, 2026
11 of 12 checks passed
@teknium1 teknium1 deleted the hermes/hermes-ee57731c branch April 26, 2026 01:48
@alt-glitch alt-glitch added type/feature New feature or request P2 Medium — degraded but workaround exists comp/agent Core agent loop, run_agent.py, prompt builder comp/cli CLI entry point, hermes_cli/, setup wizard area/config Config system, migrations, profiles labels Apr 26, 2026
turbo998 added a commit to turbo998/hermes-agent that referenced this pull request May 9, 2026
Closes parity gap between Azure Foundry (NousResearch#15845) and AWS Bedrock (NousResearch#10549).
Adds the operational polish layer Bedrock established as the bar:

- Azure AI Content Safety client + guardrails block (mirrors Bedrock shape)
- hermes doctor Azure Foundry section
- hermes auth Azure Foundry status row
- Error-classifier patterns: ResponsibleAIPolicyViolation, content_filter,
  Azure 429 retry-after, DeploymentNotFound
- /usage pricing for 7 Azure Foundry models the wizard prefills
- pyproject [azure] optional extra (provider works without it)
- Docs: Content Safety guardrails section in azure-foundry.md
- ~70 new tests across 6 files

Signed-off-by: Chen Qi <turbo998@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area/config Config system, migrations, profiles comp/agent Core agent loop, run_agent.py, prompt builder comp/cli CLI entry point, hermes_cli/, setup wizard P2 Medium — degraded but workaround exists type/feature New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants