feat: official xAI/Grok model support with native 2M/1M context lengths#6238
Closed
Julientalbot wants to merge 1 commit into
…1 with 1M)
- Add accurate context lengths in DEFAULT_CONTEXT_LENGTHS
- Register official xAI model IDs in hermes_cli/models.py
- Add api.x.ai → xai provider mapping in _URL_TO_PROVIDER
- Enables proper context window, reasoning flags, and custom provider routing
This matches the models already migrated in Julien's environment (grok-4.20-0309-reasoning and grok-4-1-fast-reasoning).
Contributor
Author
made a mistake
Contributor
Author
Self-review:
Happy to adjust the mapping style or add more test cases if desired.
Julientalbot pushed a commit to Julientalbot/hermes-agent that referenced this pull request Apr 10, 2026
xAI is listed in models.dev (`id: xai`, `env: [XAI_API_KEY]`) but without
an `api` field, so `get_provider("xai")` returned a ProviderDef with
an empty base_url. Users had to configure xAI via `provider: custom`
and manually set `base_url: https://api.x.ai/v1`, losing the ergonomics
of a first-class provider (no `hermes model --provider xai`, no
auto-detection from `XAI_API_KEY`, no label, etc.).
Add xAI as a native api_key provider alongside zai, kimi-coding, minimax,
deepseek, and friends:
- `hermes_cli/auth.py` — register PROVIDER_REGISTRY["xai"] with
inference_base_url=https://api.x.ai/v1, api_key_env_vars=(XAI_API_KEY,),
base_url_env_var=XAI_BASE_URL.
- `hermes_cli/providers.py` — add HERMES_OVERLAYS["xai"] with
base_url_override, register `x-ai` and `x.ai` as aliases, and add
"xAI" to the LABELS dict.
- `hermes_cli/models.py` — populate _PROVIDER_MODELS["xai"] with the
11 public Grok text models (grok-4.20-*, grok-4-1-fast-*, grok-4-fast-*,
grok-4, grok-4-0709, grok-code-fast-1, grok-3, grok-3-mini) so that
`hermes model --provider xai` and `curated_models_for_provider("xai")`
return a usable catalog.
- `agent/model_metadata.py` — add `api.x.ai` to _URL_TO_PROVIDER so that
users who keep the legacy `provider: custom` + `base_url: https://api.x.ai/v1`
configuration still get correct context-length resolution via models.dev
(`_infer_provider_from_url` resolves to "xai" and lookup_models_dev_context
returns the real 2M/256k/131k values per model).
- Tests: add `xai` to the existing TestProviderRegistry.test_provider_registered
parametrize, and add a dedicated test_xai_env_vars assertion pinning
the env var names and default base URL.
After this change, users on xAI direct can configure:
    model:
      default: grok-4.20-0309-reasoning
      provider: xai
instead of the current workaround:
    model:
      default: grok-4.20-0309-reasoning
      provider: custom
      base_url: https://api.x.ai/v1
Auto-detection from XAI_API_KEY, `hermes model` catalog listing, and
credential-pool integration all work as they do for the other api_key
providers. Complements the existing xAI integration (`x-grok-conv-id`
prompt caching, `"grok"` in TOOL_USE_ENFORCEMENT_MODELS) and
the context-length fallbacks in NousResearch#7039. Supersedes the earlier draft
PR NousResearch#6238 which proposed a narrower and incomplete fix.
21/21 tests pass on TestProviderRegistry.
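As a rough illustration of the registry entry and auto-detection described in this commit message, the sketch below models a provider record with the three fields the message names (`inference_base_url`, `api_key_env_vars`, `base_url_env_var`). The `ProviderConfig` class and the `resolve_base_url` and `detect_provider` helpers are hypothetical names introduced here; the real `hermes_cli/auth.py` may be structured differently:

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional, Tuple


@dataclass(frozen=True)
class ProviderConfig:
    """Hypothetical record type; field names follow the commit message."""
    inference_base_url: str
    api_key_env_vars: Tuple[str, ...]
    base_url_env_var: str


# Values for "xai" are the ones quoted in the commit message.
PROVIDER_REGISTRY = {
    "xai": ProviderConfig(
        inference_base_url="https://api.x.ai/v1",
        api_key_env_vars=("XAI_API_KEY",),
        base_url_env_var="XAI_BASE_URL",
    ),
}


def resolve_base_url(provider: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Prefer the base-URL env var override, else the registry default."""
    env = os.environ if env is None else env
    cfg = PROVIDER_REGISTRY[provider]
    return env.get(cfg.base_url_env_var) or cfg.inference_base_url


def detect_provider(env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Return the first provider whose API key env var is set and non-empty."""
    env = os.environ if env is None else env
    for name, cfg in PROVIDER_REGISTRY.items():
        if any(env.get(var) for var in cfg.api_key_env_vars):
            return name
    return None
```

Passing `env` explicitly keeps the helpers testable without mutating the process environment, for example `detect_provider({"XAI_API_KEY": "..."})`.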
Contributor
Author
Superseded by #7050 (feat: native xAI provider) and #7039 (fix: xAI Grok context length fallbacks). Closing this draft because:
The combined pair of #7039 + #7050 delivers everything this PR intended plus first-class provider support.
Description
Adds first-class official support for the xAI/Grok models currently used in production:
- `xai/grok-4.20-0309-reasoning` → 2M tokens
- `xai/grok-4-1-fast-reasoning` → 1M tokens
Changes
`agent/model_metadata.py` `api.x.ai` `grok-4.20`, `grok-4`, etc.)
Motivation
Eliminates the need for manual `context_length` overrides in `config.yaml`. Makes behavior consistent with other official providers.
Tested locally on the main instance, Ava profile, delegation, and fallback.
Validation
- `get_model_context_length()` returns correct values
- `test_model_metadata.py` pass
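A minimal sketch of the lookup this description relies on, assuming a plain dictionary keyed by model ID. `DEFAULT_CONTEXT_LENGTHS` and `get_model_context_length` are named in the PR, but the signature, the `override` parameter, the fallback value, and the exact token counts (2M and 1M written here as round numbers) are assumptions for illustration:

```python
from typing import Optional

# Round-number stand-ins for the "2M" and "1M" figures in the description;
# the exact token counts are not given in the PR text.
DEFAULT_CONTEXT_LENGTHS = {
    "grok-4.20-0309-reasoning": 2_000_000,
    "grok-4-1-fast-reasoning": 1_000_000,
}

# Hypothetical default for models missing from the table.
FALLBACK_CONTEXT_LENGTH = 128_000


def get_model_context_length(model: str, override: Optional[int] = None) -> int:
    """Resolve a context length: explicit override first, then the table."""
    if override is not None:
        # Corresponds to a manual context_length set in config.yaml.
        return override
    # Drop an optional "provider/" prefix such as "xai/".
    name = model.split("/", 1)[-1]
    return DEFAULT_CONTEXT_LENGTHS.get(name, FALLBACK_CONTEXT_LENGTH)
```

With table entries in place, a configuration can omit the manual override and still get the model's native window.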