feat: official xAI/Grok model support with native 2M/1M context lengths #6238

Closed
Julientalbot wants to merge 1 commit into NousResearch:main from Julientalbot:feat/xai-official-model-support

@Julientalbot commented Apr 8, 2026

Description

Adds first-class official support for the xAI/Grok models currently used in production:

  • xai/grok-4.20-0309-reasoning: 2M tokens
  • xai/grok-4-1-fast-reasoning: 1M tokens

Changes

  • Added context length mappings in agent/model_metadata.py
  • Improved provider detection for api.x.ai
  • Cleaned up aliases and model name normalization (grok-4.20, grok-4, etc.)
  • Updated tests and comments

Motivation

Eliminates the need for manual context_length overrides in config.yaml. Makes behavior consistent with other official providers.

Tested locally on the main instance, Ava profile, delegation, and fallback.

Validation

  • get_model_context_length() returns correct values
  • All 75 tests in test_model_metadata.py pass
  • Verified in real conditions (2M context correctly detected)

…1 with 1M)

- Add accurate context lengths in DEFAULT_CONTEXT_LENGTHS
- Register official xAI model IDs in hermes_cli/models.py
- Add api.x.ai → xai provider mapping in _URL_TO_PROVIDER
- Enables proper context window, reasoning flags, and custom provider routing

This matches the models already migrated in Julien's environment (grok-4.20-0309-reasoning and grok-4-1-fast-reasoning).
@Julientalbot changed the title from "feat: official xAI / Grok model support (2M + 1M context)" to "feat: xAI / Grok model support (2M + 1M context)" on Apr 8, 2026
@Julientalbot

made a mistake

@Julientalbot deleted the feat/xai-official-model-support branch on April 8, 2026, 18:40
@Julientalbot restored the feat/xai-official-model-support branch on April 8, 2026, 18:46
@Julientalbot reopened this on Apr 8, 2026
@Julientalbot changed the title from "feat: xAI / Grok model support (2M + 1M context)" to "feat: official xAI/Grok model support with proper 2M/1M context" on Apr 8, 2026
@Julientalbot changed the title from "feat: official xAI/Grok model support with proper 2M/1M context" to "feat: official xAI/Grok model support with native 2M/1M context lengths" on Apr 8, 2026
@Julientalbot

Self-review:

  • Mapping intentionally done on short names ("grok-4.20") for robustness against future versions.
  • Kept "provider: custom" as fallback while adding explicit api.x.ai detection for smooth transition.
  • All unit tests (test_model_metadata.py) pass.
  • Real-world tested: get_model_context_length() correctly returns 2M for the reasoning model.

Happy to adjust the mapping style or add more test cases if desired.
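
For readers following the short-name approach described in the self-review, here is a minimal sketch of how such a lookup might behave. It is not the code from `agent/model_metadata.py`: the table name `SHORT_NAME_CONTEXT` and the function `lookup_by_short_name` are hypothetical, and the token counts assume "2M" means 2,000,000 and "1M" means 1,000,000.

```python
from typing import Optional

# Hypothetical fallback table mirroring the three short-name entries in this PR.
# Assumption: "2M" = 2_000_000 tokens and "1M" = 1_000_000 tokens.
SHORT_NAME_CONTEXT = {
    "grok-4.20": 2_000_000,
    "grok-4": 1_000_000,
    "grok": 1_000_000,
}


def lookup_by_short_name(model_id: str) -> Optional[int]:
    """Return the context length for the longest short name found in model_id."""
    name = model_id.lower()
    # Check longer keys first so "grok-4.20" wins over "grok-4" and "grok".
    for key in sorted(SHORT_NAME_CONTEXT, key=len, reverse=True):
        if key in name:
            return SHORT_NAME_CONTEXT[key]
    return None  # unknown model: the caller decides how to fall back
```

Matching on substrings means a future dated release such as a new `grok-4.20-*` ID would resolve without a table change, which is the robustness argument above. The trade-off is that every ID containing a short name inherits that entry's value, whether or not it is accurate for that model.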

Julientalbot pushed a commit to Julientalbot/hermes-agent that referenced this pull request Apr 10, 2026
xAI is listed in models.dev (`id: xai`, `env: [XAI_API_KEY]`) but without
an `api` field, so `get_provider("xai")` returned a ProviderDef with
an empty base_url. Users had to configure xAI via `provider: custom`
and manually set `base_url: https://api.x.ai/v1`, losing the ergonomics
of a first-class provider (no `hermes model --provider xai`, no
auto-detection from `XAI_API_KEY`, no label, etc.).

Add xAI as a native api_key provider alongside zai, kimi-coding, minimax,
deepseek, and friends:

- `hermes_cli/auth.py` — register PROVIDER_REGISTRY["xai"] with
  inference_base_url=https://api.x.ai/v1, api_key_env_vars=(XAI_API_KEY,),
  base_url_env_var=XAI_BASE_URL.
- `hermes_cli/providers.py` — add HERMES_OVERLAYS["xai"] with
  base_url_override, register `x-ai` and `x.ai` as aliases, and add
  "xAI" to the LABELS dict.
- `hermes_cli/models.py` — populate _PROVIDER_MODELS["xai"] with the
  11 public Grok text models (grok-4.20-*, grok-4-1-fast-*, grok-4-fast-*,
  grok-4, grok-4-0709, grok-code-fast-1, grok-3, grok-3-mini) so that
  `hermes model --provider xai` and `curated_models_for_provider("xai")`
  return a usable catalog.
- `agent/model_metadata.py` — add `api.x.ai` to _URL_TO_PROVIDER so that
  users who keep the legacy `provider: custom` + `base_url: https://api.x.ai/v1`
  configuration still get correct context-length resolution via models.dev
  (`_infer_provider_from_url` resolves to "xai" and lookup_models_dev_context
  returns the real 2M/256k/131k values per model).
- Tests: add `xai` to the existing TestProviderRegistry.test_provider_registered
  parametrize, and add a dedicated test_xai_env_vars assertion pinning
  the env var names and default base URL.
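
As an illustration of the registry entry described in the list above, the following is a rough sketch only. `ProviderConfig`, `ALIASES`, and `resolve_provider` are hypothetical stand-ins, not the real `hermes_cli` types; only the field names, env var names, aliases, and base URL are taken from this commit message.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional, Tuple


@dataclass(frozen=True)
class ProviderConfig:
    """Hypothetical stand-in for a registry entry; not the real hermes_cli class."""
    inference_base_url: str
    api_key_env_vars: Tuple[str, ...]
    base_url_env_var: str


PROVIDER_REGISTRY = {
    "xai": ProviderConfig(
        inference_base_url="https://api.x.ai/v1",
        api_key_env_vars=("XAI_API_KEY",),
        base_url_env_var="XAI_BASE_URL",
    ),
}

# Aliases named in the commit message, mapped to the canonical provider id.
ALIASES = {"x-ai": "xai", "x.ai": "xai"}


def resolve_provider(
    name: str, env: Optional[Mapping[str, str]] = None
) -> Tuple[str, str, Optional[str]]:
    """Return (canonical id, effective base URL, API key or None)."""
    env = os.environ if env is None else env
    canonical = ALIASES.get(name.lower(), name.lower())
    cfg = PROVIDER_REGISTRY[canonical]
    # A non-empty base-URL env var overrides the registered default.
    base_url = env.get(cfg.base_url_env_var) or cfg.inference_base_url
    # Take the first non-empty API key among the registered env vars.
    api_key = next((env[v] for v in cfg.api_key_env_vars if env.get(v)), None)
    return canonical, base_url, api_key
```

For example, `resolve_provider("x-ai", {"XAI_API_KEY": "k"})` would resolve to the `xai` entry with the default base URL under these assumptions.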

After this change, users on xAI direct can configure:

  model:
    default: grok-4.20-0309-reasoning
    provider: xai

instead of the current workaround:

  model:
    default: grok-4.20-0309-reasoning
    provider: custom
    base_url: https://api.x.ai/v1
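
To show why the two configurations above can end up treated alike for context-length purposes, here is a small sketch of host-based provider inference. It assumes the mapping is keyed on the URL hostname; `infer_provider_from_url` and `effective_provider` are illustrative names, not the project's actual `_infer_provider_from_url` implementation.

```python
from typing import Optional
from urllib.parse import urlparse

# Only the entry named in this commit message; the real table has more hosts.
_URL_TO_PROVIDER = {"api.x.ai": "xai"}


def infer_provider_from_url(base_url: str) -> Optional[str]:
    """Map a base URL to a provider id by hostname (assumed matching rule)."""
    host = (urlparse(base_url).hostname or "").lower()
    return _URL_TO_PROVIDER.get(host)


def effective_provider(model_cfg: dict) -> str:
    """Resolve the provider for a parsed `model:` config section."""
    provider = model_cfg.get("provider", "custom")
    if provider == "custom":
        # Legacy configs keep working: the base URL identifies the provider.
        return infer_provider_from_url(model_cfg.get("base_url", "")) or "custom"
    return provider
```

With this sketch, both `{"provider": "xai"}` and `{"provider": "custom", "base_url": "https://api.x.ai/v1"}` resolve to `"xai"`, while a custom endpoint on an unknown host stays `"custom"`.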

Auto-detection from XAI_API_KEY, `hermes model` catalog listing, and
credential-pool integration all work as they do for the other api_key
providers. Complements the existing xAI integration (`x-grok-conv-id`
prompt caching, `"grok"` in TOOL_USE_ENFORCEMENT_MODELS) and
the context-length fallbacks in NousResearch#7039. Supersedes the earlier draft
PR NousResearch#6238 which proposed a narrower and incomplete fix.

21/21 tests pass on TestProviderRegistry.
@Julientalbot

Superseded by #7050 (feat: native xAI provider) and #7039 (fix: xAI Grok context length fallbacks).

Closing this draft because:

  1. The three hardcoded context-length entries (grok-4.20: 2M, grok-4: 1M, grok: 1M) are incomplete and have incorrect values: grok-4 / grok-4-0709 actually has 256k context, not 1M, and the catch-all 1M for bare grok would be wrong for grok-3/grok-2 variants. #7039 (fix(model_metadata): add xAI Grok context length fallbacks) adds 9 precise entries covering the full Grok family with values sourced from models.dev.
  2. The xai/grok-4.20-0309-reasoning entry in hermes_cli/models.py was added to a list that is scoped to OpenRouter-prefixed model IDs, which does not match how xAI serves these models. #7050 (feat(providers): add native xAI provider) adds a proper _PROVIDER_MODELS["xai"] entry with the 11 bare Grok IDs.
  3. The "api.x.ai": "xai" addition to _URL_TO_PROVIDER (the one useful line from this PR) has been integrated into #7050 (feat(providers): add native xAI provider).

The combined pair of #7039 + #7050 delivers everything this PR intended plus first-class provider support.
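
The first point can be made concrete with a small sketch. It is an illustration only: the tables and helper names are hypothetical, the token counts assume 1M = 1,000,000 and 256k = 256,000, and the exact-match table lists only the value stated in this thread rather than the 9 entries from #7039.

```python
from typing import Dict, Optional

# The three entries from this PR, as described in point 1 (values assumed).
PR_TABLE = {"grok-4.20": 2_000_000, "grok-4": 1_000_000, "grok": 1_000_000}

# Exact-ID table with no catch-all, using only the 256k figure quoted above.
EXACT_TABLE = {"grok-4": 256_000, "grok-4-0709": 256_000}


def resolve_substring(model_id: str, table: Dict[str, int]) -> Optional[int]:
    """Longest-key-first substring match, as a short-name table would do."""
    name = model_id.lower()
    for key in sorted(table, key=len, reverse=True):
        if key in name:
            return table[key]
    return None


def resolve_exact(model_id: str, table: Dict[str, int]) -> Optional[int]:
    """Exact match on the bare model ID, dropping any "provider/" prefix."""
    return table.get(model_id.lower().split("/")[-1])


# The catch-all hands grok-4-0709 and grok-3-mini the same 1M value.
print(resolve_substring("grok-4-0709", PR_TABLE))
print(resolve_substring("grok-3-mini", PR_TABLE))
# Exact matching returns the stated value, or None for IDs it does not know.
print(resolve_exact("grok-4-0709", EXACT_TABLE))
print(resolve_exact("grok-3-mini", EXACT_TABLE))
```

Returning `None` for an unknown ID lets the caller fall through to another source instead of silently reporting a context window the model may not have.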
