
[Feature]: Add Kimi Coding as a direct inference provider #966

@louisdevzz


Problem or Use Case

Moonshot AI offers two separate API surfaces for their Kimi models:

  1. Legacy Moonshot API (https://api.moonshot.ai/v1): general-purpose chat, accessed via keys from platform.moonshot.cn.
  2. Kimi Code API (https://api.kimi.com/coding/v1): coding-optimised endpoint with the latest kimi-coding and kimi-latest models, accessed via keys from platform.kimi.ai (prefixed sk-kimi-).

Currently, the kimi-coding provider in Hermes only points to api.moonshot.ai/v1, so users with Kimi Code keys (sk-kimi- prefix) get authentication failures when they try to use the provider. Additionally, the model list does not include the newer kimi-coding and kimi-latest model IDs offered by the Kimi Code endpoint.

Proposed Solution

1. Auto-detect endpoint from API key prefix

When KIMI_BASE_URL is not explicitly set, inspect the key:

  • Keys prefixed sk-kimi- → route to https://api.kimi.com/coding/v1
  • All other keys → route to https://api.moonshot.ai/v1 (legacy fallback)

This is transparent and requires no user action beyond pasting their key.

2. Extend the model list

Add the models available on the Kimi Code endpoint:

"kimi-coding": [
    "kimi-coding",       # latest coding-optimised model
    "kimi-latest",       # latest general model on kimi.com
    "kimi-k2.5",
    "kimi-k2-thinking",
    "kimi-k2-turbo-preview",
    "kimi-k2-0905-preview",
]

3. Update the setup wizard

Show both endpoints to the user, auto-detect which endpoint a just-entered key routes to, and confirm the endpoint in the success message:

✓ Kimi API key saved (Kimi Coding endpoint: https://api.kimi.com/coding/v1)
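
One possible way to build that confirmation line is sketched below. The function names are hypothetical, and the "legacy Moonshot endpoint" and "custom endpoint" labels are assumptions; only the Kimi Coding wording comes from the example message above.

```python
KIMI_CODE_BASE_URL = "https://api.kimi.com/coding/v1"
KIMI_LEGACY_BASE_URL = "https://api.moonshot.ai/v1"  # hypothetical constant name


def describe_kimi_endpoint(base_url: str) -> str:
    """Return a short human-readable label for a base URL (hypothetical helper)."""
    if base_url == KIMI_CODE_BASE_URL:
        return "Kimi Coding endpoint"
    if base_url == KIMI_LEGACY_BASE_URL:
        return "legacy Moonshot endpoint"  # assumed wording
    return "custom endpoint"               # assumed wording, e.g. KIMI_BASE_URL override


def kimi_key_saved_message(base_url: str) -> str:
    """Format the setup wizard's success line for the resolved endpoint."""
    return f"✓ Kimi API key saved ({describe_kimi_endpoint(base_url)}: {base_url})"
```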

4. Add context window metadata

Register correct context lengths in agent/model_metadata.py:

  • kimi-coding: 200,000 tokens
  • kimi-latest: 200,000 tokens
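
As a rough sketch, the two entries might look like this. The actual layout of agent/model_metadata.py is not shown in this issue, so the dictionary name and lookup helper below are hypothetical; only the model IDs and token counts come from the list above.

```python
from typing import Optional

# Hypothetical mapping; the real structure in agent/model_metadata.py may differ.
KIMI_CONTEXT_LENGTHS = {
    "kimi-coding": 200_000,
    "kimi-latest": 200_000,
}


def get_kimi_context_length(model_id: str,
                            default: Optional[int] = None) -> Optional[int]:
    """Look up a context length in tokens, or return default for unknown IDs."""
    return KIMI_CONTEXT_LENGTHS.get(model_id, default)
```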

Alternatives Considered

  • Require users to manually set KIMI_BASE_URL — This is the current state and it causes confusion. Auto-detection is strictly better UX with no downside.
  • Separate provider ID for the coding endpoint — Unnecessary complexity; the same kimi-coding provider ID can serve both endpoints transparently.

Affected Files

| File | Change |
| --- | --- |
| hermes_cli/auth.py | _resolve_kimi_base_url() helper + KIMI_CODE_BASE_URL constant; wire into resolve_api_key_provider_credentials() |
| hermes_cli/models.py | Add kimi-coding and kimi-latest to _PROVIDER_MODELS["kimi-coding"] |
| hermes_cli/setup.py | Show both endpoint URLs; feedback on key type in setup wizard |
| agent/model_metadata.py | Context lengths for kimi-coding (200k) and kimi-latest (200k) |

Feature Type

Configuration option

Scope

Medium (few files, < 300 lines)

Contribution

  • I'd like to implement this myself and submit a PR
