
feat: add Chutes.ai as a first-class inference provider #56

Closed
taoleeh wants to merge 1 commit into NousResearch:main from taoleeh:feat/chutes-ai-provider

@taoleeh taoleeh commented Feb 26, 2026

Add Chutes.ai as a first-class inference provider

Summary

Adds Chutes.ai as a named inference provider alongside Nous Portal and OpenRouter. Chutes.ai is a serverless GPU inference platform with an OpenAI-compatible API and a large catalog of open-source models, including Hermes variants. It offers pay-per-use access to models such as DeepSeek, Llama, Qwen, and Mistral.

Since Chutes exposes a standard OpenAI-compatible API at https://llm.chutes.ai/v1, the integration is purely configuration and routing; no new dependencies are required.

Changes

hermes_constants.py

  • Add PROVIDER_CHUTES = "chutes" constant
  • Add CHUTES_BASE_URL = "https://llm.chutes.ai/v1" constant
  • Add CHUTES_API_KEY_ENV = "CHUTES_API_KEY" constant
  • Add "chutes" to the accepted values for HERMES_INFERENCE_PROVIDER
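A minimal sketch of how these constants might look, assuming the names listed above. The `VALID_PROVIDERS` tuple and the `validate_provider` helper are hypothetical, since the PR does not show how the accepted values for HERMES_INFERENCE_PROVIDER are actually stored or checked.

```python
# Constants named in the PR description.
PROVIDER_CHUTES = "chutes"
CHUTES_BASE_URL = "https://llm.chutes.ai/v1"
CHUTES_API_KEY_ENV = "CHUTES_API_KEY"

# Hypothetical: the real container for accepted provider names is not shown
# in the PR; a tuple is used here purely for illustration.
VALID_PROVIDERS = ("nous", "openrouter", PROVIDER_CHUTES)


def validate_provider(value: str) -> str:
    """Normalize a HERMES_INFERENCE_PROVIDER value or raise ValueError."""
    normalized = value.strip().lower()
    if normalized not in VALID_PROVIDERS:
        raise ValueError(
            f"unknown provider {value!r}; expected one of {VALID_PROVIDERS}"
        )
    return normalized
```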

model_tools.py

  • Add chutes branch in the provider dispatch logic alongside openrouter and nous
  • Add Chutes to the auto-detection chain (triggered when CHUTES_API_KEY is set)
  • Build the OpenAI-compatible client pointed at CHUTES_BASE_URL
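The dispatch and auto-detection behaviour described above could be sketched roughly as follows. `ClientSettings` and `resolve_chutes` are hypothetical names, and where Chutes sits in the auto-detection chain relative to Nous Portal and OpenRouter is not stated in the PR, so this sketch only covers the Chutes branch.

```python
from dataclasses import dataclass
from typing import Mapping, Optional

CHUTES_BASE_URL = "https://llm.chutes.ai/v1"


@dataclass(frozen=True)
class ClientSettings:
    """Hypothetical container for what an OpenAI-compatible client needs."""
    provider: str
    base_url: str
    api_key: str


def resolve_chutes(
    env: Mapping[str, str], explicit_provider: Optional[str] = None
) -> Optional[ClientSettings]:
    """Return Chutes settings, or None when another provider should be used."""
    requested = (
        explicit_provider or env.get("HERMES_INFERENCE_PROVIDER", "")
    ).strip().lower()
    api_key = env.get("CHUTES_API_KEY", "").strip()

    if requested == "chutes":
        # Explicit selection: a missing key is an error, not a silent fallback.
        if not api_key:
            raise RuntimeError("provider 'chutes' selected but CHUTES_API_KEY is not set")
        return ClientSettings("chutes", CHUTES_BASE_URL, api_key)

    if not requested and api_key:
        # Auto-detection: no provider requested, but a Chutes key is present.
        return ClientSettings("chutes", CHUTES_BASE_URL, api_key)

    return None
```

With the `openai` Python package, the resulting `base_url` and `api_key` could then be passed to `OpenAI(base_url=..., api_key=...)`; whether the PR builds its client exactly this way is not shown.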

hermes_cli/setup.py (or equivalent setup wizard file)

  • Add "Chutes.ai" as a selectable provider option in the interactive setup menu
  • Prompt for CHUTES_API_KEY and write it to ~/.hermes/.env
  • Prompt for a default model (e.g. NousResearch/Hermes-3-Llama-3.1-70B-Instruct)
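Writing the key to ~/.hermes/.env might look something like the sketch below. `upsert_env_var` is a hypothetical helper, not the wizard's actual code; it assumes a plain `NAME=value` dotenv layout and replaces an existing entry rather than appending a duplicate.

```python
from pathlib import Path


def upsert_env_var(env_path: Path, name: str, value: str) -> None:
    """Set NAME=value in a dotenv-style file, replacing any existing entry."""
    lines = env_path.read_text().splitlines() if env_path.exists() else []
    entry = f"{name}={value}"
    replaced = False
    for i, line in enumerate(lines):
        # Compare only the part before the first '=' so values are ignored;
        # commented-out entries such as "# NAME=..." do not match.
        if line.split("=", 1)[0].strip() == name:
            lines[i] = entry
            replaced = True
    if not replaced:
        lines.append(entry)
    env_path.parent.mkdir(parents=True, exist_ok=True)
    env_path.write_text("\n".join(lines) + "\n")
```

A wizard could call this as `upsert_env_var(Path.home() / ".hermes" / ".env", "CHUTES_API_KEY", key)` after prompting for the key.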

hermes_cli/model.py (or equivalent hermes model command file)

  • Add chutes to the provider selection menu
  • Fetch available models from GET https://llm.chutes.ai/v1/models using the API key
  • Write selected provider + model to ~/.hermes/config.yaml
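The model listing step can be illustrated with the standard library alone. This sketch assumes the endpoint returns the usual OpenAI-style list payload (`{"data": [{"id": ...}, ...]}`), which the PR implies by calling the API OpenAI-compatible but does not show. `parse_model_ids` and `fetch_chutes_models` are hypothetical names.

```python
import json
import urllib.request

CHUTES_BASE_URL = "https://llm.chutes.ai/v1"


def parse_model_ids(payload: dict) -> list[str]:
    """Extract sorted model IDs from an OpenAI-style /models response."""
    return sorted(item["id"] for item in payload.get("data", []) if "id" in item)


def fetch_chutes_models(api_key: str, timeout: float = 10.0) -> list[str]:
    """GET {CHUTES_BASE_URL}/models using standard Bearer authentication."""
    request = urllib.request.Request(
        f"{CHUTES_BASE_URL}/models",
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return parse_model_ids(json.load(response))
```

Keeping the parsing separate from the network call makes the menu logic testable without a live API key.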

hermes_cli/status.py / hermes_cli/doctor.py

  • Recognize provider: chutes in config
  • Check for CHUTES_API_KEY and surface a diagnostic error if missing
  • Display the resolved Chutes base URL in status output
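One possible shape for the diagnostic check is sketched below. `check_chutes` and its message wording are hypothetical; the PR only states that the commands recognize `provider: chutes`, report a missing key, and display the base URL.

```python
from typing import Mapping

CHUTES_BASE_URL = "https://llm.chutes.ai/v1"


def check_chutes(config: Mapping[str, str], env: Mapping[str, str]) -> list[str]:
    """Return diagnostic lines for a Chutes setup; empty if not configured."""
    if config.get("provider") != "chutes":
        return []
    lines = [f"Chutes base URL: {CHUTES_BASE_URL}"]
    if env.get("CHUTES_API_KEY", "").strip():
        lines.append("CHUTES_API_KEY: set")
    else:
        # Hypothetical wording for the diagnostic error.
        lines.append("ERROR: CHUTES_API_KEY is not set; add it to ~/.hermes/.env")
    return lines
```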

.env.example

  • Add CHUTES_API_KEY= entry with comment

README.md

  • Add Chutes.ai row to the Inference Providers table
  • Add CHUTES_API_KEY to the Environment Variables Reference table

Testing

# Set key and test auto-detection
export CHUTES_API_KEY=your_key_here
hermes chat -q "Hello"

# Explicit provider override
hermes chat --provider chutes -q "Hello"

# Interactive model switcher
hermes model

# Setup wizard
hermes setup

# Diagnostics
hermes status
hermes doctor

Notes

  • Chutes uses HuggingFace-style org/model slugs (e.g. NousResearch/Hermes-3-Llama-3.1-70B-Instruct), not OpenRouter-style provider/model identifiers
  • Authentication uses standard Authorization: Bearer <key> — no custom header logic needed
  • OPENROUTER_API_KEY is still recommended alongside Chutes for vision/MoA tools that use OpenRouter independently

- Add CHUTES_BASE_URL, CHUTES_API_KEY_ENV constants to hermes_constants.py
- Add 'chutes' explicit branch and auto-detect in resolve_provider() (auth.py)
- Fall back to CHUTES_API_KEY when base_url points at chutes.ai (run_agent.py)
- Add Chutes.ai option to setup wizard provider menu (setup.py)
- Add CHUTES_MODELS list to models.py
- Add CHUTES_API_KEY to .env.example
- Update README providers table and env vars reference

Chutes.ai is OpenAI-compatible at https://llm.chutes.ai/v1.
Uses HuggingFace-style model slugs (e.g. NousResearch/Hermes-3-Llama-3.1-70B-Instruct).
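The fallback to CHUTES_API_KEY when base_url points at chutes.ai could be sketched as below. `is_chutes_url` and `pick_api_key` are hypothetical helpers, not the code in run_agent.py; the sketch matches on the parsed hostname rather than a substring, so look-alike domains do not receive the key.

```python
from typing import Mapping, Optional
from urllib.parse import urlparse


def is_chutes_url(base_url: str) -> bool:
    """True when the URL's host is chutes.ai or one of its subdomains."""
    host = (urlparse(base_url).hostname or "").lower()
    return host == "chutes.ai" or host.endswith(".chutes.ai")


def pick_api_key(
    base_url: str, explicit_key: Optional[str], env: Mapping[str, str]
) -> Optional[str]:
    """Prefer an explicit key; otherwise fall back to CHUTES_API_KEY for Chutes URLs."""
    if explicit_key:
        return explicit_key
    if is_chutes_url(base_url):
        return env.get("CHUTES_API_KEY") or None
    return None
```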
@teknium1

Contributor

Since this is in your future PR, closing

@teknium1 teknium1 closed this Feb 28, 2026
sudo-yf pushed a commit to sudo-yf/hermes-agent that referenced this pull request Apr 5, 2026
…n-bridge

feat: CLI session bridge - read CLI sessions from agent SQLite store
sudo-yf pushed a commit to sudo-yf/hermes-agent that referenced this pull request Apr 5, 2026
…search#58)

The backend CLI session bridge (PR NousResearch#56) was complete but the frontend
never connected to it:

1. css class never applied -- el.className never included 'cli-session'
   so the gold border and 'cli' badge CSS was dead code. Fixed: append
   ' cli-session' when s.is_cli_session is true.

2. import never triggered -- click handler always called loadSession()
   directly, never POST /api/session/import_cli. Fixed: for CLI sessions,
   call import_cli first (idempotent -- safe to call on every click),
   then fall through to loadSession() which now finds the imported copy.

3. profile filter silently hid CLI sessions -- filter required
   s.profile === S.activeProfile, but CLI sessions may have profile=null
   if the SQLite DB has no profile column. Fixed: CLI sessions always
   pass the filter (s.is_cli_session || s.profile === S.activeProfile).

Co-authored-by: Nathan Esquenazi <nesquena@gmail.com>
PowerCreek referenced this pull request in TechDevGroup/hermes-agent May 22, 2026
…nt (#56)

Extends mcp_serve.py with 9 MCP tools wrapping devagentic's
/v1/canvas/* REST surface: canvas_list, canvas_open,
canvas_add_node, canvas_move_node, canvas_update_node,
canvas_delete_node, canvas_link_nodes, canvas_delete_edge,
canvas_search. Each tool delegates to the
plugins/devagentic-canvas/client.py module (shared with the slash
command surface from #55), wraps the response as a JSON string,
and returns {error: ...} on any failure — no MCP tool ever raises.
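The "never raises" contract described in this commit message can be illustrated with a small wrapper. `as_tool_result` is a hypothetical name, and this is only a sketch of the pattern, not the code in mcp_serve.py.

```python
import json
from typing import Any, Callable


def as_tool_result(call: Callable[..., Any], *args: Any, **kwargs: Any) -> str:
    """Run a client call and always return a JSON string, never raise."""
    try:
        # Serialization happens inside the try block, so a non-serializable
        # response is also reported as an error instead of propagating.
        return json.dumps(call(*args, **kwargs))
    except Exception as exc:
        return json.dumps({"error": str(exc)})
```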

The client picks up DEVAGENTIC_BASE_URL / DEVAGENTIC_API_KEY /
DEVAGENTIC_USER_ID from the MCP server's env (set in the host's
mcpServers config). With devagentic in DEVAGENTIC_TRUST_HEADER=1
mode, any non-empty API key + a resolved user_id is enough for
Claude Desktop / Cursor / Codex to drive the canvas.

docs/mcp-canvas.md documents the Claude Desktop wiring + each
tool's args + failure modes.

Co-authored-by: devagentic-dev <dev@devagentic.local>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
PowerCreek referenced this pull request in TechDevGroup/hermes-agent May 22, 2026
Mirrors the canvas-tools pattern (#56) for the just-shipped
devagentic-docs plugin (#12 / PRs #19+#20+#23). MCP-aware clients
(Claude Code, Cursor, Codex) can now writeDoc / searchDocs /
forkContext / decorateContext / renderContext against any
devagentic instance hermes is configured for.

New tools (7):
- doc_search(query, limit, tag)  → search_docs
- doc_write(content, tags, source) → write_doc
- doc_show(doc_id)               → get_doc
- fork_open(parent_id, goal, tags) → fork_context (auto-pins
                                     parent + threads goal)
- fork_decorate(ctx_id, key, value, weight) → decorate_context
- fork_get(ctx_id)               → get_context
- fork_render(ctx_id)            → render_context (envelope:
                                   {ctx_id, rendered})

Loader pattern: _resolve_docs_client() mirrors
_resolve_canvas_client at line 880 — file-path import of
plugins/devagentic-docs/client.py since the hyphenated dir
isn't a Python package.
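A file-path import of this kind is usually done with `importlib.util`. The sketch below shows the general technique under that assumption; `load_module_from_path` is a hypothetical helper, not the actual _resolve_docs_client() implementation.

```python
import importlib.util
from pathlib import Path
from types import ModuleType
from typing import Optional


def load_module_from_path(module_name: str, path: Path) -> Optional[ModuleType]:
    """Import a module from an explicit file path; None if the file is absent."""
    if not path.is_file():
        return None
    spec = importlib.util.spec_from_file_location(module_name, path)
    if spec is None or spec.loader is None:
        return None
    module = importlib.util.module_from_spec(spec)
    # Executes the file's top-level code inside the new module object.
    spec.loader.exec_module(module)
    return module
```

Because the directory name contains a hyphen, a regular `import plugins.devagentic-docs.client` statement is not valid Python syntax, which is why a path-based loader is needed.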

Failure semantics: every tool returns {"error": "<msg>"} JSON.
The plugin's last_error_text() pattern (introduced in #15)
threads through via a _reason() helper, so federated agents
see the same actionable hints CLI users get — e.g. on the
canonical devbox deployment (where /graphql isn't exposed,
see #21), they'd see "not found at <url>/graphql ..." instead
of an opaque None.

Out of scope per #24: fork_close + fork_pin depend on the
session-local $HERMES_HOME/docs-fork-active marker, which
doesn't translate to MCP's stateless tool model. fork_open
auto-tags with source:hermes-mcp so MCP-authored forks are
distinguishable from CLI-authored ones (source:hermes-cli).

Tests: 22 new (tests/test_mcp_docs.py) covering registration,
plugin-missing fallback, arg routing, last_error surfacing for
the #21 case, and the real-import smoke. All passing alongside
canvas MCP suite (14 existing) and the broader devagentic-
adjacent test set (159 total).

Closes #24.

2 participants