fix(model_metadata): use /props endpoint at server root for llama.cpp by Linux2010 · Pull Request #13317 · NousResearch/hermes-agent

Linux2010 · 2026-04-21T04:43:51Z

What broke

When using llama.cpp as a provider, Hermes requested the /v1/props endpoint, which returned 404 because llama.cpp exposes /props at the server root (not under the /v1 prefix, per its httplib routes). Users saw 404 errors in their llama-server logs.

Root cause

The code comment incorrectly stated "llama.cpp exposes /v1/props (older builds used /props)". In fact, per llama.cpp's httplib routes (server.cpp line 174: ctx_http.get("/props", ...)), /props is served at the server root. The probe order was therefore reversed: /v1/props was tried before /props.

Why this fix is minimal

- Only changes endpoint order: try /props first (server root), then /v1/props as fallback
- Strips /v1 prefix from base_url to get server root for /props request
- Preserves fallback for alternative builds/configs that might use /v1 path
- No changes to response body parsing or other server type detection
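
The probe order and prefix stripping described in this list can be sketched roughly as follows. This is only an illustration, not the code in the PR: `props_candidate_urls` is a hypothetical helper name, and the host and port in the usage note are example values.

```python
from typing import List


def props_candidate_urls(base_url: str) -> List[str]:
    """Return the /props probe URLs in the order described above.

    Hypothetical helper for illustration; not the actual Hermes function.
    """
    base = base_url.rstrip("/")
    # Strip a trailing /v1 so the first probe targets the server root.
    root = base[: -len("/v1")] if base.endswith("/v1") else base
    # Server root first, then the /v1 path as a fallback.
    return [f"{root}/props", f"{root}/v1/props"]
```

Under these assumptions, a base URL of `http://localhost:8080/v1` would be probed at `http://localhost:8080/props` first and `http://localhost:8080/v1/props` second.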

What I tested

- Added regression tests for /props endpoint detection
- Tests verify correct URL construction with /v1 prefix stripping
- Tests cover fallback to /v1/props when /props returns 404
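
A regression test of this kind might look something like the sketch below. It is not the test code from the PR: `first_reachable_props` and `make_fake_get` are hypothetical names, and the HTTP call is replaced by an injected fake so the example runs without a server.

```python
from typing import Callable, Dict, List, Optional, Tuple

# A fetcher takes a URL and returns (status_code, parsed_body).
Fetcher = Callable[[str], Tuple[int, dict]]


def first_reachable_props(base_url: str, get: Fetcher) -> Optional[str]:
    """Probe /props at the server root, then /v1/props; return the first URL answering 200.

    Hypothetical stand-in for the detection logic, for illustration only.
    """
    base = base_url.rstrip("/")
    root = base[: -len("/v1")] if base.endswith("/v1") else base
    for url in (f"{root}/props", f"{root}/v1/props"):
        status, _body = get(url)
        if status == 200:
            return url
    return None


def make_fake_get(statuses: Dict[str, int]) -> Tuple[Fetcher, List[str]]:
    """Build a fake fetcher that records calls and answers 404 for unknown URLs."""
    calls: List[str] = []

    def fake_get(url: str) -> Tuple[int, dict]:
        calls.append(url)
        return statuses.get(url, 404), {}

    return fake_get, calls


def test_falls_back_to_v1_props_when_root_returns_404() -> None:
    fake_get, calls = make_fake_get({"http://host:8080/v1/props": 200})
    assert first_reachable_props("http://host:8080/v1", fake_get) == "http://host:8080/v1/props"
    # The server root must be tried before the /v1 fallback.
    assert calls == ["http://host:8080/props", "http://host:8080/v1/props"]
```

Injecting the fetcher keeps the sketch independent of any particular HTTP client; the real tests may well patch the client instead.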

What I intentionally did not change

- Response body parsing (default_generation_settings check unchanged)
- Other server type detection (LM Studio, Ollama, vLLM unchanged)
- Error handling and fallback logic structure
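
For context, the unchanged body check mentioned in the first item could be as simple as the sketch below. The PR text does not show the actual check, so this assumes it only tests for the presence of a top-level `default_generation_settings` key; `looks_like_llamacpp_props` is a hypothetical name.

```python
import json


def looks_like_llamacpp_props(body: str) -> bool:
    """Guess whether a /props response body came from llama.cpp.

    Assumption: a top-level "default_generation_settings" key is the signal.
    """
    try:
        data = json.loads(body)
    except (ValueError, TypeError):
        # Not valid JSON (or not a string at all), so not a llama.cpp /props body.
        return False
    return isinstance(data, dict) and "default_generation_settings" in data
```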

Fixes #13091

alt-glitch · 2026-04-22T08:27:40Z

Likely duplicate of #13539 — same fix (swap /props and /v1/props order in llama.cpp detection). Both close #13091.

…ports

The _handle_editor_command() method used subprocess.call() but did not import subprocess, causing AttributeError at runtime. subprocess is not globally imported in gateway/run.py. Other uses of subprocess in this file use local imports (line 1289), so the editor command should follow the same pattern.

- Added: import subprocess (local import, matching file pattern)
- Removed: redundant local imports for asyncio, shlex, tempfile (already imported globally at lines 16, 21, 24)

Python syntax validation (py_compile)

…_PREFIX (NousResearch#14603)

The SUMMARY_PREFIX previously instructed the model to 'resume exactly from there' the ## Active Task section of a compressed summary. This caused cross-session task injection where the AI would interpret a previous session's active task as its own current job.

Fix: Replace the 'resume' instruction with explicit language that the ## Active Task section is 'historical context only, NOT as an active instruction to execute'.

- agent/context_compressor.py: +3 lines, -2 lines
- tests/agent/test_context_compressor.py: +29 lines (3 regression tests)

llama.cpp exposes /props at server root (not under /v1 prefix per httplib routes). The previous code tried /v1/props first, which returns 404 on standard llama.cpp builds, causing unnecessary failed requests before fallback to /props.

What broke:
- Server-type detection tried /v1/props first, returning 404
- Context probe also tried /v1/props first when base URL had /v1 prefix
- Users with llama.cpp configured saw 404 errors in server logs

Root cause:
- Comment incorrectly stated "llama.cpp exposes /v1/props (older builds used /props)", but actually /props is at server root per httplib routes
- Endpoint order was reversed: /v1/props tried before /props

Why this fix is minimal:
- Only changes endpoint order: try /props first, then /v1/props fallback
- Strips /v1 prefix from base_url to get server root for /props request
- Preserves fallback for alternative builds/configs using /v1 path

What I tested:
- Added regression tests for /props endpoint detection
- Tests verify correct URL construction with /v1 prefix stripping
- Tests cover fallback to /v1/props when /props returns 404

What I intentionally did not change:
- Response body parsing (default_generation_settings check unchanged)
- Other server type detection (LM Studio, Ollama, vLLM unchanged)
- Error handling and fallback logic structure

Fixes NousResearch#13091

alt-glitch mentioned this pull request Apr 21, 2026

fix(agent): prefer /props over /v1/props for llama.cpp server detection #13539

Open

19 tasks

alt-glitch added the type/bug (Something isn't working), P2 (Medium: degraded but workaround exists), comp/agent (Core agent loop, run_agent.py, prompt builder), and provider/ollama (Ollama / local models) labels Apr 22, 2026

alt-glitch added the duplicate (This issue or pull request already exists) label Apr 22, 2026

Linux2010 and others added 3 commits April 30, 2026 09:34

Linux2010 force-pushed the fix/llamacpp-props-endpoint-13091 branch from a962387 to d45d00d Compare April 30, 2026 09:37

Linux2010 closed this May 2, 2026

Linux2010 wants to merge 3 commits into NousResearch:main from Linux2010:fix/llamacpp-props-endpoint-13091
