
fix(model_metadata): use /props endpoint at server root for llama.cpp #13317

Closed
Linux2010 wants to merge 3 commits into NousResearch:main from Linux2010:fix/llamacpp-props-endpoint-13091

@Linux2010

Contributor

What broke

When llama.cpp is the provider, Hermes requested the /v1/props endpoint, which returned 404 because llama.cpp exposes /props at the server root, not under the /v1 prefix (per its httplib routes). Users saw the resulting 404 errors in their llama-server logs.

Root cause

The code comment incorrectly stated "llama.cpp exposes /v1/props (older builds used /props)". In fact, llama.cpp's httplib routes (server.cpp line 174: ctx_http.get("/props", ...)) register /props at the server root. The probe order was therefore backwards: /v1/props was tried before /props.

Why this fix is minimal

  • Only changes endpoint order: try /props first (server root), then /v1/props as fallback
  • Strips /v1 prefix from base_url to get server root for /props request
  • Preserves fallback for alternative builds/configs that might use /v1 path
  • No changes to response body parsing or other server type detection
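
The ordering described above can be sketched roughly as follows. This is an illustration only, not the code in this PR: the names `server_root`, `props_candidates`, and `looks_like_llamacpp` are hypothetical, and the HTTP call is passed in as a `fetch` callable so the ordering logic can be read (and exercised) without a running server. The only detail taken from the PR is the `default_generation_settings` check on the response body.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical signature: fetch(url) -> (HTTP status, parsed JSON body or None).
Fetch = Callable[[str], Tuple[int, Optional[dict]]]


def server_root(base_url: str) -> str:
    """Strip trailing slashes and a trailing /v1 segment to get the server root."""
    url = base_url.rstrip("/")
    if url.endswith("/v1"):
        url = url[: -len("/v1")]
    return url


def props_candidates(base_url: str) -> List[str]:
    """Return probe URLs in order: /props at the server root, then /v1/props."""
    root = server_root(base_url)
    return [f"{root}/props", f"{root}/v1/props"]


def looks_like_llamacpp(base_url: str, fetch: Fetch) -> bool:
    """Try each candidate in order and stop at the first llama.cpp-style response."""
    for url in props_candidates(base_url):
        status, body = fetch(url)
        if (
            status == 200
            and isinstance(body, dict)
            and "default_generation_settings" in body
        ):
            return True
    return False
```

With a base URL of `http://localhost:8080/v1`, the first request in this sketch goes to `http://localhost:8080/props`, so a standard llama.cpp build answers on the first try and never logs a 404 for `/v1/props`.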

What I tested

  • Added regression tests for /props endpoint detection
  • Tests verify correct URL construction with /v1 prefix stripping
  • Tests cover fallback to /v1/props when /props returns 404
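
For readers who want to reproduce the two cases locally, a test along these lines is one way to do it. It is a sketch under stated assumptions and not the regression tests added in this PR: `probe_props` is a hypothetical stand-in for the real detection code, and a throwaway stub server on 127.0.0.1 plays the role of llama-server, serving the JSON body on exactly one path and returning 404 everywhere else.

```python
import json
import threading
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


def make_handler(served_path, seen):
    """Build a handler that serves props JSON on one path and 404s on all others."""

    class Handler(BaseHTTPRequestHandler):
        def do_GET(self):
            seen.append(self.path)  # record request order for the assertions
            if self.path == served_path:
                body = json.dumps({"default_generation_settings": {}}).encode()
                self.send_response(200)
                self.send_header("Content-Type", "application/json")
                self.send_header("Content-Length", str(len(body)))
                self.end_headers()
                self.wfile.write(body)
            else:
                self.send_error(404)

        def log_message(self, *args):
            pass  # keep test output quiet

    return Handler


def probe_props(base_url):
    """Hypothetical probe: /props at the server root first, then /v1/props."""
    root = base_url.rstrip("/")
    if root.endswith("/v1"):
        root = root[: -len("/v1")]
    # Bypass any proxy configured in the environment for the loopback request.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    for path in ("/props", "/v1/props"):
        try:
            with opener.open(root + path, timeout=2) as resp:
                if "default_generation_settings" in json.load(resp):
                    return path
        except urllib.error.HTTPError:
            continue  # e.g. 404: fall through to the next candidate
    return None


def run_case(served_path):
    """Start a stub server, probe it via a /v1 base URL, return (hit, request log)."""
    seen = []
    server = HTTPServer(("127.0.0.1", 0), make_handler(served_path, seen))
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        port = server.server_address[1]
        return probe_props(f"http://127.0.0.1:{port}/v1"), seen
    finally:
        server.shutdown()
        server.server_close()
```

Calling `run_case("/props")` models a standard llama.cpp build, and `run_case("/v1/props")` models a build or proxy that only answers under the /v1 prefix; the returned request log shows which paths were tried and in what order.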

What I intentionally did not change

  • Response body parsing (default_generation_settings check unchanged)
  • Other server type detection (LM Studio, Ollama, vLLM unchanged)
  • Error handling and fallback logic structure

Fixes #13091

@alt-glitch added labels Apr 22, 2026: type/bug (Something isn't working), P2 (Medium: degraded but workaround exists), comp/agent (Core agent loop, run_agent.py, prompt builder), provider/ollama (Ollama / local models)
@alt-glitch

Collaborator

Likely a duplicate of #13539, which applies the same fix (swapping the order of /props and /v1/props in llama.cpp detection). Both close #13091.

@alt-glitch added the duplicate label (This issue or pull request already exists) Apr 22, 2026
Linux2010 and others added 3 commits April 30, 2026 09:34
…ports

The _handle_editor_command() method used subprocess.call() but did not
import subprocess, causing AttributeError at runtime.

subprocess is not globally imported in gateway/run.py. Other uses of
subprocess in this file use local imports (line 1289), so the editor
command should follow the same pattern.

- Added: import subprocess (local import, matching file pattern)
- Removed: redundant local imports for asyncio, shlex, tempfile
  (already imported globally at lines 16, 21, 24)

Python syntax validation (py_compile)
…_PREFIX (NousResearch#14603)

The SUMMARY_PREFIX previously instructed the model to 'resume exactly from
there' when it reached the ## Active Task section of a compressed summary.
This caused cross-session task injection, where the AI would interpret a
previous session's active task as its own current job.

Fix: Replace the 'resume' instruction with explicit language that the
## Active Task section is 'historical context only, NOT as an active
instruction to execute'.

- agent/context_compressor.py: +3 lines, -2 lines
- tests/agent/test_context_compressor.py: +29 lines (3 regression tests)
llama.cpp exposes /props at server root (not under /v1 prefix per httplib
routes). The previous code tried /v1/props first, which returns 404 on
standard llama.cpp builds, causing unnecessary failed requests before
fallback to /props.

What broke:
- Server-type detection tried /v1/props first, returning 404
- Context probe also tried /v1/props first when base URL had /v1 prefix
- Users with llama.cpp configured saw 404 errors in server logs

Root cause:
- Comment incorrectly stated "llama.cpp exposes /v1/props (older builds
  used /props)", but actually /props is at server root per httplib routes
- Endpoint order was reversed: /v1/props tried before /props

Why this fix is minimal:
- Only changes endpoint order: try /props first, then /v1/props fallback
- Strips /v1 prefix from base_url to get server root for /props request
- Preserves fallback for alternative builds/configs using /v1 path

What I tested:
- Added regression tests for /props endpoint detection
- Tests verify correct URL construction with /v1 prefix stripping
- Tests cover fallback to /v1/props when /props returns 404

What I intentionally did not change:
- Response body parsing (default_generation_settings check unchanged)
- Other server type detection (LM Studio, Ollama, vLLM unchanged)
- Error handling and fallback logic structure

Fixes NousResearch#13091
@Linux2010 force-pushed the fix/llamacpp-props-endpoint-13091 branch from a962387 to d45d00d April 30, 2026 09:37
@Linux2010 closed this May 2, 2026

Development

Successfully merging this pull request may close these issues.

[Bug]: /v1/props should be GET /props
