Skip to content

docs(inference): document local tool-call reliability (Fixes #2733)#2823

Merged
ericksoa merged 4 commits into
NVIDIA:mainfrom
deepujain:docs/2733-tool-calling-reliability
May 13, 2026
Merged

docs(inference): document local tool-call reliability (Fixes #2733)#2823
ericksoa merged 4 commits into
NVIDIA:mainfrom
deepujain:docs/2733-tool-calling-reliability

Conversation

@deepujain

@deepujain deepujain commented May 1, 2026

Copy link
Copy Markdown
Contributor

Summary

Adds a dedicated local-inference troubleshooting page for the Ollama tool-call leak symptom where raw JSON appears in the TUI instead of a tool dispatch.

Changes

  • Document when Ollama is reasonable and when vLLM with --enable-auto-tool-choice plus --tool-call-parser is the safer local backend.
  • Add known-good vLLM command and Compose examples for Hermes-style tool parsing.
  • Explain the persistent NemoClaw path through nemoclaw onboard and the advanced temporary openclaw config set --batch-file path.
  • Link the new page from local inference docs, inference options, docs navigation, and the inference configuration skill references.

Testing

  • bash test/e2e/e2e-cloud-experimental/check-docs.sh --only-links --local-only docs/inference/tool-calling-reliability.md docs/inference/use-local-inference.md docs/inference/inference-options.md docs/index.md .agents/skills/nemoclaw-user-configure-inference/references/tool-calling-reliability.md .agents/skills/nemoclaw-user-configure-inference/references/use-local-inference.md .agents/skills/nemoclaw-user-configure-inference/references/inference-options.md
  • npm install --ignore-scripts
  • npm test -- test/check-docs-links.test.ts
  • npm run build:cli

Fixes #2733

Signed-off-by: Deepak Jain deepujain@gmail.com

Summary by CodeRabbit

  • Documentation
    • Added new troubleshooting guide for tool-calling reliability issues with local inference
    • Enhanced Ollama setup guidance with caution about JSON output handling and vLLM alternatives
    • Clarified when experimental vLLM options appear during configuration
    • Improved verification procedures for inference provider selection
    • Added decision matrix to help users choose between Ollama and vLLM based on workload requirements

Fixes NVIDIA#2733

Signed-off-by: Deepak Jain <deepujain@gmail.com>
@copy-pr-bot

copy-pr-bot Bot commented May 1, 2026

Copy link
Copy Markdown

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@coderabbitai

coderabbitai Bot commented May 1, 2026

Copy link
Copy Markdown
Contributor

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 49e77634-ded6-46fa-8443-d18030e3f956

📥 Commits

Reviewing files that changed from the base of the PR and between 12bac38 and 2e6e58c.

📒 Files selected for processing (6)
  • .agents/skills/nemoclaw-user-configure-inference/SKILL.md
  • .agents/skills/nemoclaw-user-configure-inference/references/inference-options.md
  • .agents/skills/nemoclaw-user-configure-inference/references/switch-inference-providers.md
  • docs/index.md
  • docs/inference/inference-options.md
  • docs/inference/use-local-inference.md
✅ Files skipped from review due to trivial changes (3)
  • docs/inference/use-local-inference.md
  • docs/index.md
  • .agents/skills/nemoclaw-user-configure-inference/SKILL.md
🚧 Files skipped from review as they are similar to previous changes (1)
  • docs/inference/inference-options.md

📝 Walkthrough

Walkthrough

This PR documents a tool-calling failure mode in Ollama where tool calls render as raw JSON text instead of structured tool_calls, and provides comprehensive troubleshooting guidance directing users toward vLLM with parser flags for reliable multi-tool agent workloads.

Changes

Tool-Calling Reliability Documentation & Inference Configuration Updates

Layer / File(s) Summary
New troubleshooting guides for tool-calling reliability
.agents/skills/nemoclaw-user-configure-inference/references/tool-calling-reliability.md, docs/inference/tool-calling-reliability.md
Introduces a comprehensive two-part guide (skill-resident and public docs) explaining the symptom (JSON leakage as plain text), providing a decision matrix (Ollama vs vLLM by workload), and offering concrete remediation steps: vLLM startup commands with --enable-auto-tool-choice and --tool-call-parser hermes, Docker Compose setup, onboarding via environment variables, temporary repointing via openclaw config set --batch-file, and verification checklist.
Inference configuration guidance updates
.agents/skills/nemoclaw-user-configure-inference/SKILL.md, .agents/skills/nemoclaw-user-configure-inference/references/inference-options.md
Expands skill description keywords to include tool-calling terms, adds caution about Ollama JSON leakage with remediation path to vLLM, clarifies vLLM onboarding flows (experimental local vs managed install/start), documents qwen3.6:35b on 32+ GiB GPU hosts, updates authentication requirements and health-check behavior for Ollama reverse proxy, and removes NEMOCLAW_EXPERIMENTAL=1 from vLLM non-interactive example.
Verification & switching guidance refinements
.agents/skills/nemoclaw-user-configure-inference/references/switch-inference-providers.md
Tightens prerequisites to require OpenShell CLI on PATH, replaces verification flow from nemoclaw <name> status alone to nemoclaw inference get as primary confirmation of active gateway route, and repositions nemoclaw <name> status for broader health context.
Cross-document navigation integration
docs/index.md, docs/inference/inference-options.md, docs/inference/use-local-inference.md
Adds "Tool-Calling Reliability" entry to Inference toctree and Next Steps sections across multiple inference docs, and adds caution block in use-local-inference.md describing JSON leakage symptom with vLLM remediation link.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~15 minutes

Suggested reviewers

  • cv
  • ericksoa

Poem

🐰 A tool call caught mid-leap—
JSON dreams in text so deep,
But vLLM's parser bright
Sets the workflow right,
Now the agent dances while we sleep!

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly and concisely describes the main change: adding documentation for local tool-call reliability. It is specific, action-oriented, and directly reflects the primary deliverable of the PR.
Linked Issues check ✅ Passed The PR fully meets the primary coding requirements from issue #2733: comprehensive documentation explaining when Ollama is sufficient versus when vLLM with tool-call parser is required, including known-good vLLM commands with flags, Docker Compose examples, NemoClaw configuration paths, and cross-linked references throughout the documentation.
Out of Scope Changes check ✅ Passed All changes are directly related to documenting local tool-call reliability and supporting infrastructure. Updates to SKILL.md, inference-options.md, use-local-inference.md, and switch-inference-providers.md are focused revisions supporting the primary troubleshooting guide with no extraneous modifications.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
docs/inference/tool-calling-reliability.md (1)

180-185: ⚡ Quick win

Use "Next Steps" instead of "Related Pages".

Per coding guidelines for page structure, the final section should be titled "Next Steps" to maintain consistency across documentation pages.

📝 Suggested heading change
-## Related Pages
+## Next Steps

As per coding guidelines: "A 'Next Steps' section at the bottom links to related pages."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/inference/tool-calling-reliability.md` around lines 180 - 185, Replace
the final section heading "Related Pages" with "Next Steps" and keep the three
existing links (- [Use a Local Inference Server], - [Inference Options], -
[Switch Inference Models]) unchanged; locate the heading text "Related Pages" in
the document and update only the heading label so the section conforms to the
documentation guideline requiring a "Next Steps" section.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@docs/inference/tool-calling-reliability.md`:
- Around line 180-185: Replace the final section heading "Related Pages" with
"Next Steps" and keep the three existing links (- [Use a Local Inference
Server], - [Inference Options], - [Switch Inference Models]) unchanged; locate
the heading text "Related Pages" in the document and update only the heading
label so the section conforms to the documentation guideline requiring a "Next
Steps" section.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 87635c10-7a2f-4323-a7a3-a41be2b2ab9a

📥 Commits

Reviewing files that changed from the base of the PR and between f9d21af and 170de76.

📒 Files selected for processing (8)
  • .agents/skills/nemoclaw-user-configure-inference/SKILL.md
  • .agents/skills/nemoclaw-user-configure-inference/references/inference-options.md
  • .agents/skills/nemoclaw-user-configure-inference/references/tool-calling-reliability.md
  • .agents/skills/nemoclaw-user-configure-inference/references/use-local-inference.md
  • docs/index.md
  • docs/inference/inference-options.md
  • docs/inference/tool-calling-reliability.md
  • docs/inference/use-local-inference.md

Signed-off-by: Deepak Jain <deepujain@gmail.com>
@deepujain

Copy link
Copy Markdown
Contributor Author

Updated the final docs heading to Next Steps and reran the docs link check plus CLI build.

Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
@cv cv requested a review from miyoungc May 11, 2026 22:25
@cv cv added v0.0.40 and removed v0.0.39 labels May 12, 2026
…g-reliability

Signed-off-by: Aaron Erickson <aerickson@nvidia.com>

# Conflicts:
#	.agents/skills/nemoclaw-user-configure-inference/SKILL.md
#	.agents/skills/nemoclaw-user-configure-inference/references/use-local-inference.md
@ericksoa ericksoa force-pushed the docs/2733-tool-calling-reliability branch from 6c6b194 to 2e6e58c Compare May 13, 2026 02:47

@ericksoa ericksoa left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed the resolved docs-only change after the current-main merge. The conflict resolution preserves the newer generated skill layout from main, keeps the tool-calling reliability guide wired into docs and generated skills, and passed the docs validation set locally.

@ericksoa ericksoa enabled auto-merge (squash) May 13, 2026 02:51
@ericksoa ericksoa merged commit 098a5a9 into NVIDIA:main May 13, 2026
19 of 21 checks passed
@wscurran wscurran added area: inference Inference routing, serving, model selection, or outputs area: local-models Local model providers, downloads, launch, or connectivity area: providers Inference provider integrations and provider behavior feature PR adds or expands user-visible functionality and removed Local Models labels Jun 3, 2026
@wscurran wscurran added area: docs Documentation, examples, guides, or docs build and removed enhancement: inference labels Jun 3, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area: docs Documentation, examples, guides, or docs build area: inference Inference routing, serving, model selection, or outputs area: local-models Local model providers, downloads, launch, or connectivity area: providers Inference provider integrations and provider behavior feature PR adds or expands user-visible functionality

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Tool-call reliability: detect / document Ollama tool-call-leak failure mode and recommend vLLM with parser flag for tool-calling agents

5 participants