
fix(venice): harden discovery limits and tool support #38306

Merged
vincentkoc merged 10 commits into main from vincentkoc-code/venice-provider-cluster-fix
Mar 7, 2026

@vincentkoc
Member

Summary

Describe the problem and fix in 2–5 bullets:

  • Problem: Venice discovery and fallback metadata had drifted from the live API, and OpenClaw still wired tools into Venice models that do not support function calling.
  • Why it matters: default Venice setups could fail with HTTP 400 errors, caused either by oversized max_completion_tokens defaults or by tool-bearing requests rejected as "tools is not supported".
  • What changed: discovery now applies bounded per-model maxCompletionTokens, the static Venice catalog is synced to current live model metadata, and embedded runs/compaction suppress tools for models with compat.supportsTools === false.
  • What did NOT change (scope boundary): this PR does not change non-Venice provider behavior or broaden tool availability for other providers.
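
As a rough illustration of the tool-suppression rule described above, the sketch below gates tools on compat.supportsTools === false. The type shapes and the toolsForModel helper are hypothetical, and the signature given to supportsModelTools is an assumption rather than the code merged in this PR.

```typescript
// Hypothetical shapes for illustration; the real OpenClaw types may differ.
interface ModelCompat {
  supportsTools?: boolean;
}

interface ModelDefinition {
  id: string;
  compat?: ModelCompat;
}

interface ToolSpec {
  name: string;
}

// Tools stay enabled unless the model metadata explicitly opts out.
// A missing flag is treated as "supported" so other providers are unaffected.
function supportsModelTools(model: ModelDefinition): boolean {
  return model.compat?.supportsTools !== false;
}

// Hypothetical helper: returns the tool list to send with a request.
function toolsForModel(model: ModelDefinition, tools: ToolSpec[]): ToolSpec[] {
  return supportsModelTools(model) ? tools : [];
}
```

Checking for an explicit false, rather than a truthy value, is what keeps models without the flag on their existing behavior.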

Change Type (select all)

  • Bug fix
  • Feature
  • Refactor
  • Docs
  • Security hardening
  • Chore/infra

Scope (select all touched areas)

  • Gateway / orchestration
  • Skills / tool execution
  • Auth / tokens
  • Memory / storage
  • Integrations
  • API / contracts
  • UI / DX
  • CI/CD / infra

Linked Issue/PR

User-visible / Behavior Changes

  • Venice onboarding/default model selection now uses safe per-model completion-token limits instead of a stale shared default.
  • Venice models that do not support function calling no longer receive tool wiring during embedded runs or compaction.
  • Offline/degraded Venice fallback uses a catalog synced to the current live Venice model list.

Security Impact (required)

  • New permissions/capabilities? (No)
  • Secrets/tokens handling changed? (No)
  • New/changed network calls? (No)
  • Command/tool execution surface changed? (Yes)
  • Data access scope changed? (No)
  • If any Yes, explain risk + mitigation:
    Tool execution is reduced, not expanded: Venice models that advertise no function-calling support now have tools suppressed. Discovery still reads the existing public Venice /models endpoint, but API-provided maxCompletionTokens values are normalized and clamped to safe bounds before use.
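
One way the "normalized and clamped" step could look is sketched below. The bounds and the function name are assumptions chosen for illustration; the PR description does not state the actual limits.

```typescript
// Assumed bounds for illustration only; the PR does not state the real values.
const MIN_COMPLETION_TOKENS = 1;
const MAX_COMPLETION_TOKENS = 32000;

// Hypothetical helper: turns an untrusted API value into a usable limit.
// Non-numeric, non-finite, or non-positive input falls back to the caller's default.
function clampMaxCompletionTokens(raw: unknown, fallback: number): number {
  if (typeof raw !== "number" || !Number.isFinite(raw) || raw <= 0) {
    return fallback;
  }
  const whole = Math.floor(raw);
  return Math.min(Math.max(whole, MIN_COMPLETION_TOKENS), MAX_COMPLETION_TOKENS);
}
```

Treating the input as unknown keeps the validation in one place, so a malformed discovery value cannot reach the request builder.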

Repro + Verification

Environment

  • OS: macOS
  • Runtime/container: Node 22 / pnpm
  • Model/provider: Venice
  • Integration/channel (if any): n/a
  • Relevant config (redacted): default Venice discovery + embedded runner paths

Steps

  1. Discover Venice models from the live/static catalog.
  2. Start an embedded run with a Venice model that has a lower completion limit or no function-calling support.
  3. Verify model defaults and tool wiring used by the run.

Expected

  • Venice models use bounded per-model completion-token limits.
  • Venice models without function-calling support do not receive tools.
  • Static fallback catalog contains the current Venice model set/metadata.

Actual

  • Targeted tests and full build pass with the updated Venice discovery, fallback, and tool-gating logic.

Evidence

Attach at least one:

  • Failing test/log before + passing after
  • Trace/log snippets
  • Screenshot/recording
  • Perf numbers (if relevant)

Human Verification (required)

What you personally verified (not just CI), and how:

  • Verified scenarios: catalog-known Venice models honoring bounded API maxCompletionTokens, catalog fallback retaining synced metadata, unknown models getting conservative bounded defaults, and tool suppression for non-function-calling Venice models.
  • Edge cases checked: missing maxCompletionTokens, oversized maxCompletionTokens, unsupported tools on known and unknown Venice models, post-rebase targeted tests.
  • What you did not verify: live end-to-end Venice API calls with authenticated completions from a real account.

Compatibility / Migration

  • Backward compatible? (Yes)
  • Config/env changes? (No)
  • Migration needed? (No)
  • If yes, exact upgrade steps:

Failure Recovery (if this breaks)

  • How to disable/revert this change quickly: revert this PR, or disable Venice model discovery and fall back to explicitly configured models.
  • Files/config to restore: src/agents/venice-models.ts, src/agents/pi-embedded-runner/run/attempt.ts, src/agents/pi-embedded-runner/compact.ts
  • Known bad symptoms reviewers should watch for: Venice models losing tool access unexpectedly, stale catalog entries reappearing, or Venice requests still failing with max_completion_tokens / unsupported-tools 400s.

Risks and Mitigations

List only real risks for this PR. Add/remove entries as needed. If none, write None.

  • Risk: the static Venice catalog could drift again as Venice changes its public model list.
    • Mitigation: live discovery still overrides known entries within safe bounds, and the fallback catalog is now synced to the current live API state.
  • Risk: some Venice models may actually support tools despite stale metadata.
    • Mitigation: tool suppression is only applied when the catalog or live API explicitly reports no function-calling support.

@vincentkoc vincentkoc self-assigned this Mar 6, 2026
@openclaw-barnacle openclaw-barnacle Bot added the agents Agent runtime and tooling label Mar 6, 2026
@openclaw-barnacle openclaw-barnacle Bot added size: M maintainer Maintainer-authored PR labels Mar 6, 2026
@vincentkoc vincentkoc marked this pull request as ready for review March 6, 2026 20:08
@greptile-apps
Contributor

greptile-apps Bot commented Mar 6, 2026

Greptile Summary

This PR hardens Venice provider integration in three complementary areas: (1) per-model completion-token limits are now applied from live discovery data and clamped to safe bounds, (2) the static fallback catalog is re-synced to the current Venice model list (including new models, reclassified privacy levels, corrected IDs like claude-opus-4-5/claude-sonnet-4-5, and a fixed name mismatch for minimax-m21), and (3) a new supportsModelTools gate prevents tool/function-calling requests from being sent to Venice models that advertise no function-calling support, in both the embedded runner and the compaction path.

Key observations:

  • The token-clamping strategy in resolveApiMaxCompletionTokens is intentionally one-directional: live API values can lower but never raise a catalog model's maxTokens. This prevents oversized values from slipping through but means catalog drift must be fixed manually; a comment documenting this would help future maintainers.
  • deepseek-v3.2 retains reasoning: true from the old catalog. DeepSeek V3.2 is a dense chat/instruct model (not a reasoning/thinking model like R1 variants), so this flag appears incorrect and could cause the runner to apply reasoning-specific request formatting to it.
  • Minor: the CHANGELOG attribution truncates @vincentkoc as @vincentko.
  • All other logic — the supportsTools propagation through buildVeniceModelDefinition, discovery, and both execution paths — is correct and well-tested.
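
The one-directional clamping noted in the first observation can be sketched as follows. This is a minimal illustration under assumed names, not the body of resolveApiMaxCompletionTokens itself.

```typescript
// Hypothetical helper illustrating a one-way merge for catalog-known models:
// a live API value may lower the catalog limit but never raise it.
function resolveMaxTokensOneWay(
  catalogMaxTokens: number,
  apiMaxCompletionTokens?: number,
): number {
  if (
    apiMaxCompletionTokens === undefined ||
    !Number.isFinite(apiMaxCompletionTokens) ||
    apiMaxCompletionTokens <= 0
  ) {
    // No usable live value: keep the catalog limit.
    return catalogMaxTokens;
  }
  return Math.min(catalogMaxTokens, Math.floor(apiMaxCompletionTokens));
}
```

The trade-off is the one the review points out: an oversized live value cannot slip through, but a catalog limit that has become too low has to be raised by hand.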

Confidence Score: 4/5

  • Safe to merge; changes reduce failure surface by suppressing unsupported tool calls and clamping token limits, with no broadening of capabilities.
  • The core logic is correct and well-tested. The one point deducted is for deepseek-v3.2 carrying reasoning: true — a pre-existing but unfixed misclassification that this catalog-sync PR was well-positioned to correct, and which could cause reasoning-specific request formatting to be applied to a non-reasoning model.
  • src/agents/venice-models.ts — the deepseek-v3.2 reasoning: true flag and the undocumented one-way tool suppression gate deserve a second look.

Comments Outside Diff (1)

  1. src/agents/venice-models.ts, line 163-172 (link)

    deepseek-v3.2 is marked as a reasoning model

    deepseek-v3.2 has reasoning: true in the catalog, but DeepSeek V3.2 is a dense chat/instruct model, not a reasoning/thinking model (that role belongs to DeepSeek R1 variants). This was present in the previous catalog and is left unchanged here, but the updated catalog sync pass is a natural opportunity to correct it. Incorrect reasoning flags affect how the runner formats requests (e.g. thinking budgets, effort parameters).

Last reviewed commit: c9d1e63


@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: c9d1e63cee

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/agents/venice-models.ts Outdated
Comment thread CHANGELOG.md Outdated
Comment thread src/agents/venice-models.ts
@vincentkoc
Member Author

Addressed the current review items in follow-up commit 94d7813:

  • capped unknown-model maxTokens against the same 128000 fallback context window used for degraded Venice metadata, with a regression test for missing availableContextTokens
  • added an inline comment documenting the intentional one-way supportsTools gate for catalog-known models
  • rechecked the changelog attribution report; the current branch already has the correct handles, so there was no code change needed there
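
The unknown-model cap from the first item might be sketched like this. The 128000 fallback context window comes from the comment above; the default of 8192 and all names are assumptions made for the example.

```typescript
// 128000 is the fallback context window mentioned in the comment above.
const FALLBACK_CONTEXT_WINDOW = 128000;
// Assumed conservative default for unknown models; not taken from the PR.
const DEFAULT_UNKNOWN_MAX_TOKENS = 8192;

function isPositiveFinite(value: unknown): value is number {
  return typeof value === "number" && Number.isFinite(value) && value > 0;
}

// Hypothetical helper: an unknown model's maxTokens never exceeds its context
// window, and a missing availableContextTokens falls back to 128000.
function resolveUnknownModelMaxTokens(
  apiMaxCompletionTokens?: number,
  availableContextTokens?: number,
): number {
  const contextWindow = isPositiveFinite(availableContextTokens)
    ? availableContextTokens
    : FALLBACK_CONTEXT_WINDOW;
  const requested = isPositiveFinite(apiMaxCompletionTokens)
    ? Math.floor(apiMaxCompletionTokens)
    : DEFAULT_UNKNOWN_MAX_TOKENS;
  return Math.min(requested, contextWindow);
}
```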

Re-ran:

  • pnpm exec vitest run src/agents/venice-models.test.ts src/agents/model-tool-support.test.ts src/config/config-misc.test.ts
  • pnpm build

Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 6d85562052

Comment thread src/agents/venice-models.ts Outdated
@vincentkoc
Member Author

Addressed the partial-metadata review point in follow-up commit c6b1832.

Changes:

  • made Venice model_spec.availableContextTokens and model_spec.capabilities reads tolerant of partial /models records
  • guarded tool/reasoning/vision capability reads so one malformed entry no longer aborts discovery for the whole payload
  • added a regression test covering mixed valid + partial discovery payloads
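
A guarded capability read of the kind described above could look roughly like this. The capability field names (supportsFunctionCalling, supportsReasoning, supportsVision) and the chosen defaults are assumptions about the Venice /models payload, not confirmed by this thread.

```typescript
interface ResolvedCapabilities {
  tools: boolean;
  reasoning: boolean;
  vision: boolean;
}

// Hypothetical helper: reads capability flags from an untrusted record.
// Field names are assumed; a missing or malformed record yields defaults
// instead of throwing.
function readCapabilities(raw: unknown): ResolvedCapabilities {
  const caps: Record<string, unknown> =
    typeof raw === "object" && raw !== null
      ? (raw as Record<string, unknown>)
      : {};
  return {
    // Tools are suppressed only on an explicit "false".
    tools: caps["supportsFunctionCalling"] !== false,
    // Reasoning and vision are enabled only on an explicit "true".
    reasoning: caps["supportsReasoning"] === true,
    vision: caps["supportsVision"] === true,
  };
}
```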

Re-ran:

  • pnpm exec vitest run src/agents/venice-models.test.ts src/agents/model-tool-support.test.ts src/config/config-misc.test.ts
  • pnpm build


@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: c6b1832406

Comment thread src/agents/venice-models.ts Outdated
@vincentkoc
Member Author

Addressed the missing-model_spec regression in follow-up commit 96ea772.

Changes:

  • made Venice model_spec optional in discovery parsing
  • guarded maxCompletionTokens, context-window, name, and tool-capability reads behind optional access
  • added a regression test proving a malformed known-model row without model_spec no longer aborts discovery for the rest of the payload
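
To illustrate why an optional model_spec keeps one bad row from aborting discovery, here is a minimal per-row parsing sketch. The row shape, the output type, and the function name are hypothetical; only the model_spec and availableContextTokens field names and the 128000 fallback appear in this thread.

```typescript
// Hypothetical row shape; model_spec is optional, as in the follow-up commit.
interface VeniceModelRow {
  id: string;
  model_spec?: {
    name?: string;
    availableContextTokens?: number;
  };
}

interface DiscoveredModel {
  id: string;
  name: string;
  contextWindow: number;
}

// Fallback context window for degraded metadata, as mentioned earlier in the thread.
const DEGRADED_CONTEXT_WINDOW = 128000;

// Parses rows one at a time: rows without a usable id are skipped, and a row
// without model_spec still produces an entry built from fallback values.
function parseDiscoveryRows(rows: unknown[]): DiscoveredModel[] {
  const models: DiscoveredModel[] = [];
  for (const row of rows) {
    if (typeof row !== "object" || row === null) continue;
    const { id, model_spec: spec } = row as Partial<VeniceModelRow>;
    if (typeof id !== "string" || id.length === 0) continue;
    models.push({
      id,
      name: spec?.name ?? id,
      contextWindow: spec?.availableContextTokens ?? DEGRADED_CONTEXT_WINDOW,
    });
  }
  return models;
}
```

Because every read goes through optional access, a missing field degrades a single entry rather than throwing for the whole payload.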

Re-ran:

  • pnpm exec vitest run src/agents/venice-models.test.ts src/agents/model-tool-support.test.ts src/config/config-misc.test.ts
  • pnpm build

@vincentkoc vincentkoc merged commit 5320ee7 into main Mar 7, 2026
27 of 28 checks passed
@vincentkoc vincentkoc deleted the vincentkoc-code/venice-provider-cluster-fix branch March 7, 2026 00:07
vincentkoc added a commit to BryanTegomoh/openclaw-upstream that referenced this pull request Mar 8, 2026
* Config: add supportsTools compat flag

* Agents: add model tool support helper

* Venice: sync discovery and fallback metadata

* Agents: skip tools for unsupported models

* Changelog: note Venice provider hardening

* Update CHANGELOG.md

* Venice: cap degraded discovery metadata

* Apply suggestion from @greptile-apps[bot]

Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>

* Venice: tolerate partial discovery capabilities

* Venice: tolerate missing discovery specs

---------

Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
jenawant pushed a commit to jenawant/openclaw that referenced this pull request Mar 10, 2026
V-Gutierrez pushed a commit to V-Gutierrez/openclaw-vendor that referenced this pull request Mar 17, 2026
alexey-pelykh pushed a commit to remoteclaw/remoteclaw that referenced this pull request Mar 20, 2026
(cherry picked from commit 5320ee7)

Partial: only types.models.ts (supportsTools field) — gutted files discarded
lovewanwan pushed a commit to lovewanwan/openclaw that referenced this pull request Apr 28, 2026
ogt-redknie pushed a commit to ogt-redknie/OPENX that referenced this pull request May 2, 2026

Labels

agents Agent runtime and tooling maintainer Maintainer-authored PR size: L

Development

Successfully merging this pull request may close these issues.

[Bug]: Erroenous HTTP 400 errors when using Venice.AI due to too high hard-coded max_completion_tokens

1 participant