
feat(nvidia): fetch featured model catalog #79482

Closed
LambertArm wants to merge 2 commits into openclaw:main from LambertArm:fix/nvidia-featured-models

Conversation

LambertArm commented May 8, 2026

Summary

  • Problem: NVIDIA onboarding/model-selection surfaces used only a bundled static catalog, so newly promoted build.nvidia.com models needed an OpenClaw release before appearing.
  • Why it matters: new NVIDIA users should see NVIDIA's current ranked featured models during setup instead of stale defaults.
  • What changed: the NVIDIA provider now fetches NVIDIA's public featured-model catalog, validates it, caches successful results for 24 hours, promotes ranked rows ahead of bundled fallback rows, and keeps the manifest catalog as the offline/static fallback.
  • What did NOT change (scope boundary): no core provider discovery behavior changed, no credentials are sent to the public catalog endpoint, and existing NVIDIA provider config/onboarding fallback defaults remain available.
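
The promotion step described above can be sketched as a small merge function. This is a minimal illustration, not the code in this PR: the `CatalogRow` shape and the name `promoteFeaturedRows` are hypothetical, and the sketch assumes rows are deduplicated by model id with the featured copy winning.

```typescript
// Hypothetical row shape for illustration; the real catalog rows carry more fields.
interface CatalogRow {
  id: string;
  name: string;
}

// Put featured rows first (in feed order), then append bundled rows
// whose id has not already been emitted.
function promoteFeaturedRows(
  featured: CatalogRow[],
  bundled: CatalogRow[],
): CatalogRow[] {
  const seen = new Set<string>();
  const merged: CatalogRow[] = [];
  for (const row of [...featured, ...bundled]) {
    if (seen.has(row.id)) continue; // suppress duplicate fallback rows
    seen.add(row.id);
    merged.push(row);
  }
  return merged;
}
```

With an empty `featured` array the function returns the bundled rows unchanged, which matches the offline/static fallback described in the summary.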

Change Type (select all)

  • Bug fix
  • Feature
  • Refactor required for the fix
  • Docs
  • Security hardening
  • Chore/infra

Scope (select all touched areas)

  • Gateway / orchestration
  • Skills / tool execution
  • Auth / tokens
  • Memory / storage
  • Integrations
  • API / contracts
  • UI / DX
  • CI/CD / infra

Linked Issue/PR

Real behavior proof (required for external PRs)

  • Behavior or issue addressed: OpenClaw can consume NVIDIA's public featured-model feed and promote ranked catalog rows while retaining a bundled fallback.
  • Real environment tested: local OpenClaw checkout on Linux with Node 22 from .local/node22/bin.
  • Exact steps or command run after this patch: PATH=/home/ubuntu/.local/node22/bin:$PATH node -e 'fetch("https://assets.ngc.nvidia.com/products/api-catalog/featured-models.json").then(async r=>{const j=await r.json(); console.log(JSON.stringify({status:r.status, count:j["featured-models"]?.length, first:j["featured-models"]?.[0]}, null, 2));})'
  • Evidence after fix: terminal output from the live NVIDIA public catalog fetch:
$ PATH=/home/ubuntu/.local/node22/bin:$PATH node -e 'fetch("https://assets.ngc.nvidia.com/products/api-catalog/featured-models.json").then(async r=>{const j=await r.json(); console.log(JSON.stringify({status:r.status, count:j["featured-models"]?.length, first:j["featured-models"]?.[0]}, null, 2));})'
{
  "status": 200,
  "count": 4,
  "first": {
    "model": "nemotron-3-super-120b-a12b",
    "model-name": "Nemotron 3 Super 120B",
    "context": 262144,
    "max-output": 8192
  }
}
  • Observed result after fix: the live endpoint schema matches the provider parser contract, and the patched provider catalog promotes ranked rows ahead of bundled fallback rows.
  • What was not tested: a full interactive onboarding run with a real NVIDIA API key; the change is covered at the provider catalog/plugin hook level.
  • Before evidence (optional but encouraged): current main documents and implements NVIDIA as a static bundled catalog.
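
For reference, the feed shape observed in the terminal output above could be checked with a parser along these lines. This is a sketch under assumptions: only the four fields visible in the first row are validated, and the names `FeaturedModel` and `parseFeaturedFeed` are hypothetical rather than taken from extensions/nvidia/provider-catalog.ts.

```typescript
// Fields mirror the first row printed above; camelCase names are this sketch's choice.
interface FeaturedModel {
  model: string;
  modelName: string;
  context: number;
  maxOutput: number;
}

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

function isPositiveInt(value: unknown): value is number {
  return typeof value === "number" && Number.isInteger(value) && value > 0;
}

// Throws on any unexpected shape so the caller can fall back to bundled rows.
function parseFeaturedFeed(body: unknown): FeaturedModel[] {
  if (!isRecord(body)) throw new Error("feed is not a JSON object");
  const rows = body["featured-models"];
  if (!Array.isArray(rows)) throw new Error('"featured-models" is not an array');
  return rows.map((row: unknown, index: number): FeaturedModel => {
    if (!isRecord(row)) throw new Error(`row ${index} is not an object`);
    const model = row["model"];
    const modelName = row["model-name"];
    const context = row["context"];
    const maxOutput = row["max-output"];
    if (
      typeof model !== "string" ||
      typeof modelName !== "string" ||
      !isPositiveInt(context) ||
      !isPositiveInt(maxOutput)
    ) {
      throw new Error(`row ${index} has a missing or mistyped field`);
    }
    return { model, modelName, context, maxOutput };
  });
}
```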

Root Cause (if applicable)

  • Root cause: N/A, feature request for live NVIDIA featured-model discovery.
  • Missing detection / guardrail: N/A.
  • Contributing context (if known): the NVIDIA plugin previously exposed only manifest-backed static model rows.

Regression Test Plan (if applicable)

N/A

  • Coverage level that should have caught this:
    • Unit test
    • Seam / integration test
    • End-to-end test
    • Existing coverage already sufficient
  • Target test or file: extensions/nvidia/provider-catalog.test.ts, extensions/nvidia/index.test.ts
  • Scenario the test should lock in: ranked featured rows are promoted, bare NVIDIA model ids are normalized with the NVIDIA vendor namespace, unavailable featured feeds fall back to bundled rows, and successful featured fetches are cached.
  • Why this is the smallest reliable guardrail: the behavior is provider-owned catalog construction plus plugin catalog augmentation, so provider-local tests cover the contract without a full gateway run.
  • Existing test that already covers this (if any): N/A.
  • If no new test is added, why not: N/A.
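
The id normalization that the tests lock in can be illustrated with a one-line helper. The function name `normalizeNvidiaModelId` is hypothetical, and leaving ids that already contain a `/` untouched is an assumption of this sketch, not something the PR description states.

```typescript
// Prefix bare ids with the NVIDIA vendor namespace.
// Assumption: an id that already contains "/" is treated as namespaced and returned as-is.
function normalizeNvidiaModelId(id: string): string {
  const trimmed = id.trim();
  return trimmed.includes("/") ? trimmed : `nvidia/${trimmed}`;
}
```

For example, `normalizeNvidiaModelId("nemotron-3-super-120b-a12b")` yields `nvidia/nemotron-3-super-120b-a12b`.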

User-visible / Behavior Changes

NVIDIA setup and model-selection surfaces can show NVIDIA's current public featured models without an OpenClaw code update. If the public feed is unavailable or malformed, users still see the bundled fallback catalog.

Diagram (if applicable)

Before:
NVIDIA provider catalog -> bundled manifest rows only

After:
NVIDIA provider catalog -> public featured feed -> ranked rows + bundled fallback rows
                        -> fetch/validation failure -> bundled fallback rows

Security Impact (required)

  • New permissions/capabilities? (Yes/No) No
  • Secrets/tokens handling changed? (Yes/No) No
  • New/changed network calls? (Yes/No) Yes
  • Command/tool execution surface changed? (Yes/No) No
  • Data access scope changed? (Yes/No) No
  • If any Yes, explain risk + mitigation: the NVIDIA provider performs a credential-free GET to NVIDIA's public featured-model JSON endpoint. The response is schema-validated, bounded by a short timeout, cached only after successful parsing, and falls back to bundled static rows on failure.
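
A rough sketch of the mitigation described above: a credential-free GET bounded by a timeout, with every failure path returning the bundled rows. The names `loadCatalogRows` and `FetchLike` and the 3000 ms default are hypothetical, and the fetch function is injected so the sketch can be exercised without network access. Merging live rows with bundled rows is left out of this sketch.

```typescript
interface CatalogRow {
  id: string;
  name: string;
}

// Minimal structural type for the injected fetch; Node's global fetch satisfies it.
type FetchLike = (
  url: string,
  init: { signal: AbortSignal },
) => Promise<{ ok: boolean; json(): Promise<unknown> }>;

async function loadCatalogRows(
  url: string,
  fetchImpl: FetchLike,
  parse: (body: unknown) => CatalogRow[], // expected to throw on an unexpected shape
  bundled: CatalogRow[],
  timeoutMs = 3000, // hypothetical value; the PR only says "short timeout"
): Promise<CatalogRow[]> {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    // No Authorization header is set: the endpoint is public.
    const response = await fetchImpl(url, { signal: controller.signal });
    if (!response.ok) return bundled; // e.g. HTTP 503
    return parse(await response.json());
  } catch {
    // Network error, timeout abort, invalid JSON, or a parse failure.
    return bundled;
  } finally {
    clearTimeout(timer);
  }
}
```

In production use, `fetchImpl` would be the runtime's `fetch`; the stubbed form keeps the fallback behavior testable at the provider level.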

Repro + Verification

Environment

  • OS: Linux
  • Runtime/container: Node 22 via .local/node22/bin, local checkout
  • Model/provider: NVIDIA provider
  • Integration/channel (if any): N/A
  • Relevant config (redacted): no secrets required for public catalog proof

Steps

  1. Fetch NVIDIA's public featured-model JSON endpoint and inspect the returned status/count/first row.
  2. Run targeted NVIDIA provider tests.
  3. Run staged changed-surface validation.

Expected

  • NVIDIA's endpoint returns ranked featured model metadata.
  • Provider tests pass for live-row promotion, fallback behavior, and cache reuse.
  • Staged changed gate passes for extension/docs lanes.

Actual

  • NVIDIA endpoint returned status: 200, count: 4, first row nemotron-3-super-120b-a12b with context 262144 and max output 8192.
  • pnpm test extensions/nvidia/provider-catalog.test.ts extensions/nvidia/index.test.ts extensions/nvidia/onboard.test.ts passed: 3 files, 17 tests.
  • pnpm check:changed --staged passed for lanes extensions, extensionTests, docs.

Evidence

Attach at least one:

  • Failing test/log before + passing after
  • Trace/log snippets
  • Screenshot/recording
  • Perf numbers (if relevant)

Human Verification (required)

What you personally verified (not just CI), and how:

  • Verified scenarios: live NVIDIA endpoint shape, ranked featured-row promotion, fallback to bundled rows on HTTP failure, cache reuse, plugin augmentModelCatalog fallback and live rows, extension typecheck/lint/import-cycle changed gate.
  • Edge cases checked: malformed/unavailable endpoint path via HTTP 503 fallback; bare NVIDIA model id normalization from nemotron-... to nvidia/nemotron-...; duplicate fallback row suppression.
  • What you did not verify: full onboarding against a real NVIDIA API key.

Review Conversations

  • I replied to or resolved every bot review conversation I addressed in this PR.
  • I left unresolved only the conversations that still need reviewer or maintainer judgment.

If a bot review conversation is addressed by this PR, resolve that conversation yourself. Do not leave bot review conversation cleanup for maintainers.

Compatibility / Migration

  • Backward compatible? (Yes/No) Yes
  • Config/env changes? (Yes/No) No
  • Migration needed? (Yes/No) No
  • If yes, exact upgrade steps: N/A

Risks and Mitigations

  • Risk: NVIDIA's public catalog endpoint is temporarily unavailable, slow, or changes schema.
    • Mitigation: fetch uses a short timeout, validates the expected fields, caches successful results for 24 hours, and falls back to the bundled manifest catalog on any failure.
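
The caching part of this mitigation can be sketched as a wrapper that stores only successful results for 24 hours. `createCachedLoader` and the injectable `now` clock are hypothetical names introduced for illustration; the actual cache in this PR may be structured differently.

```typescript
const DAY_MS = 24 * 60 * 60 * 1000;

// Wrap an async loader so a successful result is reused until it expires.
// A rejected load is never stored, so the next call tries again.
function createCachedLoader<T>(
  load: () => Promise<T>,
  ttlMs: number = DAY_MS,
  now: () => number = Date.now, // injectable clock, useful in tests
): () => Promise<T> {
  let cached: { value: T; expiresAt: number } | undefined;
  return async (): Promise<T> => {
    if (cached && now() < cached.expiresAt) return cached.value;
    const value = await load(); // a rejection propagates and leaves the cache untouched
    cached = { value, expiresAt: now() + ttlMs };
    return value;
  };
}
```

Not caching failures means a temporary outage does not pin users to the bundled rows for a full day; the caller would combine this with the fallback path for the failing call itself.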

openclaw-barnacle Bot added the labels docs, extensions: nvidia, size: M, triage: needs-real-behavior-proof (Candidate: external PR needs after-fix proof from a real setup), and proof: supplied (External PR includes structured after-fix real behavior proof), and removed the label triage: needs-real-behavior-proof, on May 8, 2026

clawsweeper Bot commented May 8, 2026

Thanks for the context here. I swept through the related work, and this is now duplicate or superseded.

Close as superseded: the newer open implementation at #80775 covers the same NVIDIA featured-catalog work, credits this branch's author as co-author, and adds the HTTPS, bounded parsing, release cleanup, onboarding-test, and proof improvements missing here.

So I’m closing this here and keeping the remaining discussion on the canonical linked item.

Review details

Best possible solution:

Review and land one provider-owned NVIDIA live-catalog implementation through #80775, then close the linked feature request once that canonical path merges.

Do we have a high-confidence way to reproduce the issue?

Not applicable: this is a feature PR for live NVIDIA catalog discovery, not a broken existing behavior report. Source inspection confirms current main still uses static NVIDIA catalog rows, and both PR diffs show the intended live provider path.

Is this the best way to solve the issue?

No for this branch: the provider-owned live/static design is the right shape, but this PR is superseded by the newer open implementation with stronger network hardening, validation bounds, onboarding coverage, and sufficient real behavior proof.

Security review:

Security review needs attention: If this branch were merged, the new public NVIDIA catalog fetch would need the HTTPS and bounded-parsing hardening already present in the newer canonical PR.

  • [medium] Catalog fetch does not enforce HTTPS — extensions/nvidia/provider-catalog.ts:89
    The guarded fetch starts from an HTTPS URL but does not set requireHttps, so guarded redirects could still accept a same-host HTTP catalog response that influences promoted model rows.
    Confidence: 0.79
  • [medium] Public feed parsing is unbounded — extensions/nvidia/provider-catalog.ts:118
    The parser accepts every row and unbounded string/numeric fields from a third-party public JSON feed before exposing them as catalog models.
    Confidence: 0.76

What I checked:

Likely related people:

  • eleqtrizit: Merged PR history shows this account introduced the bundled NVIDIA provider, onboarding flow, docs, and static catalog in "feat(nvidia): add NVIDIA provider with onboarding flow" (#71204), then opened the newer canonical live-catalog PR. (role: introduced NVIDIA provider behavior and canonical follow-up owner; confidence: high; commits: 00e5dbba6b7f, 935c44619241, 90b56ad40703; files: extensions/nvidia/provider-catalog.ts, extensions/nvidia/index.ts, docs/providers/nvidia.md)
  • shakkernerd: Merged PR "fix: use static provider catalogs in models list --all" (#69909) added static provider catalog support and the buildStaticProvider seam that the live/static NVIDIA design relies on. (role: provider catalog architecture contributor; confidence: medium; commits: 18e0c3c02b7f, c931b01a22d4, 1dcce9d5b9b5; files: src/plugin-sdk/provider-entry.ts, src/plugins/provider-discovery.ts, docs/plugins/sdk-provider-plugins.md)
  • vincentkoc: Merged PR "fix(nvidia): align NIM provider metadata" (#73733) updated NVIDIA provider metadata, API-key marker, and string-content compatibility shortly before this live-catalog work. (role: recent NVIDIA catalog metadata contributor; confidence: medium; commits: 7016c069b4eb; files: extensions/nvidia/provider-catalog.ts, extensions/nvidia/provider-catalog.test.ts, extensions/nvidia/openclaw.plugin.json)

Codex review notes: model gpt-5.5, reasoning high; reviewed against dc70841f8bcb.

LambertArm force-pushed the fix/nvidia-featured-models branch from 3e4aec9 to c8ea1cf on May 8, 2026 18:04
LambertArm force-pushed the fix/nvidia-featured-models branch from c8ea1cf to 7793c56 on May 8, 2026 18:49
openclaw-barnacle Bot added the gateway (Gateway runtime) label on May 8, 2026
clawsweeper Bot closed this on May 12, 2026
LambertArm commented

@clawsweeper why close my PR?

Labels

  • docs: Improvements or additions to documentation
  • extensions: nvidia
  • gateway: Gateway runtime
  • proof: supplied: External PR includes structured after-fix real behavior proof.
  • size: M

Development

Successfully merging this pull request may close these issues.

[Feature]: NVIDIA Featured Models API integration