[Agent] Dynamic agent harness registry#4325
Conversation
Move the source of truth for AI agent harness detection from a hardcoded list to the Hub's `/api/agent-harnesses` endpoint, fetched at most once a day and cached locally. No hardcoded fallback: detection is best-effort and degrades to "no agent" when the registry can't be fetched. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update. |
…x agent test - detect_agent: use `or [...]` instead of `.get(default)` so explicit `null` values in the (cached/fetched) registry degrade gracefully instead of raising. - test_auto_resolves_to_agent: pin a registry that recognizes AI_AGENT (conftest pins an empty registry by default). - Add a test covering the malformed-registry (null values) case. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
| # No harness matched but a standard var is set => unrecognized agent. | ||
| for var in standard_vars: | ||
| if os.environ.get(var, "").strip(): | ||
| return "unknown" |
There was a problem hiding this comment.
i thought this could potentially be the name of the harness (name if name in _KNOWN_AGENTS else "unknown") but not sure if this is actually used in real life
There was a problem hiding this comment.
yes I'll keep the name if name in _KNOWN_AGENTS else "unknown" logic which preexisted this PR. I did not review yet^^
Add `Registry` / `HarnessInfo` TypedDicts and type the registry functions accordingly. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes using default effort and found 1 potential issue.
❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.
Reviewed by Cursor Bugbot for commit c087fb5. Configure here.
| if value := os.environ.get(var, "").strip().lower(): | ||
| if value in lowercased_harnesses: | ||
| return value | ||
| return "unknown" |
There was a problem hiding this comment.
Malformed registry crashes detection
Medium Severity
detect_agent assumes the cached or fetched registry is always a dict with a dict-shaped harnesses and dict-shaped envVars. JSON that is valid but wrongly typed (for example a list harnesses or non-mapping envVars) is stored and can make detect_agent raise, despite the module stating detection must never fail the process.
Additional Locations (2)
Reviewed by Cursor Bugbot for commit c087fb5. Configure here.
Co-authored-by: célina <hanouticelina@gmail.com>
|
This PR has been shipped as part of the v1.19.0 release. |


Follow-up to huggingface/huggingface.js#2209 (registry in
@huggingface/tasks) and the newGET /api/agent-harnessesendpoint (see https://github.com/huggingface-internal/moon-landing/pull/18454).Until now, agent harness detection relied on a hardcoded list in
_detect_agent.py. This moves the source of truth to the Hub so the list can be updated without a client release.With this PR:
detect_agentnow fetches the registry from/api/agent-harnesses, at most once a day, cached on disk (newAGENT_HARNESSES_PATH, same spirit asCHECK_FOR_UPDATE_DONE_PATH). The result is also cached in-process so hot paths (is_agent()for ANSI coloring / CLI output mode) don't pay repeatedly.envVarsmatching from the registry:"*"(set to any value) and exact value match.🤖 Generated with Claude Code
Note
Low Risk
Best-effort telemetry with short HTTP timeout, graceful degradation when offline or on errors, and no change to auth or data paths; main behavioral shift is no hardcoded fallback when the registry is unavailable.
Overview
Agent harness detection no longer uses a hardcoded list in
_detect_agent.py. It loads harness definitions from the HubGET /api/agent-harnesses, with a 24-hour on-disk cache atAGENT_HARNESSES_PATHand an in-process cache so repeatedis_agent()/ CLI output checks stay cheap.Matching follows the registry: per-harness
envVarswith"*"(any non-empty value) or exact values, plus standardAI_AGENT/AGENTids (case-insensitive for unknown vs known). First match wins by registry order. If fetch fails and there is no cache, detection reports no agent; errors are swallowed so detection never breaks the process. Offline mode skips the fetch.Tests pin an empty registry in
conftest, addtest_utils_detect_agent.pyfor detection and cache behavior, and adjust CLI output tests to override the registry when simulating agents.Reviewed by Cursor Bugbot for commit c087fb5. Bugbot is set up for automated code reviews on this repo. Configure here.