
Add agent harnesses registry to @huggingface/tasks #2209

Merged
Wauplin merged 4 commits into main from add-agent-harnesses-registry on Jun 3, 2026

@Wauplin Wauplin commented Jun 2, 2026

Contributor

Agentic use of the Hub is growing, and we'd like to make it visible. When huggingface_hub detects that it is being run by an agent, it reports this in the user agent of every HTTP request. At the moment, detection relies on a hardcoded list inside huggingface_hub. This PR moves the source of truth to @huggingface/tasks, mirroring how we already register local apps and model and dataset libraries.

The goal is to be able to update the agent harness list without requiring a client-side update.

On the client side, the detection process will be:

  • iterate over AGENT_HARNESSES
  • for each entry, check whether any of its env vars is set and follows the pattern (e.g. envVars: { ANTIGRAVITY_AGENT: "*" })
    • if envVars is not set or doesn't match, check the AGENT and AI_AGENT env variables for the exact harness id
    • => return on first match

If no entry matches but one of AGENT/AI_AGENT is set, the harness is set to unknown.
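
The steps above could be sketched roughly as follows. This is only an illustration under assumptions, not the huggingface_hub implementation: `HarnessEntry`, `SAMPLE_HARNESSES`, `matches`, `detectHarness` and the `antigravity` id are hypothetical names, and the environment is passed in as a plain object so the sketch does not depend on any runtime.

```typescript
// Hypothetical shapes for illustration; the real definitions live in
// packages/tasks/src/agent-harnesses.ts and may differ.
type Env = Record<string, string | undefined>;

interface HarnessEntry {
	// Env var name -> value pattern ("*" means "set to any non-empty value").
	envVars?: Record<string, string>;
}

const STANDARD_AGENT_ENV_VARS = ["AI_AGENT", "AGENT"] as const;

// Tiny sample registry. ANTIGRAVITY_AGENT and devin come from the PR text;
// the "antigravity" id is an assumption.
const SAMPLE_HARNESSES: Record<string, HarnessEntry> = {
	antigravity: { envVars: { ANTIGRAVITY_AGENT: "*" } },
	devin: {}, // no envVars: only matched through AGENT / AI_AGENT
};

// Simplified matcher: wildcard or exact value.
function matches(value: string | undefined, pattern: string): boolean {
	if (!value) return false;
	return pattern === "*" || value === pattern;
}

function detectHarness(
	env: Env,
	harnesses: Record<string, HarnessEntry> = SAMPLE_HARNESSES
): string | undefined {
	for (const [id, harness] of Object.entries(harnesses)) {
		// 1. Harness-specific env vars.
		const byEnvVars = Object.entries(harness.envVars ?? {}).some(([name, pattern]) =>
			matches(env[name], pattern)
		);
		// 2. Otherwise, the standard env vars must hold the exact harness id.
		const byStandard = STANDARD_AGENT_ENV_VARS.some((name) => env[name] === id);
		if (byEnvVars || byStandard) return id; // first match wins
	}
	// No entry matched: fall back to "unknown" if a standard env var is set.
	return STANDARD_AGENT_ENV_VARS.some((name) => Boolean(env[name])) ? "unknown" : undefined;
}
```

For example, `detectHarness({ AGENT: "devin" })` would resolve to `devin` through the standard env vars, while `detectHarness({ AI_AGENT: "something-else" })` would fall through to `unknown`.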

cc @davanstrien @hanouticelina @julien-c with whom we discussed that recently


Note: for now, the docs/repo URLs, pretty name and description are not used, but the plan is to build a lightweight leaderboard with the collected data. Since we will ask the community to register new harnesses themselves, it's best to require all the information right now.

Warning

Roo Code and Gemini CLI detection have been removed. Roo Code is now an archived repo on GitHub (the project has been stopped) and Gemini CLI is deprecated in favor of Antigravity.


Note

Low Risk
Adds static registry metadata and public exports only; no runtime Hub or auth behavior changes in this repo.

Overview
Centralizes AI agent / harness detection metadata in @huggingface/tasks so clients like huggingface_hub can consume a shared registry instead of a hardcoded list.

Adds packages/tasks/src/agent-harnesses.ts with an AgentHarness type, STANDARD_AGENT_ENV_VARS (AI_AGENT, AGENT), and AGENT_HARNESSES — a keyed catalog of known tools with optional envVars patterns (*, exact match, or prefix match) and metadata for a future leaderboard. Order in the object matters for first-match detection (e.g. cowork before claude-code, cursor-cli before cursor). devin is listed without envVars (relies on standard agent env vars only).

Re-exports AGENT_HARNESSES, STANDARD_AGENT_ENV_VARS, and types from packages/tasks/src/index.ts. Adds CODEOWNERS for the new file.
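
As a rough picture of what such a registry might look like, here is a minimal sketch. Only `envVars`, `docsUrl`, `description`, the `devin` entry and the `satisfies Record<string, AgentHarness>` idiom appear in this PR; `prettyName`, `repoUrl`, `my-harness`, `MY_HARNESS_ACTIVE` and `AgentHarnessKey` are assumptions made for illustration, and the real file may use different field names.

```typescript
// Assumed shape of a registry entry (field names partly hypothetical).
interface AgentHarness {
	prettyName: string; // assumed name for the "pretty name" metadata
	repoUrl?: string; // assumed name for the repo URL metadata
	docsUrl: string;
	description: string;
	// Env var name -> value pattern; omitted when only AGENT / AI_AGENT is used.
	envVars?: Record<string, string>;
}

const STANDARD_AGENT_ENV_VARS = ["AI_AGENT", "AGENT"] as const;

// Insertion order matters because detection returns on the first match.
const AGENT_HARNESSES = {
	"my-harness": {
		// Hypothetical entry, not part of the real registry.
		prettyName: "My Harness",
		docsUrl: "https://example.com/docs",
		description: "Hypothetical harness used only for illustration.",
		envVars: { MY_HARNESS_ACTIVE: "*" },
	},
	devin: {
		prettyName: "Devin",
		docsUrl: "https://devin.ai",
		description: "Autonomous AI software engineer from Cognition.",
	},
} satisfies Record<string, AgentHarness>;

// `satisfies` keeps the literal keys, so a union of known ids can be derived.
type AgentHarnessKey = keyof typeof AGENT_HARNESSES;
const ids = Object.keys(AGENT_HARNESSES) as AgentHarnessKey[];
```

Using `satisfies` rather than a type annotation checks each entry against `AgentHarness` while preserving the exact set of keys, which is what makes the derived key type possible.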

Reviewed by Cursor Bugbot for commit a49be10. Bugbot is set up for automated code reviews on this repo.

Single mapping of AI coding agents/harnesses known to use the Hub, so that
new harnesses can be registered here rather than hardcoded in huggingface_hub.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
docsUrl: "https://devin.ai",
description: "Autonomous AI software engineer from Cognition.",
},
} satisfies Record<string, AgentHarness>;

Registry missing roo-code and gemini harness entries

Medium Severity

The AGENT_HARNESSES registry is missing entries for roo-code (detected via ROO_ACTIVE) and gemini (detected via GEMINI_CLI), both of which are present in the upstream Python _detect_agent.py source. The PR describes itself as a "faithful port" and "behavior-preserving move rather than a change in coverage," and it includes the pi entry that was added after roo-code and gemini — so these omissions appear unintentional. When the Python client switches to reading from this registry as its source of truth, requests from Roo Code and Gemini CLI users will stop being attributed.

Reviewed by Cursor Bugbot for commit 950c1ea.

Contributor Author

yes it's on purpose (archived and/or deprecated projects)

@Wauplin Wauplin marked this pull request as ready for review June 2, 2026 12:36
@davanstrien

Member

thanks! Won't comment on the ts code but happy to add some docs for updating this in hub-docs. It's probably fairly obvious but might be worth a small note somewhere IMO.

Wauplin commented Jun 2, 2026

Contributor Author

> Won't comment on the ts code but happy to add some docs for updating this in hub-docs. It's probably fairly obvious but might be worth a small note somewhere IMO.

Yes I'm actually working on that part as well. Will ping you on the PR soon :)

@cursor cursor Bot left a comment

Cursor Bugbot has reviewed your changes using default effort and found 1 potential issue.

There are 2 total unresolved issues (including 1 from previous review).

Reviewed by Cursor Bugbot for commit 61b2af9.

Comment thread packages/tasks/src/agent-harnesses.ts

@davanstrien davanstrien left a comment

Member

thanks! excited for this!

When the Cursor CLI runs inside the Cursor editor's integrated terminal,
child processes inherit CURSOR_TRACE_ID and the CLI also sets CURSOR_AGENT.
With first-match-wins detection, the more specific cursor-cli must be
checked before cursor so it isn't masked by the editor signal.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
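
To illustrate why the ordering in this commit matters, here is a small sketch of first-match-wins detection over the two Cursor entries. It assumes, as the commit message suggests, that `cursor` is detected through CURSOR_TRACE_ID and `cursor-cli` through CURSOR_AGENT; `firstMatch` and the sample values are hypothetical.

```typescript
type Env = Record<string, string | undefined>;

// Returns the id of the first [id, envVar] pair whose env var is set
// to a non-empty value.
function firstMatch(order: Array<[string, string]>, env: Env): string | undefined {
	return order.find(([, envVar]) => Boolean(env[envVar]))?.[0];
}

// Environment of a Cursor CLI process started from the editor's integrated
// terminal: the editor signal is inherited and the CLI adds its own.
const env: Env = { CURSOR_TRACE_ID: "abc", CURSOR_AGENT: "1" };

// More specific entry first: the CLI is reported.
const specificFirst = firstMatch(
	[
		["cursor-cli", "CURSOR_AGENT"],
		["cursor", "CURSOR_TRACE_ID"],
	],
	env
);

// Generic entry first: the editor signal masks the CLI.
const genericFirst = firstMatch(
	[
		["cursor", "CURSOR_TRACE_ID"],
		["cursor-cli", "CURSOR_AGENT"],
	],
	env
);
```

With both variables set, `specificFirst` resolves to `cursor-cli` while `genericFirst` resolves to `cursor`, which is the masking this commit avoids.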
Wauplin commented Jun 2, 2026

Contributor Author

Docs-related PR: huggingface/hub-docs#2521 (to be merged after this one)

@hanouticelina hanouticelina left a comment

Contributor

Looks good to me, thank you!

@Wauplin Wauplin merged commit 037a6a9 into main Jun 3, 2026
7 checks passed
@Wauplin Wauplin deleted the add-agent-harnesses-registry branch June 3, 2026 11:14

@julien-c julien-c left a comment

Member

late to the party, but very cool!

* The value pattern is one of:
* - `"*"`: the variable is set to any (non-empty) value
* - `"<value>"`: the variable equals this exact value
* - `"<prefix>*"`: the variable value starts with `<prefix>` (fuzzy match, resolved client-side)

Member

what's the use case for this last case?

Contributor Author

none yet, I'll remove it
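
For reference, the three value patterns listed in the doc comment above could be matched roughly as follows. This is a sketch only: `matchesPattern` is a hypothetical helper, not a function from @huggingface/tasks or huggingface_hub, and the prefix branch is shown just because the comment lists it, even though the reply above says it has no use case yet.

```typescript
// Checks an env var value against a registry pattern:
//   "*"         -> any non-empty value
//   "<prefix>*" -> value starts with <prefix>
//   "<value>"   -> exact match
function matchesPattern(value: string | undefined, pattern: string): boolean {
	// An unset or empty variable never matches.
	if (value === undefined || value === "") return false;
	if (pattern === "*") return true;
	if (pattern.endsWith("*")) return value.startsWith(pattern.slice(0, -1));
	return value === pattern;
}
```

Dropping the prefix form would reduce this to the wildcard check and the exact comparison.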
