[CLI] Expose linked repos in PaperInfo #4240

Merged
Wauplin merged 2 commits into main from expose-paper-linked-repos on May 20, 2026

@mishig25 mishig25 commented May 20, 2026

Contributor

Summary

related to #3952

GET /api/papers/{id} now returns linked repos by default. This PR surfaces them on PaperInfo:

  • linked_models: list[ModelInfo] | None
  • num_total_models: int | None
  • linked_datasets: list[DatasetInfo] | None
  • num_total_datasets: int | None
  • linked_spaces: list[SpaceInfo] | None

Example

curl -s https://huggingface.co/api/papers/2601.15621

returns (truncated):

{
  "id": "2601.15621",
  "linkedModels": [
    { "id": "Qwen/Qwen3-TTS-12Hz-1.7B-CustomVoice", "downloads": 1556731, "likes": 1491, ... },
    { "id": "Qwen/Qwen3-TTS-12Hz-1.7B-Base", "downloads": 1954426, "likes": 393, ... }
  ],
  "numTotalModels": 261,
  "linkedDatasets": [
    { "id": "Izzyzlin/CFSDD", "downloads": 667, ... }
  ],
  "numTotalDatasets": 1,
  "linkedSpaces": [
    { "id": "HuggingFaceM4/faster-qwen3-tts-demo", "emoji": "🎙", "running": true, "featured": true }
  ]
}

After this PR:

>>> from huggingface_hub import HfApi
>>> paper = HfApi().paper_info("2601.15621")
>>> paper.num_total_models
261
>>> paper.linked_models[0].id
'Qwen/Qwen3-TTS-12Hz-1.7B-CustomVoice'
>>> paper.linked_spaces[0].id
'HuggingFaceM4/faster-qwen3-tts-demo'

$ hf papers info 2601.15621 | jq ".linked_models[].id"
"Qwen/Qwen3-TTS-12Hz-1.7B-CustomVoice"
"Qwen/Qwen3-TTS-12Hz-1.7B-Base"
"Qwen/Qwen3-TTS-12Hz-1.7B-VoiceDesign"
"Qwen/Qwen3-TTS-12Hz-0.6B-Base"
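Because every new field is optional, callers may want to guard against None before iterating. The following is a minimal sketch that assumes only the attribute names listed in this PR. The helper name `summarize_linked_repos` is hypothetical and not part of huggingface_hub, and the stand-in object replaces a real `HfApi().paper_info(...)` result so the snippet runs offline.

```python
from types import SimpleNamespace


def summarize_linked_repos(paper) -> dict:
    """Collect linked repo ids from a PaperInfo-like object, treating None as empty."""
    return {
        "models": [m.id for m in (paper.linked_models or [])],
        "datasets": [d.id for d in (paper.linked_datasets or [])],
        "spaces": [s.id for s in (paper.linked_spaces or [])],
        # Totals are reported separately from the lists, so keep them as-is.
        "num_total_models": paper.num_total_models,
        "num_total_datasets": paper.num_total_datasets,
    }


# Hypothetical stand-in for the object returned by HfApi().paper_info("2601.15621").
fake_paper = SimpleNamespace(
    linked_models=[SimpleNamespace(id="Qwen/Qwen3-TTS-12Hz-1.7B-CustomVoice")],
    num_total_models=261,
    linked_datasets=None,
    num_total_datasets=None,
    linked_spaces=[SimpleNamespace(id="HuggingFaceM4/faster-qwen3-tts-demo")],
)
summary = summarize_linked_repos(fake_paper)
```

In the example above the list of linked models is shorter than num_total_models (261), which is why the sketch keeps the totals separate rather than deriving them from the list lengths.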

Test plan

  • Added PaperInfoParseTest in tests/test_hf_api.py covering the new fields (both populated and default-None cases).

Note

Low Risk
Low risk: adds new optional fields to PaperInfo and parsing logic for additional API response keys, plus a focused regression test.

Overview
PaperInfo now surfaces linked Hub repos returned by GET /api/papers/{id} by adding optional linked_models/linked_datasets/linked_spaces lists (parsed into ModelInfo/DatasetInfo/SpaceInfo) and accompanying num_total_models/num_total_datasets counters.

Adds a production test ensuring paper_info populates these new fields when present in the API response.

Reviewed by Cursor Bugbot for commit 93d4088.
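The parsing described in the overview can be illustrated roughly as follows. This is a simplified sketch, not the code merged in this PR: `RepoStub` and `PaperStub` are hypothetical stand-ins for ModelInfo/DatasetInfo/SpaceInfo and PaperInfo, and only the camelCase key names are taken from the API response shown earlier.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class RepoStub:
    """Hypothetical stand-in for ModelInfo / DatasetInfo / SpaceInfo."""

    id: str


def _parse_repos(raw: Optional[list]) -> Optional[list]:
    # Keep None when the key is absent so callers can tell "not returned" from "empty".
    if raw is None:
        return None
    return [RepoStub(id=item["id"]) for item in raw]


class PaperStub:
    """Hypothetical, simplified analogue of PaperInfo."""

    def __init__(self, **kwargs: Any) -> None:
        self.id = kwargs.pop("id")
        # camelCase keys from the API response map to snake_case attributes.
        self.linked_models = _parse_repos(kwargs.pop("linkedModels", None))
        self.num_total_models = kwargs.pop("numTotalModels", None)
        self.linked_datasets = _parse_repos(kwargs.pop("linkedDatasets", None))
        self.num_total_datasets = kwargs.pop("numTotalDatasets", None)
        self.linked_spaces = _parse_repos(kwargs.pop("linkedSpaces", None))


paper = PaperStub(
    id="2601.15621",
    linkedModels=[{"id": "Qwen/Qwen3-TTS-12Hz-1.7B-Base"}],
    numTotalModels=261,
)
```

Defaulting each key to None keeps the fields optional, so a response that omits the linked repos still parses.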

Adds linked_models, num_total_models, linked_datasets, num_total_datasets,
and linked_spaces to PaperInfo. The Hub's GET /api/papers/:paperId now
returns these fields by default (huggingface-internal/moon-landing#18235).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@mishig25

Contributor Author

The failing check_code_quality job is unrelated to this PR — it's ty check src reporting 3 pre-existing errors in src/huggingface_hub/_webhooks_server.py (lines 38, 91, 304). The Python quality workflow is currently failing on main with the same errors.

Verified locally by stashing this PR's diff and running ty check src on clean main — same 3 errors.


Comment thread src/huggingface_hub/hf_api.py Outdated
project_page: str | None
github_repo: str | None
github_stars: int | None
linked_models: list[dict] | None

Collaborator

linked_models must be a list[ModelInfo] which is the dataclass returned by model_info / list_models. Can you update the type + docstring + initialization + test + PR description example accordingly? Thanks!

(same for datasets and Spaces)

Contributor Author

handled in 93d4088

Comment thread tests/test_hf_api.py Outdated

class PaperInfoParseTest(unittest.TestCase):
    def test_paper_info_parses_linked_repos(self) -> None:
        from huggingface_hub.hf_api import PaperInfo

Collaborator

Imports must be at module level.

Contributor Author

handled in 93d4088

Comment thread tests/test_hf_api.py Outdated
Comment on lines +4541 to +4549
        paper = PaperInfo(
            id="2502.08025",
            linkedModels=[{"id": "user/test-model"}],
            numTotalModels=1,
            linkedDatasets=[{"id": "user/test-dataset"}],
            numTotalDatasets=1,
            linkedSpaces=[{"id": "user/test-space"}],
        )
        assert paper.linked_models == [{"id": "user/test-model"}]

Collaborator

Here we are just testing that the dataclass works as a dataclass. We should test the paper_info endpoint instead, i.e. make a real paper_info("2601.15621") call on the production endpoint and check the returned values.

Contributor Author

handled in 93d4088

Comment thread tests/test_hf_api.py Outdated
Comment on lines +4555 to +4563
    def test_paper_info_linked_repos_default_to_none(self) -> None:
        from huggingface_hub.hf_api import PaperInfo

        paper = PaperInfo(id="2502.08025")
        assert paper.linked_models is None
        assert paper.num_total_models is None
        assert paper.linked_datasets is None
        assert paper.num_total_datasets is None
        assert paper.linked_spaces is None

Collaborator

no need for this test

Contributor Author

handled in 93d4088

- Type linked_models/linked_datasets/linked_spaces as list[ModelInfo] /
  list[DatasetInfo] / list[SpaceInfo] and parse them in PaperInfo.__init__
- Replace dataclass unit tests with a real paper_info('2601.15621') call
  on the production endpoint

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
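The kind of checks such an endpoint-level test might make can be sketched as below. This is an illustration only, not the test added in 93d4088: the helper name `check_linked_repos` is hypothetical, and the offline stub stands in for the result of a real `HfApi().paper_info("2601.15621")` call so the snippet runs without network access.

```python
from types import SimpleNamespace


def check_linked_repos(paper) -> None:
    """Assertions a production test could run on a paper_info() result."""
    # Each linked list should be populated for a paper known to have linked repos.
    for field in ("linked_models", "linked_datasets", "linked_spaces"):
        repos = getattr(paper, field)
        assert repos, f"{field} should be a non-empty list"
        assert all(isinstance(repo.id, str) for repo in repos)
    # Totals are plain integers reported alongside the lists.
    assert isinstance(paper.num_total_models, int) and paper.num_total_models > 0
    assert isinstance(paper.num_total_datasets, int) and paper.num_total_datasets > 0


# Offline stand-in; a real test would pass HfApi().paper_info("2601.15621") instead.
stub = SimpleNamespace(
    linked_models=[SimpleNamespace(id="Qwen/Qwen3-TTS-12Hz-1.7B-Base")],
    num_total_models=261,
    linked_datasets=[SimpleNamespace(id="Izzyzlin/CFSDD")],
    num_total_datasets=1,
    linked_spaces=[SimpleNamespace(id="HuggingFaceM4/faster-qwen3-tts-demo")],
)
check_linked_repos(stub)
```

Asserting on structure (non-empty lists, integer totals) rather than exact counts keeps such a test stable as download numbers and linked repos change on the Hub.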

@Wauplin Wauplin left a comment

Collaborator

Thank you!

(also tested in the CLI:

$ hf papers info 2601.15621 | jq ".linked_models[].id"
"Qwen/Qwen3-TTS-12Hz-1.7B-CustomVoice"
"Qwen/Qwen3-TTS-12Hz-1.7B-Base"
"Qwen/Qwen3-TTS-12Hz-1.7B-VoiceDesign"
"Qwen/Qwen3-TTS-12Hz-0.6B-Base"

)

@Wauplin Wauplin merged commit a117e68 into main May 20, 2026
20 of 21 checks passed
@Wauplin Wauplin deleted the expose-paper-linked-repos branch May 20, 2026 09:52
@huggingface-hub-bot

This PR has been shipped as part of the v1.16.0 release.
