[CLI] Expose linked repos in PaperInfo#4240
Adds linked_models, num_total_models, linked_datasets, num_total_datasets, and linked_spaces to PaperInfo. The Hub's GET /api/papers/:paperId now returns these fields by default (huggingface-internal/moon-landing#18235).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The failing Verified locally by stashing this PR's diff and running |
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update. |
| project_page: str | None | ||
| github_repo: str | None | ||
| github_stars: int | None | ||
| linked_models: list[dict] | None |
linked_models must be a list[ModelInfo] which is the dataclass returned by model_info / list_models. Can you update the type + docstring + initialization + test + PR description example accordingly? Thanks!
(same for datasets and Spaces)
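The parsing requested in this comment could look roughly like the sketch below. It uses only the standard library: `RepoStub` and `PaperInfoSketch` are hypothetical stand-ins for `ModelInfo` / `DatasetInfo` / `SpaceInfo` and `PaperInfo`, not the library's actual classes, and the assumption that each info class accepts the raw API dict as keyword arguments is illustrative.

```python
from typing import Optional


class RepoStub:
    """Hypothetical stand-in for ModelInfo / DatasetInfo / SpaceInfo."""

    def __init__(self, **kwargs) -> None:
        self.id: str = kwargs.pop("id")
        # Keep any remaining API keys around instead of dropping them.
        self.extra: dict = kwargs


def _parse_repos(raw: Optional[list]) -> Optional[list]:
    """Turn a list of raw dicts into typed objects; None stays None."""
    if raw is None:
        return None
    return [RepoStub(**item) for item in raw]


class PaperInfoSketch:
    """Hypothetical stand-in for PaperInfo; only linked-repo fields are modeled."""

    def __init__(self, **kwargs) -> None:
        self.id: str = kwargs.pop("id")
        # The API response uses camelCase keys; attributes use snake_case.
        # A missing key leaves the attribute as None.
        self.linked_models = _parse_repos(kwargs.pop("linkedModels", None))
        self.num_total_models = kwargs.pop("numTotalModels", None)
        self.linked_datasets = _parse_repos(kwargs.pop("linkedDatasets", None))
        self.num_total_datasets = kwargs.pop("numTotalDatasets", None)
        self.linked_spaces = _parse_repos(kwargs.pop("linkedSpaces", None))


paper = PaperInfoSketch(
    id="2502.08025",
    linkedModels=[{"id": "user/test-model", "likes": 3}],
    numTotalModels=1,
)
print(paper.linked_models[0].id, paper.linked_datasets)
```

Keeping `None` distinct from an empty list lets callers tell "field absent from the response" apart from "no linked repos".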
| class PaperInfoParseTest(unittest.TestCase): | ||
| def test_paper_info_parses_linked_repos(self) -> None: | ||
| from huggingface_hub.hf_api import PaperInfo |
Imports must be at module level.
| paper = PaperInfo( | ||
| id="2502.08025", | ||
| linkedModels=[{"id": "user/test-model"}], | ||
| numTotalModels=1, | ||
| linkedDatasets=[{"id": "user/test-dataset"}], | ||
| numTotalDatasets=1, | ||
| linkedSpaces=[{"id": "user/test-space"}], | ||
| ) | ||
| assert paper.linked_models == [{"id": "user/test-model"}] |
Here we are only testing that the dataclass works as a dataclass. We should test the paper_info endpoint instead, i.e. make a real paper_info("2601.15621") call against the production endpoint and check the returned values.
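One way to keep such a production test stable is to assert on structure rather than on exact counts, since download numbers and totals change over time. The sketch below is illustrative only: `check_linked_repos` is a hypothetical helper, the fake object stands in for a real response so the snippet runs offline, and the rule that each total is at least the number of repos returned inline is an assumption, not something the API is confirmed to guarantee.

```python
from types import SimpleNamespace

LINKED_ATTRS = ("linked_models", "linked_datasets", "linked_spaces")


def check_linked_repos(paper) -> None:
    """Hypothetical structural checks for a paper that has linked repos."""
    for attr in LINKED_ATTRS:
        repos = getattr(paper, attr)
        assert repos, f"{attr} should be a non-empty list"
        for repo in repos:
            assert isinstance(repo.id, str) and repo.id
    # Assumption: totals count at least the repos returned inline.
    assert paper.num_total_models >= len(paper.linked_models)
    assert paper.num_total_datasets >= len(paper.linked_datasets)


# Offline stand-in shaped like the fields described in this PR.
fake_paper = SimpleNamespace(
    id="2601.15621",
    linked_models=[SimpleNamespace(id="Qwen/Qwen3-TTS-12Hz-1.7B-Base")],
    num_total_models=261,
    linked_datasets=[SimpleNamespace(id="Izzyzlin/CFSDD")],
    num_total_datasets=1,
    linked_spaces=[SimpleNamespace(id="HuggingFaceM4/faster-qwen3-tts-demo")],
)
check_linked_repos(fake_paper)
```

In an actual test, the same helper could be given the object returned by `HfApi().paper_info("2601.15621")`, which is the call this comment asks for.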
| def test_paper_info_linked_repos_default_to_none(self) -> None: | ||
| from huggingface_hub.hf_api import PaperInfo | ||
| paper = PaperInfo(id="2502.08025") | ||
| assert paper.linked_models is None | ||
| assert paper.num_total_models is None | ||
| assert paper.linked_datasets is None | ||
| assert paper.num_total_datasets is None | ||
| assert paper.linked_spaces is None |
- Type linked_models/linked_datasets/linked_spaces as list[ModelInfo] /
list[DatasetInfo] / list[SpaceInfo] and parse them in PaperInfo.__init__
- Replace dataclass unit tests with a real paper_info('2601.15621') call
on the production endpoint
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
This PR has been shipped as part of the v1.16.0 release. |
Summary
related to #3952
GET /api/papers/{id} now returns linked repos by default. This PR surfaces them on PaperInfo:
- linked_models: list[dict] | None
- num_total_models: int | None
- linked_datasets: list[dict] | None
- num_total_datasets: int | None
- linked_spaces: list[dict] | None
Example
returns (truncated):
{
  "id": "2601.15621",
  "linkedModels": [
    { "id": "Qwen/Qwen3-TTS-12Hz-1.7B-CustomVoice", "downloads": 1556731, "likes": 1491, ... },
    { "id": "Qwen/Qwen3-TTS-12Hz-1.7B-Base", "downloads": 1954426, "likes": 393, ... }
  ],
  "numTotalModels": 261,
  "linkedDatasets": [
    { "id": "Izzyzlin/CFSDD", "downloads": 667, ... }
  ],
  "numTotalDatasets": 1,
  "linkedSpaces": [
    { "id": "HuggingFaceM4/faster-qwen3-tts-demo", "emoji": "🎙", "running": true, "featured": true }
  ]
}
After this PR:
Test plan
- PaperInfoParseTest in tests/test_hf_api.py covering the new fields (both populated and default-None cases).
Note
Low Risk
Low risk: adds new optional fields to PaperInfo and parsing logic for additional API response keys, plus a focused regression test.
Overview
PaperInfo now surfaces linked Hub repos returned by GET /api/papers/{id} by adding optional linked_models / linked_datasets / linked_spaces lists (parsed into ModelInfo / DatasetInfo / SpaceInfo) and accompanying num_total_models / num_total_datasets counters.
Adds a production test ensuring paper_info populates these new fields when present in the API response.
Reviewed by Cursor Bugbot for commit 93d4088.