[CLI] Add hf models card and hf datasets card commands #4118

Merged
Wauplin merged 6 commits into huggingface:main from davanstrien:feat/cli-card-command on Apr 27, 2026

@davanstrien davanstrien commented Apr 16, 2026

Member

Summary

  • Adds hf models card <model_id> and hf datasets card <dataset_id> commands that print the repo card (README) to stdout
  • Three output modes: full card (default), --metadata (just the YAML frontmatter as JSON), --text (just the markdown body)
  • --metadata and --text are mutually exclusive
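
The mutual-exclusion rule can be sketched with the standard library's argparse. This is only an illustration under stated assumptions: the real command lives in src/huggingface_hub/cli/models.py and may enforce the check differently, and `build_parser`, `select_mode`, and the `card-sketch` program name are hypothetical.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical stand-in for the `card` subcommand's argument handling."""
    parser = argparse.ArgumentParser(prog="card-sketch")
    parser.add_argument("repo_id")
    # argparse rejects the command line if both flags in the group are given.
    group = parser.add_mutually_exclusive_group()
    group.add_argument("--metadata", action="store_true", help="only the YAML frontmatter")
    group.add_argument("--text", action="store_true", help="only the markdown body")
    return parser


def select_mode(argv: list[str]) -> str:
    """Return which of the three output modes the arguments select."""
    args = build_parser().parse_args(argv)
    if args.metadata:
        return "metadata"
    if args.text:
        return "text"
    return "full"
```

Passing both flags makes argparse print a usage error and exit, which mirrors the invalid-combination case the PR tests for.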

Motivation

hf models info gives you structured Hub metadata (downloads, tags, pipeline_tag, siblings, etc.) but not the human-authored card content. The card text is where you find the stuff that info doesn't surface: usage examples with actual code, training details, known limitations, intended use cases, benchmark results with context, and architecture descriptions. Put simply — info tells you what a model is, card tells you how to use it and why.

info does include a card_data field, but it's the raw YAML string, not parsed. --metadata returns the same data as structured JSON via out.dict(), so it works with --format and is easy to pipe into jq or consume programmatically.
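
As a rough illustration of consuming the metadata programmatically, the sketch below shells out to the command shown in this PR and parses its stdout as JSON. The helper names `parse_card_metadata` and `fetch_card_metadata` are hypothetical, and the sketch assumes the `hf` executable is on PATH and that `--metadata --format json` prints a single JSON object.

```python
import json
import subprocess


def parse_card_metadata(stdout: str) -> dict:
    """Parse the JSON object printed by `--metadata --format json`."""
    data = json.loads(stdout)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object of card metadata")
    return data


def fetch_card_metadata(model_id: str) -> dict:
    """Run the CLI (assumed to be installed as `hf`) and return parsed metadata."""
    result = subprocess.run(
        ["hf", "models", "card", model_id, "--metadata", "--format", "json"],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return parse_card_metadata(result.stdout)
```

Keeping the parsing step separate from the subprocess call makes it possible to test the former without network access.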

Agents and humans can already get card content via hf download <repo_id> README.md or curl, but that writes to a file and gives you the raw README with no way to split the YAML frontmatter from the prose. hf models card outputs directly to stdout and the --metadata/--text flags let you grab just the part you need.
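
To make the frontmatter/body split concrete, here is a minimal standard-library sketch. It is not the code from this PR, which appears to go through RepoCard.load(); `split_card` is a hypothetical helper, and it assumes the frontmatter is a non-empty block delimited by `---` lines at the very top of the README.

```python
import re

# Frontmatter: a block opened by '---' on the first line and closed by a later '---' line.
_FRONTMATTER = re.compile(r"\A---[ \t]*\r?\n(.*?)\r?\n---[ \t]*(?:\r?\n|\Z)", re.DOTALL)


def split_card(readme: str) -> tuple[str, str]:
    """Return (raw_yaml, body); raw_yaml is empty if there is no frontmatter."""
    match = _FRONTMATTER.match(readme)
    if match is None:
        return "", readme
    return match.group(1), readme[match.end():]
```

The first element roughly corresponds to what --metadata parses, and the second to what --text prints.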

For agents specifically, having a low-friction way to read model documentation helps reduce hallucination. Agents tend to default to recommending models they've memorised from training data (often outdated — e.g. still reaching for early Llama models), and fabricate usage details rather than checking the actual card. A single command that returns the real card content makes it easy for agents to look things up rather than guess. This is particularly valuable for newer models that post-date the agent's training cutoff.

For humans, it's a quick way to check a model's docs from the terminal without opening a browser — useful when comparing models or scripting.

Examples

# Full card to stdout
$ hf models card google/gemma-4-31B-it

# Just the card metadata (from the YAML frontmatter)
$ hf models card google/gemma-4-31B-it --metadata

# Card metadata as JSON
$ hf models card google/gemma-4-31B-it --metadata --format json
{"library_name": "transformers", "license": "apache-2.0", ...}

# Pretty-printed
$ hf models card google/gemma-4-31B-it --metadata --format human

# Just the text body (no YAML frontmatter)
$ hf models card google/gemma-4-31B-it --text

# Same for datasets
$ hf datasets card HuggingFaceFW/fineweb --metadata --format human

Design notes

  • --metadata not --yaml — We considered --yaml (the source format) and --frontmatter (the structural term) but went with --metadata because it describes what you're extracting rather than where it lives. It also pairs cleanly with --text — both flags describe the kind of content you want. And it avoids confusion with --format, which controls output format: --metadata --format json reads clearly as "give me the metadata, formatted as JSON".
  • No --revision support: RepoCard.load() doesn't currently pass revision to hf_hub_download. This could be added to RepoCard.load() in a follow-up and then wired through here.
  • --format is accepted even though the default and --text modes output free-form text (where --format json produces no output, same as hf papers read). We kept it because --metadata goes through out.dict() and genuinely benefits from it (e.g. --format human for pretty-printed JSON). This follows the majority CLI pattern — hf papers read is the only command that omits --format.

🤖 Generated with Claude Code


Note

Low Risk
Low risk: adds new read-only CLI subcommands that fetch and print repo card content, with minimal impact on existing command behavior.

Overview
Adds new hf models card and hf datasets card CLI subcommands to fetch a repo card (README) and print it to stdout, with --metadata (YAML frontmatter parsed to structured JSON via out.dict) or --text (markdown body only) modes and a mutual-exclusion check.

Updates the CLI docs/reference to document these new commands and adds CLI tests covering full/metadata/text outputs and invalid flag combinations.

Reviewed by Cursor Bugbot for commit 524fc2c. Bugbot is set up for automated code reviews on this repo.

Add commands to fetch model/dataset cards (README) from the Hub with
three output modes: full card (default), --metadata (YAML frontmatter
as JSON), and --text (markdown body only).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@bot-ci-comment

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@codecov

codecov Bot commented Apr 16, 2026

Codecov Report

❌ Patch coverage is 83.33333% with 5 lines in your changes missing coverage. Please review.
✅ Project coverage is 77.01%. Comparing base (1daa48b) to head (aba4878).
⚠️ Report is 272 commits behind head on main.

Files with missing lines               Patch %   Lines
src/huggingface_hub/cli/datasets.py    80.00%    3 Missing ⚠️
src/huggingface_hub/cli/models.py      86.66%    2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #4118      +/-   ##
==========================================
+ Coverage   75.00%   77.01%   +2.00%     
==========================================
  Files         145      167      +22     
  Lines       13978    18948    +4970     
==========================================
+ Hits        10484    14592    +4108     
- Misses       3494     4356     +862     

@Wauplin Wauplin left a comment

Collaborator

Thanks for the addition. Could you also add hf spaces card? It could be useful for quickly checking a Space's metadata.

Comment thread src/huggingface_hub/cli/models.py Outdated
Comment thread src/huggingface_hub/cli/datasets.py Outdated
Comment thread tests/test_cli.py
assert kwargs["sort"] == "downloads"


class TestModelsCardCommand:

Collaborator

can you replace all tests with real world ones e.g.

def test_models_card_full(self, runner: CliRunner) -> None:
    result = runner.invoke(app, ["models", "card", "Qwen/Qwen3.6-35B-A3B"])
    assert "library_name: transformers" in result.stdout
    assert "# Qwen3.6-35B-A3B" in result.stdout

?

no mocks, no need to check exit code, makes the whole test more readable IMO

Comment thread tests/test_cli.py Outdated
davanstrien and others added 2 commits April 27, 2026 13:32
Co-authored-by: Lucain <lucainp@gmail.com>
Co-authored-by: Lucain <lucainp@gmail.com>
@davanstrien davanstrien marked this pull request as draft April 27, 2026 12:36
davanstrien and others added 3 commits April 27, 2026 14:22
- Add `hf spaces card` command to complete the models/datasets/spaces trio
- Replace mocked unit tests for models/datasets card with single live tests
  using @with_production_testing (Wauplin's preferred pattern)
- Add live test for spaces card
- Document hf spaces card in CLI guide
- Regenerate package_reference/cli.md

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The previous examples used enzostvs/deepsite (consistent with hf spaces info)
but that Space has no public README, so `hf spaces card enzostvs/deepsite`
returns 404. Switch examples, docs, and the live test to mteb/leaderboard,
which has a public card. Also tighten the dataset live-test assertion
to check for the body heading rather than just "FineWeb" (which appears
in both YAML and body).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
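
The reasoning behind the tightened dataset assertion can be illustrated with a toy card. The sample text and the `# FineWeb` heading below are made up for the example and are not the real dataset README, and `has_body_heading` is a hypothetical helper: a bare substring check passes even when only the YAML is printed, whereas a heading check does not.

```python
# Toy card for illustration only; not the real HuggingFaceFW/fineweb README.
YAML_PART = "---\npretty_name: FineWeb\n---\n"
BODY_PART = "# FineWeb\n\nA hypothetical description.\n"


def has_body_heading(output: str, heading: str) -> bool:
    """True if `heading` appears as its own line, i.e. as a markdown heading."""
    return heading in output.splitlines()


yaml_only_output = YAML_PART
full_output = YAML_PART + BODY_PART

# Weak check: "FineWeb" also occurs in the YAML, so it cannot tell the outputs apart.
weak_yaml_only = "FineWeb" in yaml_only_output
weak_full = "FineWeb" in full_output

# Tighter check: only output that includes the body contains the heading line.
tight_yaml_only = has_body_heading(yaml_only_output, "# FineWeb")
tight_full = has_body_heading(full_output, "# FineWeb")
```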
@davanstrien

Member Author

cc @Wauplin Updated tests and added support for Spaces cards.

@davanstrien davanstrien marked this pull request as ready for review April 27, 2026 13:45

@Wauplin Wauplin left a comment

Collaborator

Thank you!

@Wauplin Wauplin merged commit a259ecb into huggingface:main Apr 27, 2026
12 of 16 checks passed
@huggingface-hub-bot

Contributor

This PR has been shipped as part of the v1.13.0 release.
