[CLI] Add file listing to models/datasets/spaces ls #4166
When called with a repo ID, 'hf models ls', 'hf datasets ls', and 'hf spaces ls' now list files in the corresponding repo, matching the behavior of 'hf buckets ls <bucket_id>'. Supports --tree, -R (recursive), -h (human-readable), and --revision. Shared file listing helpers are factored into _file_listing.py, and buckets.py is refactored to use them too. Co-authored-by: Lucain <Wauplin@users.noreply.github.com>
The tree_bucket fixture was patching the now-removed _format_mtime in buckets.py. Updated to patch format_date in _file_listing.py. Co-authored-by: Lucain <Wauplin@users.noreply.github.com>
Use @with_production_testing against real repos:
- t5-small (model): JSON, quiet, tree, recursive outputs
- rajpurkar/squad (dataset): JSON output
- gradio/theme_builder (space): JSON output
Co-authored-by: Lucain <Wauplin@users.noreply.github.com>
Wauplin
left a comment
✔️ (should be ready for review)
Cursor Bugbot has reviewed your changes and found 2 potential issues.
Reviewed by Cursor Bugbot for commit 25a82b2.
    if search is not None:
        raise typer.BadParameter("Cannot use --search when listing files.")
    if author is not None:
        raise typer.BadParameter("Cannot use --author when listing files.")
    if filter is not None:
        raise typer.BadParameter("Cannot use --filter when listing files.")
    if num_parameters is not None:
        raise typer.BadParameter("Cannot use --num-parameters when listing files.")
    if sort is not None:
        raise typer.BadParameter("Cannot use --sort when listing files.")
    if limit != 10:
        raise typer.BadParameter("Cannot use --limit when listing files.")
    if expand is not None:
        raise typer.BadParameter("Cannot use --expand when listing files.")
This logic is largely duplicated between models, datasets, and spaces, but I didn't find a nice way to factor it out while keeping it easy to read, so I kept the duplicated logic.
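One possible way to factor out the duplicated checks is a table-driven helper that compares each repo-listing option against its default. This is only a sketch, not the approach taken in the PR: `reject_repo_listing_options` and `MODEL_LS_DEFAULTS` are hypothetical names, and `BadParameter` is a local stand-in for `typer.BadParameter` so the snippet runs without typer installed.

```python
from typing import Any, Mapping


class BadParameter(ValueError):
    """Stand-in for typer.BadParameter (assumption: keeps the sketch dependency-free)."""


def reject_repo_listing_options(options: Mapping[str, Any], defaults: Mapping[str, Any]) -> None:
    """Raise if any repo-listing option differs from its default while listing files."""
    for name, value in options.items():
        if value != defaults.get(name):
            # Derive the CLI flag from the parameter name, e.g. num_parameters -> --num-parameters.
            flag = "--" + name.replace("_", "-")
            raise BadParameter(f"Cannot use {flag} when listing files.")


# Defaults mirror the checks in the diff: every option is None except limit (10).
MODEL_LS_DEFAULTS = {
    "search": None,
    "author": None,
    "filter": None,
    "num_parameters": None,
    "sort": None,
    "limit": 10,
    "expand": None,
}

# Example: passing --sort together with a repo ID is rejected.
try:
    reject_repo_listing_options(dict(MODEL_LS_DEFAULTS, sort="downloads"), MODEL_LS_DEFAULTS)
except BadParameter as err:
    message = str(err)
```

Like the `limit != 10` check in the diff, comparing against defaults cannot tell an explicit `--limit 10` apart from an omitted flag. Whether this reads better than the explicit `if` chain is a judgment call, which is the trade-off the comment above describes.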
    expand: ExpandOpt = None,
    human_readable: Annotated[
        bool,
        typer.Option("--human-readable", "-h", help="Show sizes in human readable format (only for listing files)."),
This already existed in buckets and I didn't notice it before, but won't -h collide with --help?
not a problem IMO (the -h for human-readable takes precedence)
This PR has been shipped as part of the v1.13.0 release.

Context: we can list the file tree of buckets but not of repos. This PR adds that support.
For reviewers: most of the code is a move from the existing code in buckets.py to a "repo/bucket-agnostic version". For tests, I've added a few for model repos covering the main use cases but only 1 for datasets/spaces (same logic anyway).
Note for a follow-up PR: when there are too many files (>1000?), IMO we should truncate the output and show a warning: "Output has been truncated. Pass --full to get full list". This would be a change for both buckets and repos, so I think it's out of scope for this PR.
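The follow-up idea could look roughly like the sketch below. It is not part of this PR: `truncate_listing` is a hypothetical helper, the 1000-entry threshold is only the tentative figure floated in the note, and `--full` is the proposed flag, not an existing one.

```python
from typing import Iterable, List, Tuple

MAX_ENTRIES = 1000  # assumption: the note only suggests ">1000?" as a possible threshold
TRUNCATION_WARNING = "Output has been truncated. Pass --full to get full list"


def truncate_listing(
    paths: Iterable[str], full: bool = False, max_entries: int = MAX_ENTRIES
) -> Tuple[List[str], bool]:
    """Return (paths to print, whether the output was truncated)."""
    if full:
        return list(paths), False
    kept: List[str] = []
    for path in paths:
        if len(kept) == max_entries:
            # At least one more entry exists beyond the limit, so report truncation.
            return kept, True
        kept.append(path)
    return kept, False


# Example with a small limit: five files, only three are kept.
kept, truncated = truncate_listing((f"file_{i}.txt" for i in range(5)), max_entries=3)
if truncated:
    warning = TRUNCATION_WARNING
```

Because the helper stops as soon as it sees one entry past the limit, it works with lazy iterators and does not need to fetch the full listing to decide whether to warn.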
Summary
hf models ls, hf datasets ls, and hf spaces ls can now list files from an individual repo when called with a repo ID, matching the existing behavior of hf buckets ls <bucket_id>. When a positional repo_id argument is given, the ls command switches from "list repos" to "list files in that repo" mode.
Supports the same options as bucket file listing: --tree, -R (recursive), -h (human-readable sizes), plus --revision.
Examples:
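As a rough illustration of what `--tree` combined with `-h` implies, the sketch below renders a flat list of (path, size) pairs as an indented tree with human-readable sizes. `human_size` and `render_tree` are hypothetical names, not the helpers in `_file_listing.py`, and the file names and sizes are made up; the CLI's actual output format and unit convention may differ.

```python
from typing import Any, Dict, Iterable, List, Tuple


def human_size(num_bytes: int) -> str:
    """Format a byte count with decimal (1000-based) units (assumed convention)."""
    size = float(num_bytes)
    for unit in ("B", "KB", "MB", "GB"):
        if size < 1000:
            return f"{size:.0f} {unit}" if unit == "B" else f"{size:.1f} {unit}"
        size /= 1000
    return f"{size:.1f} TB"


def render_tree(files: Iterable[Tuple[str, int]], human_readable: bool = False) -> List[str]:
    """Render (path, size) pairs as indented lines, sorted by name at each level."""
    root: Dict[str, Any] = {}
    for path, size in files:
        node = root
        parts = path.split("/")
        for part in parts[:-1]:
            node = node.setdefault(part, {})  # nested dicts represent directories
        node[parts[-1]] = size  # leaves store the file size in bytes

    lines: List[str] = []

    def walk(node: Dict[str, Any], depth: int) -> None:
        indent = "  " * depth
        for name in sorted(node):
            child = node[name]
            if isinstance(child, dict):
                lines.append(f"{indent}{name}/")
                walk(child, depth + 1)
            else:
                size = human_size(child) if human_readable else str(child)
                lines.append(f"{indent}{name}  ({size})")

    walk(root, 0)
    return lines


# Made-up file list for illustration only.
example = [("config.json", 1200), ("onnx/model.onnx", 2_500_000), ("README.md", 512)]
tree_lines = render_tree(example, human_readable=True)
```

With `human_readable=False` the same call shows raw byte counts, which corresponds to what omitting `-h` would mean in this sketch.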