[CLI] Add unified hf cp command (aliased as hf repos cp and hf buckets cp)#4295

Merged
Wauplin merged 5 commits into main from cli-unified-cp on Jun 1, 2026

@Wauplin Wauplin commented May 29, 2026

Collaborator

This PR adds a single, unified hf cp command to copy files/folders between local and remote, or between remote locations. The command is exposed as hf cp, hf buckets cp and hf repos cp, all with the exact same behavior. EDIT: a guardrail now prevents using hf buckets cp on repos and vice versa (see the comments below, #4295 (comment)).

Supported

  • local -> remote ✔️ (upload single file)
  • remote -> local ✔️ (download single file)
  • repo -> repo ✔️
  • repo -> bucket ✔️
  • bucket -> repo ❌
  • bucket -> bucket ✔️

The local side can be a local file or stdin/stdout. To sync, upload or download folders, use the more feature-complete hf sync, hf upload and hf download commands.
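
The support matrix above can be read as a small dispatch table. The following is only an illustrative sketch, not the code from cli/_cp.py: classify and plan_copy are hypothetical names, and the prefix checks assume that bucket URIs start with hf://buckets/ and that any other hf:// URI is a repo, as in the examples in this PR. The local → local rejection is taken from the automated summary further down.

```python
from typing import Literal

Kind = Literal["local", "repo", "bucket"]

HF_PREFIX = "hf://"
BUCKET_PREFIX = "hf://buckets/"


def classify(endpoint: str) -> Kind:
    """Classify an endpoint string (hypothetical helper, not the library's parser)."""
    if endpoint.startswith(BUCKET_PREFIX):
        return "bucket"
    if endpoint.startswith(HF_PREFIX):
        return "repo"
    # Anything else is treated as local, including "-" for stdin/stdout.
    return "local"


def plan_copy(src: str, dst: str) -> str:
    """Return 'upload', 'download' or 'copy', or raise ValueError if unsupported."""
    src_kind, dst_kind = classify(src), classify(dst)
    if src_kind == "local" and dst_kind == "local":
        raise ValueError("local -> local is not supported")
    if src_kind == "local":
        return "upload"
    if dst_kind == "local":
        return "download"
    if src_kind == "bucket" and dst_kind == "repo":
        raise ValueError("bucket -> repo is not supported")
    # repo -> repo, repo -> bucket and bucket -> bucket are server-side copies.
    return "copy"
```

For instance, plan_copy("hf://user/repo/a.json", "-") is classified as a download to stdout, while a bucket source with a repo destination raises before any network call would be made.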

Tests

Added tests/test_copy_files.py and moved all copy-related tests there: copy_files and CommitOperationCopy tests from test_hf_api.py, the CommitOperationCopy / _resolve_copy_target_path tests from test_commit_api.py, and the cp CLI tests from test_buckets_cli.py. Added new coverage for the repo legs (upload, download, stdout, @revision, repo→repo) and the aliases.

⚠️ ^ this was a design choice from me to keep all copy logic tests in a single location

Examples

Uploading a local file to a repo or a bucket:

$ hf cp ./config.json hf://Wauplin/tmp-for-test/demo/config.json
✓ Uploaded
  src: ./config.json
  dst: hf://models/Wauplin/tmp-for-test/demo/config.json

$ hf buckets cp ./config.json hf://buckets/Wauplin/tmp-for-test/demo/
✓ Uploaded
  src: ./config.json
  dst: hf://buckets/Wauplin/tmp-for-test/demo/config.json

Downloading, including to a directory or to stdout:

$ hf cp hf://Wauplin/tmp-for-test/demo/config.json ./downloaded/
✓ Downloaded
  src: hf://Wauplin/tmp-for-test/demo/config.json
  dst: ./downloaded/config.json

$ hf cp hf://Wauplin/tmp-for-test/demo/config.json -
{"hello": "world"}

Piping from stdin:

$ echo "hello from stdin" | hf cp - hf://buckets/Wauplin/tmp-for-test/demo/from-stdin.txt
✓ Uploaded
  src: stdin
  dst: hf://buckets/Wauplin/tmp-for-test/demo/from-stdin.txt

Remote-to-remote, and the same command via the repos alias doing a bucket→bucket copy:

$ hf cp hf://Wauplin/tmp-for-test/demo/config.json hf://buckets/Wauplin/tmp-for-test/copied-from-repo.json
✓ Copied
  src: hf://Wauplin/tmp-for-test/demo/config.json
  dst: hf://buckets/Wauplin/tmp-for-test/copied-from-repo.json

$ hf repos cp hf://buckets/Wauplin/tmp-for-test/demo/ hf://buckets/Wauplin/tmp-for-test/backup/
✓ Copied
  src: hf://buckets/Wauplin/tmp-for-test/demo/
  dst: hf://buckets/Wauplin/tmp-for-test/backup/

Note

Medium Risk
New user-facing CLI paths perform authenticated uploads, downloads, and server-side copies; mistakes could write to the wrong repo/bucket, though aliases and validation limit some misuse.

Overview
Introduces a unified hf cp command (shared implementation in cli/_cp.py) for copying a single file between local paths, repos, and buckets via hf:// URIs or stdin/stdout. The same handler is registered as hf repos cp and hf buckets cp, with context guardrails so those aliases reject the wrong remote type (e.g. bucket URIs under hf repos cp).

Behavior: local/stdin → repo or bucket upload; repo/bucket → local/stdout download (repos use a temp dir + os.replace beside the target); repo/bucket ↔ repo/bucket via HfApi.copy_files. Bucket→repo and local→local are rejected; directories still go through upload/download/sync.
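
The "temp dir + os.replace beside the target" pattern mentioned for repo downloads can be sketched with the standard library alone. This is a minimal illustration under assumptions, not the PR's implementation: atomic_write is a hypothetical name, and fetch stands in for whatever actually downloads the file.

```python
import os
import tempfile
from pathlib import Path
from typing import Callable


def atomic_write(target: Path, fetch: Callable[[Path], None]) -> Path:
    """Write `target` via a temp dir next to it, then swap it in with os.replace.

    `fetch` is a hypothetical callback that writes the downloaded bytes
    to the path it receives.
    """
    target = Path(target)
    target.parent.mkdir(parents=True, exist_ok=True)
    # Creating the temp dir beside the target keeps both paths on the same
    # filesystem, which os.replace needs in order to rename in one step.
    with tempfile.TemporaryDirectory(dir=target.parent) as tmp_dir:
        tmp_path = Path(tmp_dir) / target.name
        fetch(tmp_path)
        # Replaces any existing file at `target`; a failed fetch leaves it untouched.
        os.replace(tmp_path, target)
    return target
```

The point of the design is that an interrupted download never leaves a half-written file at the destination: the target either keeps its previous content or receives the complete new one.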

Refactor: Removes the large inline cp implementation from buckets.py in favor of make_cp("buckets"); adds make_cp("repos") on the repos CLI.

Tests: New tests/test_copy_files.py consolidates copy-related API and CLI tests (moved from test_hf_api, test_commit_api, test_buckets_cli) and adds repo upload/download and alias validation. CI splits test_copy_files into the Xet-only job alongside bucket tests.

Docs: User guides and generated CLI reference document hf cp, alias equivalence, and a new Copy files section on the repository guide; bucket docs pivot examples to hf cp while noting aliases.

Reviewed by Cursor Bugbot for commit 9265993.

…uckets cp`)

Add a single `cp` command that copies a file between any local path,
repository, or bucket (plus stdin/stdout), exposed identically as `hf cp`,
`hf repos cp` and `hf buckets cp`. Move all copy-related tests into a
dedicated `tests/test_copy_files.py` module and update the CLI, repository
and buckets guides.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@bot-ci-comment

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@Wauplin Wauplin added the highlight PR will be highlighted in the release notes. label May 29, 2026
@Wauplin Wauplin requested a review from hanouticelina May 29, 2026 14:21
@Wauplin Wauplin marked this pull request as ready for review May 29, 2026 14:21

julien-c commented May 30, 2026

Member

nice unification!

one thing on the aliasing: since hf cp, hf repos cp and hf buckets cp all point to the same handler and dispatch purely on the URI (is_bucket/type), the namespace is basically cosmetic — it only changes the help examples. so e.g. hf buckets cp hf://username/my-model/config.json . happily operates on a model repo, and hf repos cp will do a bucket→bucket copy.

is that intentional? wondering if hf buckets cp should reject non-bucket URIs (and vice versa) to avoid surprises, or if you'd rather keep it simple and permissive. fine either way, just want to make sure it's a conscious call 🙂

Wauplin commented Jun 1, 2026

Collaborator Author

> is that intentional? wondering if hf buckets cp should reject non-bucket URIs (and vice versa) to avoid surprises, or if you'd rather keep it simple and permissive. fine either way, just want to make sure it's a conscious call 🙂

Yes, it was intentional to keep things simple, and I don't think it's problematic to be permissive on that. Happy to revisit if someone feels strongly about it (it's not that complex anyway^^)

julien-c commented Jun 1, 2026

Member

> Happy to revisit if someone feels strongly about it (it's not that complex anyway^^)

We* were feeling it would maybe remove some footguns if someone copies to an unexpected place. But we can do this in another PR i guess

*me and my agent

Wauplin commented Jun 1, 2026

Collaborator Author

> We* were feeling it would maybe remove some footguns if someone copies to an unexpected place. But we can do this in another PR i guess

Addressed in dbb5963

@cursor cursor Bot left a comment

Cursor Bugbot has reviewed your changes using default effort and found 1 potential issue.

Reviewed by Cursor Bugbot for commit dbb5963.

if context is None:
    return
# The remote endpoint is the destination when it is an hf:// URI, otherwise the source (download).
remote = dst if (dst is not None and is_hf_uri(dst)) else src

Context enforcement only checks one remote endpoint

Low Severity

_enforce_context only inspects a single remote endpoint — it prefers dst when it's an hf:// URI, and falls back to src otherwise. For remote-to-remote copies where both sides are hf:// URIs, only the destination is validated against the context. This means hf repos cp hf://buckets/user/bucket/file.txt hf://user/repo/file.txt passes the guardrail even though the source is a bucket. The operation still fails later in copy_files (bucket-to-repo is unsupported), but with a less helpful error message than the intended CLIError.
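
One way to address this would be to validate every hf:// endpoint instead of picking a single one. The sketch below is hypothetical and is not the fix that was merged: CLIError here is a local stand-in class, is_hf_uri and is_bucket_uri are simplified prefix checks, and the "repos"/"buckets" context values are assumed from the make_cp("repos") and make_cp("buckets") calls mentioned in the summary.

```python
from typing import Optional


class CLIError(Exception):
    """Local stand-in for the CLI's error type (assumption for this sketch)."""


def is_hf_uri(value: Optional[str]) -> bool:
    # Simplified check; the real helper may accept more forms.
    return value is not None and value.startswith("hf://")


def is_bucket_uri(value: str) -> bool:
    # Assumes bucket URIs always start with hf://buckets/.
    return value.startswith("hf://buckets/")


def enforce_context(context: Optional[str], src: str, dst: Optional[str]) -> None:
    """Reject any remote endpoint whose type does not match the alias context."""
    if context is None:
        return  # plain `hf cp`: no restriction
    for remote in (src, dst):
        if not is_hf_uri(remote):
            continue  # local paths and "-" are not constrained by the alias
        bucket = is_bucket_uri(remote)
        if context == "buckets" and not bucket:
            raise CLIError(f"'hf buckets cp' only accepts bucket URIs, got {remote}")
        if context == "repos" and bucket:
            raise CLIError(f"'hf repos cp' only accepts repo URIs, got {remote}")
```

With this variant, the bucket → repo command from the report fails at the guardrail with the alias-specific message rather than later inside copy_files.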

@hanouticelina hanouticelina left a comment

Collaborator

Looks good! ok to merge when the comment is addressed

Comment thread src/huggingface_hub/cli/_cp.py Outdated
Co-authored-by: célina <hanouticelina@gmail.com>
@Wauplin Wauplin merged commit 8295100 into main Jun 1, 2026
25 checks passed
@Wauplin Wauplin deleted the cli-unified-cp branch June 1, 2026 14:19
@huggingface-hub-bot

Contributor

This PR has been shipped as part of the v1.18.0 release.
