feat(hub): add copyFiles API for remote file copy between buckets and repositories#2121

Merged: coyotte508 merged 8 commits into main from cursor/copy-files-api-4d0b on May 13, 2026

@Wauplin (Contributor) commented Apr 24, 2026

Summary

Implements the "copy files remotely" API in @huggingface/hub, porting the Python HfApi.copy_files functionality to TypeScript/JS.

This enables the "Copy to Bucket" feature on the Hub UI, allowing instant server-side file copy between buckets and from repositories to buckets.

Supported operations

| Source | Destination | Mechanism |
| --- | --- | --- |
| Bucket | Bucket | Server-side copy by xet hash (no data transfer) |
| Repo (model/dataset/space) with xet files | Bucket | Server-side copy by xet hash (no data transfer) |
| Repo with non-xet files (small git files) | Bucket | Download + re-upload via commit() |
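
As a rough illustration of how the rows above map to a decision, here is a minimal sketch. The types, the `xetHash` field, and the `chooseMechanism` helper are all hypothetical names made up for this example; they are not taken from the PR's code.

```typescript
// Hypothetical types for illustration only; not the library's definitions.
type RepoKind = "bucket" | "model" | "dataset" | "space";
type CopyMechanism = "server-side-xet-copy" | "download-and-reupload";

interface SourceFile {
  path: string;
  // Assumed field: set only when the file is stored in xet.
  xetHash?: string;
}

function chooseMechanism(source: RepoKind, destination: RepoKind, file: SourceFile): CopyMechanism {
  // Only bucket destinations are supported for now.
  if (destination !== "bucket") {
    throw new Error(`Copying to a ${destination} is not supported yet`);
  }
  // Bucket sources and xet-backed repo files are copied by hash on the server.
  if (source === "bucket" || file.xetHash !== undefined) {
    return "server-side-xet-copy";
  }
  // Small git files have no xet hash, so they are downloaded and re-uploaded.
  return "download-and-reupload";
}
```

Under these assumptions the decision is made per file, so a single repo-to-bucket folder copy could mix both mechanisms.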

Not supported (yet)

  • Bucket → Repo copy
  • Repo → Repo copy

Features

  • copyFiles() — main function, exported from @huggingface/hub
  • parseHfCopyHandle() — parses hf:// handles (buckets, models, datasets, spaces, with @revision support)
  • Single file and recursive folder copy
  • Automatic destination path resolution (file vs directory target)
  • Batched server-side copy via POST /api/buckets/{id}/batch with NDJSON copyFile operations
  • Fallback download+upload path for non-xet repo files using existing commit() infrastructure
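
To make the batched NDJSON idea concrete, here is a small sketch of how copy operations might be serialized into request bodies. The `CopyFileOperation` field names, the helper names, and the default batch size of 1000 are assumptions for illustration; only the `copyFile` operation name and the `POST /api/buckets/{id}/batch` endpoint come from the description above.

```typescript
// Assumed operation shape; the real field names may differ.
interface CopyFileOperation {
  type: "copyFile";
  path: string; // destination path inside the bucket
  xetHash: string; // content hash of the source file
}

// Split a list into chunks of at most `size` items.
function chunk<T>(items: T[], size: number): T[][] {
  if (size < 1) throw new Error("size must be at least 1");
  const chunks: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    chunks.push(items.slice(i, i + size));
  }
  return chunks;
}

// NDJSON: one JSON document per line, each terminated by a newline.
function toNdjson(operations: CopyFileOperation[]): string {
  return operations.map((op) => JSON.stringify(op) + "\n").join("");
}

// Build one request body per batch. Sending is left out of this sketch;
// each body would be POSTed to /api/buckets/{id}/batch.
function buildBatchBodies(operations: CopyFileOperation[], batchSize = 1000): string[] {
  return chunk(operations, batchSize).map(toNdjson);
}
```

Keeping serialization separate from the network call makes the batching logic easy to test without Hub access.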

Usage

import { copyFiles } from "@huggingface/hub";

// Copy a single file between buckets
await copyFiles({
  source: "hf://buckets/my-bucket/data.bin",
  destination: "hf://buckets/other-bucket/data.bin",
  accessToken: "hf_...",
});

// Copy a folder from a bucket to another bucket
await copyFiles({
  source: "hf://buckets/my-bucket/models/",
  destination: "hf://buckets/other-bucket/backup/",
  accessToken: "hf_...",
});

// Copy from a model repo to a bucket
await copyFiles({
  source: "hf://models/username/my-model/model.safetensors",
  destination: "hf://buckets/my-bucket/",
  accessToken: "hf_...",
});

// Copy an entire dataset to a bucket
await copyFiles({
  source: "hf://datasets/username/my-dataset/",
  destination: "hf://buckets/my-bucket/datasets/",
  accessToken: "hf_...",
});
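
The examples above mix file targets (`.../data.bin`) and directory targets (`.../my-bucket/`). A minimal sketch of one way such targets could be resolved follows. It assumes that an empty destination or a trailing slash marks a directory; the actual rule used by `copyFiles` is not spelled out here, and both helper names are hypothetical.

```typescript
// Resolve the destination path for a single-file copy.
// Assumption: "" or a trailing "/" means "copy into this directory".
function resolveFileDestination(sourcePath: string, destination: string): string {
  const isDirectoryTarget = destination === "" || destination.endsWith("/");
  if (!isDirectoryTarget) {
    return destination; // explicit file target, used as-is
  }
  const baseName = sourcePath.split("/").filter((part) => part.length > 0).pop() ?? "";
  return destination + baseName;
}

// Resolve the destination path of one file during a recursive folder copy:
// the part of the path below the source folder is re-rooted under the destination.
function resolveFolderDestination(sourceFolder: string, filePath: string, destinationFolder: string): string {
  const withSlash = (p: string): string => (p === "" || p.endsWith("/") ? p : p + "/");
  const sourcePrefix = withSlash(sourceFolder);
  if (!filePath.startsWith(sourcePrefix)) {
    throw new Error(`${filePath} is not inside ${sourceFolder}`);
  }
  return withSlash(destinationFolder) + filePath.slice(sourcePrefix.length);
}
```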

Reference

Python implementation: huggingface/huggingface_hub#3874

How to test locally

Unit tests (handle parsing)

cd packages/hub
pnpm test -- --testPathPattern copy-files

The parseHfCopyHandle unit tests run without any network access.
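
Handle parsing is pure string manipulation, which is why it can be tested offline. The sketch below is not the PR's `parseHfCopyHandle`; it is a hypothetical `parseHandleSketch` that assumes two-segment IDs (`namespace/name`, as in the manual testing example) and a simple `@revision` suffix with no slashes in it.

```typescript
type HandleKind = "bucket" | "model" | "dataset" | "space";

// Hypothetical result shape for this sketch.
interface ParsedHandle {
  kind: HandleKind;
  id: string; // "namespace/name"
  revision?: string;
  path: string; // path inside the bucket or repo, "" for the root
}

const KIND_BY_PREFIX = new Map<string, HandleKind>([
  ["buckets", "bucket"],
  ["models", "model"],
  ["datasets", "dataset"],
  ["spaces", "space"],
]);

function parseHandleSketch(handle: string): ParsedHandle {
  const scheme = "hf://";
  if (!handle.startsWith(scheme)) {
    throw new Error(`Not an hf:// handle: ${handle}`);
  }
  const parts = handle.slice(scheme.length).split("/");
  const kind = KIND_BY_PREFIX.get(parts[0] ?? "");
  const namespace = parts[1] ?? "";
  const rawName = parts[2] ?? "";
  if (kind === undefined || namespace === "" || rawName === "") {
    throw new Error(`Malformed handle: ${handle}`);
  }
  // Assumption: the revision is attached to the name segment as "name@revision".
  const at = rawName.indexOf("@");
  const name = at === -1 ? rawName : rawName.slice(0, at);
  const parsed: ParsedHandle = {
    kind,
    id: `${namespace}/${name}`,
    path: parts.slice(3).join("/"),
  };
  if (at !== -1) {
    parsed.revision = rawName.slice(at + 1);
  }
  return parsed;
}
```

Revisions that contain slashes (for example `refs/pr/1`) would need an escaping or lookup rule that this sketch does not attempt.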

Integration tests (require CI Hub access)

The integration tests (copyFiles describe block) run against the CI Hub at https://hub-ci.huggingface.co. They:

  1. Create temporary source/destination repos (bucket and/or model)
  2. Upload test files
  3. Run copyFiles
  4. Verify files appear in the destination
  5. Clean up repos

To run them:

cd packages/hub

# Set up test credentials (the tests use TEST_ACCESS_TOKEN from src/test/consts.ts)
pnpm test -- --testPathPattern copy-files

Manual testing

You can also test manually against the production Hub:

import { copyFiles } from "@huggingface/hub";

// Copy a public model's files to your bucket
await copyFiles({
  source: "hf://models/openai-community/gpt2",
  destination: "hf://buckets/your-username/your-bucket/models/gpt2/",
  accessToken: "hf_YOUR_TOKEN",
});

… repositories

Implements the 'copy files remotely' API in @huggingface/hub, porting the
Python huggingface_hub.HfApi.copy_files functionality to TypeScript/JS.

Supports:
- Bucket-to-bucket copy (server-side, no data transfer)
- Repo (model/dataset/space) to bucket copy
  - xet-backed files: server-side copy by hash
  - non-xet files: download + re-upload via commit
- Single file and recursive folder copy
- hf:// handle parsing with revision support

Reference: huggingface/huggingface_hub#3874

Co-authored-by: Lucain <Wauplin@users.noreply.github.com>

@Wauplin (Contributor, Author) commented Apr 24, 2026

closing in favor of @julien-c's PR to come

@coyotte508 (Member) commented:

hmm I'll keep the hf://buckets/... for CLI if needed, not for function

@coyotte508 (Member) commented:

split between copyFile / copyFiles / copyFolder

@coyotte508 coyotte508 marked this pull request as ready for review May 13, 2026 13:15
@coyotte508 coyotte508 self-requested a review as a code owner May 13, 2026 13:15
Comment thread packages/hub/src/lib/commit.ts
Comment thread packages/hub/src/lib/copy-files.ts

@cursor cursor Bot left a comment

Cursor Bugbot has reviewed your changes and found 2 potential issues.

Reviewed by Cursor Bugbot for commit f8cc5a9.

state: "error",
};
}
}

Copy failures silently swallowed when batch partially succeeds

Medium Severity

When the copy batch API returns HTTP 422, individual copy failures are yielded as fileProgress error events but processing continues to the delete-operations phase. The commit() wrapper at line 1017 collects these errors and throws, but the error message says "Failed to upload N file(s)" which is misleading for copy operations. More importantly, the addFile batch (line 888) uses the same pattern — on partial failure, both the successfully added files AND successfully copied files persist in the bucket while the caller receives an error. This means a partial copy state is left in the destination bucket with no rollback, and retrying the entire copyFiles call would re-process all files rather than just the failed ones.
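
One possible way to make partial failures easier to act on is to summarize per-file outcomes so the caller gets a copy-specific message and a list of paths to retry. This is only a sketch: the `FileProgressEvent` shape is an assumption loosely modeled on the `fileProgress` events mentioned in the comment, and `summarize` and `failureMessage` are hypothetical helpers, not part of `commit()`.

```typescript
// Assumed event shape for this sketch.
interface FileProgressEvent {
  path: string;
  operation: "addFile" | "copyFile";
  state: "ok" | "error";
}

interface CopySummary {
  succeeded: string[];
  failed: string[];
}

// Partition per-file events into succeeded and failed paths.
function summarize(events: FileProgressEvent[]): CopySummary {
  const summary: CopySummary = { succeeded: [], failed: [] };
  for (const event of events) {
    (event.state === "error" ? summary.failed : summary.succeeded).push(event.path);
  }
  return summary;
}

// Build an error message that names the failed paths and notes that
// files already written stay in the destination (no rollback).
function failureMessage(summary: CopySummary): string | undefined {
  if (summary.failed.length === 0) {
    return undefined;
  }
  const total = summary.failed.length + summary.succeeded.length;
  return (
    `Failed to copy ${summary.failed.length} of ${total} file(s); ` +
    `files already written remain in the destination. Failed: ${summary.failed.join(", ")}`
  );
}
```

With a summary like this, a caller could retry only `summary.failed` instead of re-running the whole copy.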

Comment thread packages/hub/src/lib/copy-files.ts
@coyotte508 coyotte508 merged commit da6cc0c into main May 13, 2026
6 of 7 checks passed
@coyotte508 coyotte508 deleted the cursor/copy-files-api-4d0b branch May 13, 2026 14:11