Skip to content

XetSession API#657

Merged
seanses merged 15 commits intomainfrom
di/session-interface
Mar 4, 2026
Merged

XetSession API#657
seanses merged 15 commits intomainfrom
di/session-interface

Conversation

@seanses
Copy link
Collaborator

@seanses seanses commented Feb 13, 2026

Derived from this plan

This PR introduces a new xet_session crate that provides a session-based hierarchical API: Users create a XetSession to manage runtime and configuration, then batch uploads into UploadCommit objects and downloads into DownloadGroup objects — each of which runs transfers in the background by the inner XetRuntime.

All pub functions are exposed as sync functions - making them easy to use in other languages, e.g. Python, C, ...

Will expose another interface for Rust async native environment.

Example

let session = XetSession::new(XetConfig::new(...));

let upload_commit = session.new_upload_commit();

let handle = upload_commit.upload_file_from_path(file_path); // return immediately
let handle = upload_commit.upload_bytes(bytes); // return immediately

let progress = upload_commit.get_progress(); // return a collection of upload progress

let result = upload_commit.commit(); // blocking wait: finalizing upload, pushes all data to remote, return a collection of file metadata (hash and size)

let download_group = session.new_download_group();
let handle = download_group.download_file_to_path(file_hash, dest_path); // return immediately

let progress = download_group.get_progress(); // return a collection of download progress

let result = download_group.finish() // blocking wait: return a collection of download results (metadata)

Next steps

  • Make progress reporting pull-based not push-based: introduces a get_process() function to FileUploadSession and FileDownloadSession so XetSession can pull.
  • Plug into Xet migration service to test.

@seanses seanses changed the title draft XetSession API Feb 25, 2026
@seanses seanses force-pushed the di/session-interface branch from 210aced to 248f584 Compare February 25, 2026 01:19
@seanses seanses changed the base branch from main to di/change_tracking_id_to_ulid February 26, 2026 20:57
@seanses seanses marked this pull request as ready for review February 26, 2026 21:03
Copy link
Collaborator

@hoytak hoytak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Per discussion, I think we should get this in and then iterate rather than hold it up longer. Many todos, but let's merge and then update it.

@seanses seanses changed the base branch from di/change_tracking_id_to_ulid to main March 3, 2026 01:30
Copy link
Contributor

@assafvayner assafvayner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some comments.

I do have 2 additional requests for this revision, then more content can go in in the future.

  • Add a README.md in xet_session, should be easy just generate and review. This should describe the general interface exposed and the motivation.
  • ^ should fix this, If you run cargo doc on this crate, the first page is pretty empty, so we should fill something in there, doesn't have to be final/release version, just something.

@rajatarya
Copy link
Collaborator

Code Review Summary

Well-structured session API with a clean hierarchical design (XetSession → UploadCommit/DownloadGroup). The sync-over-async approach is appropriate for FFI consumers.

Existing review feedback already covers: inner struct visibility/Deref pattern, builder pattern for XetSession, std::sync::Mutex usage, and README/docs. These findings are omitted below.

P0 — Must Fix

  1. Example will deadlock#[tokio::main] + commit() (which calls external_run_async_taskblock_on) deadlocks because you can't block_on inside a Tokio runtime.
  2. handle_commit/handle_finish abandon remaining tasks on first error?? propagates immediately, dropping un-awaited JoinHandles. Remaining tasks continue running after commit()/finish() returns.

P1 — Should Fix

  1. HashMap::drain() returns results in non-deterministic ordercommit()/finish() return Vec<FileMetadata>/Vec<DownloadResult> in arbitrary order. Use BTreeMap<Ulid> (time-ordered) or document.
  2. abort() fails partway on PoisonError — If status.lock()? fails mid-iteration, remaining tasks are left un-cancelled. XetSession::abort() also returns early on the first failing sub-abort.
  3. file_size field assigned wrong value in downloadresult? is n_bytes downloaded, not the logical file size. Should use file_info.file_size.

P2 — Nice to Fix

  1. crate-type includes cdylib/staticlib — triples compile time unless FFI is needed now
  2. Double Arc wrapping of XetRuntimeXetRuntime is already internally Arc-wrapped
  3. Silent progress drops on poisoned locks in register_updates
  4. Download concurrency semaphore moved to caller with only a comment as the contract

See inline comments for exact locations of P0/P1 issues.

Copy link
Collaborator

@rajatarya rajatarya left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Inline comments on high-priority issues.

Copy link
Collaborator

@rajatarya rajatarya left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't block on feedback from Claude - but I think the P0/P1 feedback sounds reasonable to address - either prior to merging or after merging in a subsequent PR.

@seanses seanses merged commit c4a56f8 into main Mar 4, 2026
7 checks passed
@seanses seanses deleted the di/session-interface branch March 4, 2026 04:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants