Skip to content

Add storage_utils#133524

Closed
jerryzh168 wants to merge 2 commits into
gh/jerryzh168/847/basefrom
gh/jerryzh168/847/head
Closed

Add storage_utils#133524
jerryzh168 wants to merge 2 commits into
gh/jerryzh168/847/basefrom
gh/jerryzh168/847/head

Conversation

@jerryzh168

@jerryzh168 jerryzh168 commented Aug 15, 2024

Copy link
Copy Markdown
Contributor

Stack from ghstack (oldest at bottom):

Summary:
Currently huggingface_hub, safetensors, accelerate all have their own implementation of
get_storage_id and get_storage_size, storage_ptr, which makes assumption on internal implementation details of torch.Tensor, and storage, and does not work for wrapper tensor subclasses

Motivated by huggingface/huggingface_hub#2440 (comment) maybe it makes more sense to add these as utils in pytorch so they can be maintained by us instead

This PR added
get_storage_id: returns a unique identifier for the tensor storage, for tensor subclasses, it returns a nested tuple of unique ids from underlying plain tensors
get_storage_size: returns the size in bytes for the underlying storage, for tensor subclasses, it returns the sum of the size from all underlying plain tensors

Test Plan:
python test/test_utils.py TestStorageUtils

Reviewers:

Subscribers:

Tasks:

Tags:

Summary:
Currently [huggingface_hub](https://github.com/huggingface/huggingface_hub/blob/main/src/huggingface_hub/serialization/_torch.py), [safetensors](https://github.com/huggingface/safetensors/blob/main/bindings/python/py_src/safetensors/torch.py#L11), [accelerate](https://github.com/huggingface/accelerate/blob/a452327e8e04b20779882dc491e00de602d554cb/src/accelerate/utils/modeling.py#L175) all have their own implementation of
`get_storage_id` and `get_storage_size`, `storage_ptr`, which makes assumption on internal implementation details of torch.Tensor, and storage, and does not work for wrapper tensor subclasses

Motivated by huggingface/huggingface_hub#2440 (comment) maybe it makes more sense to add these as utils in pytorch so they can be maintained by us instead

This PR added
`get_storage_id`: returns a unique identifier for the tensor storage, for tensor subclasses, it returns a nested tuple of unique ids from underlying plain tensors
`get_storage_size`: returns the size in bytes for the underlying storage, for tensor subclasses, it returns the sum of the size from all underlying plain tensors

Test Plan:
python test/test_utils.py TestStorageUtils

Reviewers:

Subscribers:

Tasks:

Tags:

[ghstack-poisoned]
@pytorch-bot

pytorch-bot Bot commented Aug 15, 2024

Copy link
Copy Markdown

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/133524

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit b788e5b with merge base d46e076 (image):
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@jerryzh168 jerryzh168 requested a review from albanD August 15, 2024 00:16
Summary:
Currently [huggingface_hub](https://github.com/huggingface/huggingface_hub/blob/main/src/huggingface_hub/serialization/_torch.py), [safetensors](https://github.com/huggingface/safetensors/blob/main/bindings/python/py_src/safetensors/torch.py#L11), [accelerate](https://github.com/huggingface/accelerate/blob/a452327e8e04b20779882dc491e00de602d554cb/src/accelerate/utils/modeling.py#L175) all have their own implementation of
`get_storage_id` and `get_storage_size`, `storage_ptr`, which makes assumption on internal implementation details of torch.Tensor, and storage, and does not work for wrapper tensor subclasses

Motivated by huggingface/huggingface_hub#2440 (comment) maybe it makes more sense to add these as utils in pytorch so they can be maintained by us instead

This PR added
`get_storage_id`: returns a unique identifier for the tensor storage, for tensor subclasses, it returns a nested tuple of unique ids from underlying plain tensors
`get_storage_size`: returns the size in bytes for the underlying storage, for tensor subclasses, it returns the sum of the size from all underlying plain tensors

Test Plan:
python test/test_utils.py TestStorageUtils

Reviewers:

Subscribers:

Tasks:

Tags:

[ghstack-poisoned]
jerryzh168 added a commit that referenced this pull request Aug 15, 2024
Summary:
Currently [huggingface_hub](https://github.com/huggingface/huggingface_hub/blob/main/src/huggingface_hub/serialization/_torch.py), [safetensors](https://github.com/huggingface/safetensors/blob/main/bindings/python/py_src/safetensors/torch.py#L11), [accelerate](https://github.com/huggingface/accelerate/blob/a452327e8e04b20779882dc491e00de602d554cb/src/accelerate/utils/modeling.py#L175) all have their own implementation of
`get_storage_id` and `get_storage_size`, `storage_ptr`, which makes assumption on internal implementation details of torch.Tensor, and storage, and does not work for wrapper tensor subclasses

Motivated by huggingface/huggingface_hub#2440 (comment) maybe it makes more sense to add these as utils in pytorch so they can be maintained by us instead

This PR added
`get_storage_id`: returns a unique identifier for the tensor storage, for tensor subclasses, it returns a nested tuple of unique ids from underlying plain tensors
`get_storage_size`: returns the size in bytes for the underlying storage, for tensor subclasses, it returns the sum of the size from all underlying plain tensors

Test Plan:
python test/test_utils.py TestStorageUtils

Reviewers:

Subscribers:

Tasks:

Tags:

ghstack-source-id: 87d1bd5
Pull Request resolved: #133524
@jerryzh168

Copy link
Copy Markdown
Contributor Author

we'll support sharding properly through distributed checkpointing (https://pytorch.org/docs/stable/distributed.checkpoint.html), distributed checkpointing team is following up with huggingface on the concrete plans

@jerryzh168 jerryzh168 closed this Sep 9, 2024
@github-actions github-actions Bot deleted the gh/jerryzh168/847/head branch October 12, 2024 02:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant