ENH: Implement the DLPack Array API protocols for ndarray (#19083)
seberg merged 18 commits into numpy:main
Conversation
Thanks Hameer, looks like a good start. Of the points @seberg brought up on the issue, adding a version attribute seems the most important, because checking for the version will be needed before any backwards-incompatible change (such as adding an extra field) can be made. I asked about it on dmlc/dlpack#34, and the suggestion was to add it as an attribute on the …
Sure -- I commented there instead of here, so more of the relevant parties can see it.
BvB93 left a comment:
Hi Hameer, could you also add these two new methods to the ndarray stubs in `numpy/__init__.pyi`? Their signatures are fairly simple, so that's a plus:
```python
# NOTE: Every single import below is already present in `__init__.pyi`
# and doesn't have to be repeated
from typing import Any, Tuple
from typing_extensions import Literal as L

from numpy import number
from numpy.typing import NDArray

# `builtins.PyCapsule` unfortunately lacks annotations at the moment;
# use `Any` as a stopgap measure
_PyCapsule = Any

def __dlpack__(self: NDArray[number[Any]], *, stream: None = ...) -> _PyCapsule: ...
def __dlpack_device__(self) -> Tuple[L[1], L[0]]: ...
```
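For context, a hedged sketch of how these two methods behave at runtime on a CPU array (using the NumPy implementation as released):

```python
import numpy as np

x = np.arange(6.0)

# __dlpack_device__ returns a (device_type, device_id) tuple;
# (1, 0) is (kDLCPU, 0), matching the Literal[1], Literal[0] in the stub.
device = x.__dlpack_device__()

# __dlpack__ returns an opaque PyCapsule; a consumer may ingest it exactly once.
capsule = x.__dlpack__()
```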
seberg left a comment:
Had started looking at it, so a few small comments. It looks good; I have to think a bit more about the API, but that has nothing to do with the implementation.
I think if we put the header where you put it, it will be effectively public. I guess that is fine though? Otherwise I think placing it into `common` is good.
> ```python
> # `builtins.PyCapsule` unfortunately lacks annotations as of the moment;
> # use `Any` as a stopgap measure
> _PyCapsule = Any
> ```
Interestingly, if we could get `builtins.PyCapsule` annotated as a parameterizable type in typeshed, then it would in principle be possible for static type checkers to read, e.g., the underlying shape and dtype.
The only thing that users or libraries would have to do here is declare the necessary annotations.
Perhaps something to consider for the future?
Examples:

```python
from typing import TypeVar, Any, Generic, Tuple

import numpy as np

# Improvised `PyCapsule` annotation
_T = TypeVar("_T")

class PyCapsule(Generic[_T]): ...

# Construct a more compact `PyCapsule` alias; `Tuple` is used here to
# introduce 2 parameters (there may be more appropriate types that can
# fulfill this functionality)
_Shape = TypeVar("_Shape", bound=Any)  # TODO: wait for PEP 646's TypeVarTuple
_DType = TypeVar("_DType", bound=np.dtype[Any])
DLPackCapsule = PyCapsule[Tuple[_Shape, _DType]]

# A practical example
def from_dlpack(__x: DLPackCapsule[_Shape, _DType]) -> np.ndarray[_Shape, _DType]: ...
```
I'll consider this as out of scope of this PR for now, but will leave the conversation unresolved for visibility.
There's a standardized name, see data-apis/array-api#186. I have modified the PR accordingly.
Unaligned is OK, see #19013 (comment). How does one check for native dtypes?
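One way to answer the native-dtype question, as a sketch: `np.dtype.isnative` reports whether a dtype's byte order matches the platform, and DLPack has no way to describe byte-swapped data.

```python
import numpy as np

native = np.dtype('=f8')          # explicitly native-endian float64
swapped = native.newbyteorder()   # same dtype with the opposite byte order

# Only the native-endian variant is representable in DLPack;
# the swapped one would have to be rejected (or copied) on export.
```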
Can we verify this?
Ignore, it does mean IEEE float. Does …
I guess if longdouble is 64-bit, you can be sure it's just double. That we could check for if we want; there are a couple of macros in … Technically, most current hardware uses something like an 80-bit (IEEE?) format stored as 96 or 128 bits. But DLPack can't describe it (it's an old thing that our …
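The 64-bit-longdouble check mentioned above can also be sketched from Python via the itemsize, without relying on the C macros:

```python
import numpy as np

ld = np.dtype(np.longdouble)

# If longdouble is stored in 8 bytes it is just IEEE double and can be
# exported as DLPack float64; 12- or 16-byte storage means 80-bit extended
# precision, which DLPack cannot describe.
exportable_as_float64 = (ld.itemsize == 8)
```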
You can use …
If `byte_offset = 0` is forced anyway, there is no point in trying to preserve a previous `data` information from the capsule. (And probably it should have used a base array also, and not just a base DLPack capsule, anyway.)
I will put this in once the tests pass, but will then open a few issues about DLPack, with pretty harmless bugs or general follow-ups. The release notes get reviewed before the release anyway. Thanks everyone, and especially Matti for doing the follow-ups.
🎉 thanks @mattip, @hameerabbasi, @seberg, @leofang & all other reviewers!
Silly question: can someone remind me why we renamed it to …?
This was pushed in about a week before branching, with not-quite-settled things in DLPack itself (as I remember). Adding the underscore was the way to avoid having to discuss all of that in a short time frame.
I see, thanks for the context @seberg. Sounds like a decision made offline (which I am fine with). Is there any plan to make it a public API (by removing the underscore)?
Yes, we can discuss this, obviously. It may have been discussed in a meeting, in which case it is publicly noted in the …
The only mention of DLPack in the docs is in the release notes, so maybe we could discuss making it public together with some documentation.
Thanks, guys. I created an issue to track this change: #20743.
Does … E.g. PyTorch supports both raw capsule input and an object having … Original issue: … My original DLPack python/numpy bindings: …
@vadimkantorov If I understand correctly, direct capsules are tricky and shouldn't be passed around by library consumers. The current recommendation is that libraries should only support other array objects in …
@hameerabbasi PyTorch's https://pytorch.org/docs/stable/dlpack.html#torch.utils.dlpack.to_dlpack actually returns an opaque capsule object... What would be nice is an example of producing a DLPack struct from a C function (via ctypes) or a PyCapsule from PyTorch, and then consuming it from NumPy, to showcase the interop situations. I did hacks for some use cases like this in https://github.com/vadimkantorov/pydlpack and https://github.com/vadimkantorov/readaudio
@vadimkantorov …
The use case where DLPack is used for marshalling an array from C land (a function can be bound with DLPack) into a NumPy array is also very useful: plain C functions with DLPack structures are much simpler to compile and manage than building torch/numpy extensions. In my example I used DLPack-producing C functions doing audio decoding, and then ingested such a DLPack ctypes-bound structure in NumPy. Does NumPy also provide such ctypes binding types for DLPack structures?
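NumPy ships no ctypes bindings for the DLPack structs, but a minimal sketch mirroring `dlpack.h` (field names and order taken from the header; this is an illustration, not a NumPy API) looks like:

```python
import ctypes

class DLDevice(ctypes.Structure):
    # DLDeviceType enum value (kDLCPU == 1) plus a device id
    _fields_ = [("device_type", ctypes.c_int),
                ("device_id", ctypes.c_int)]

class DLDataType(ctypes.Structure):
    # type code (0 = int, 1 = uint, 2 = float, ...), bit width, vector lanes
    _fields_ = [("code", ctypes.c_uint8),
                ("bits", ctypes.c_uint8),
                ("lanes", ctypes.c_uint16)]

class DLTensor(ctypes.Structure):
    # field order follows the DLTensor struct in dlpack.h
    _fields_ = [("data", ctypes.c_void_p),
                ("device", DLDevice),
                ("ndim", ctypes.c_int),
                ("dtype", DLDataType),
                ("shape", ctypes.POINTER(ctypes.c_int64)),
                ("strides", ctypes.POINTER(ctypes.c_int64)),
                ("byte_offset", ctypes.c_uint64)]
```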
It doesn't. But if you already have a raw capsule, it should be quite easy to wrap it in a thin pure-Python class with …
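A hedged sketch of such a thin wrapper (`CapsuleWrapper` is a made-up name, and it relies on `np.from_dlpack` falling back to a keyword-free `__dlpack__()` call): it gives a raw capsule the two protocol methods so `np.from_dlpack` will accept it. Note that a capsule can be consumed only once.

```python
import numpy as np

class CapsuleWrapper:
    """Adapt a raw DLPack capsule to the __dlpack__ protocol (sketch)."""

    def __init__(self, capsule, device=(1, 0)):  # default: (kDLCPU, 0)
        self._capsule = capsule
        self._device = device

    def __dlpack__(self, stream=None):
        return self._capsule

    def __dlpack_device__(self):
        return self._device

x = np.arange(4.0)
wrapped = CapsuleWrapper(x.__dlpack__())
y = np.from_dlpack(wrapped)  # zero-copy view over x's memory
```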
For some reason (maybe inspired by the original PyTorch support), NVidia's Triton Inference Server API for …
Fixes #19013
cc @rgommers @mattip @seberg
TODO
- `from_dlpack` function
- `__dlpack__`

TODO items added/edited by seberg:
- … `methods.c`
- … `dlpack.h` that clarifies alignment requirements (and use of `byte_offset`). After such a PR exists, we can review the current rules on when to reject export. (If the rejection is not guaranteed to be fixed by `arr.copy()` we may have a problem.)
- `from_dlpack` shall not be exposed as new, public API. There will be no public API to convert to NumPy from `__dlpack__`. It would be OK to have experimental/private, underscored API.
- … `dlpack.h` or array-api that states that imported arrays shall be considered writable, and that exporters like NumPy should avoid exporting read-only arrays (truly read-only only?). Ideally with a footnote that this means users of immutable array libraries may be in for a big surprise. (mattip says:) Not sure how to resolve this. The data-api issue is still open.
- … `->deleter == NULL` case is settled (I think it is, but maybe there needs to be some clarification in the header as well).