
ENH: Implement the DLPack Array API protocols for ndarray. #19083

Merged
seberg merged 18 commits into numpy:main from hameerabbasi:dlpack
Nov 9, 2021

Conversation

@hameerabbasi
Contributor

@hameerabbasi hameerabbasi commented May 24, 2021

Fixes #19013

cc @rgommers @mattip @seberg

TODO

  • from_dlpack function.
  • dtype support in __dlpack__

TODO items added/edited by seberg:

  • Move all dlpack specific code to a single C file, rather than methods.c
  • I would feel much better if there is a PR to dlpack.h that clarifies alignment requirements (and use of byte_offset). After such a PR exists, we can review the current rules when to reject export. (if the rejection is not guaranteed to be fixed by arr.copy() we may have a problem.)
  • As discussed in the meeting from_dlpack shall not be exposed as new, public API. There will be no public API to convert to NumPy from __dlpack__. It would be ok to have experimental/private, underscored API.
  • (preferably) Make sure there is a PR to either dlpack.h or array-api that states that imported arrays shall be considered writable and exporters like NumPy should avoid exporting readonly arrays (truly readonly only?). Ideally, with a footnote that this means that users of immutable array libraries may be up for a big surprise.
    (mattip says:) Not sure how to resolve this. The data api issue is still open
  • I am not quite sure if the ->deleter == NULL case is settled (I think it is, but maybe there needs to be some clarification in the header as well).
  • Specify cleanup to know for a fact whether this implementation or the CuPy implementation is correct.

@hameerabbasi hameerabbasi requested review from mattip and seberg May 24, 2021 13:26
@hameerabbasi hameerabbasi changed the title from "Add the __dlpack__ and __dlpack_device__ methods to ndarray." to "Implement the DLPack Array API protocols for ndarray." on May 24, 2021
@hameerabbasi hameerabbasi force-pushed the dlpack branch 3 times, most recently from d111d23 to 248a695 on May 24, 2021 14:39
@rgommers
Member

Thanks Hameer, looks like a good start.

Of the points @seberg brought up on the issue, adding a version attribute seems the most important - because checking for version will be needed before any backwards-incompatible change (such as adding an extra field) can be done. I asked about it on dmlc/dlpack#34, and the suggestion was to add it as an attribute on the PyCapsule. Could you look into doing that?

@hameerabbasi hameerabbasi force-pushed the dlpack branch 2 times, most recently from ac6c005 to 5aa28d8 on May 24, 2021 15:50
@hameerabbasi
Contributor Author

Could you look into doing that?

Sure -- I commented there instead of here, so more of the relevant parties can see it.

Member

@BvB93 BvB93 left a comment


Hi Hameer, could you also add these two new methods to the ndarray stubs in numpy/__init__.pyi?
Their signatures are fairly simple, so that's a plus:

# NOTE: Every single import below is already present in `__init__.pyi`
# and doesn't have to be repeated
from typing import Any, Tuple
from typing_extensions import Literal as L

from numpy import number
from numpy.typing import NDArray

# `builtins.PyCapsule` unfortunately lacks annotations as of the moment;
# use `Any` as a stopgap measure
_PyCapsule = Any

def __dlpack__(self: NDArray[number[Any]], *, stream: None = ...) -> _PyCapsule: ...
def __dlpack_device__(self) -> Tuple[L[1], L[0]]: ...
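
As a hedged illustration of what the stubs above describe (assuming a NumPy build with this PR merged), the two methods behave like this on a plain CPU ndarray:

```python
import numpy as np

x = np.arange(3)

# __dlpack_device__ reports (device_type, device_id); kDLCPU is 1,
# so a plain ndarray yields (1, 0) -- matching the Tuple[L[1], L[0]] stub.
print(x.__dlpack_device__())  # (1, 0)

# __dlpack__ returns an opaque PyCapsule holding a DLManagedTensor.
capsule = x.__dlpack__()
print(type(capsule).__name__)  # PyCapsule
```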

@hameerabbasi hameerabbasi force-pushed the dlpack branch 2 times, most recently from 77e49cc to 830903a on May 24, 2021 16:12
Member

@seberg seberg left a comment


Had started looking at it, so a few small comments. It looks good, I have to think a bit more about the API, but that has nothing to do with the implementation.

I think if we put the header where you put it, it will be effectively public. I guess that is fine though? Otherwise I think placing it into common is good.

Comment on lines +1647 to +1439
# `builtins.PyCapsule` unfortunately lacks annotations as of the moment;
# use `Any` as a stopgap measure
_PyCapsule = Any

Member

@BvB93 BvB93 May 24, 2021


Interestingly, if we could get builtins.PyCapsule in typeshed annotated as a parameterizable type,
then it would in principle be possible for static type checkers to read, e.g., the underlying shape and dtype.

The only thing that users or libraries would have to do here is declare the necessary annotations.

Perhaps something to consider for the future?

Examples

from typing import TypeVar, Any, Generic, Tuple
import numpy as np

# Improvised `PyCapsule` annotation
_T = TypeVar("_T")
class PyCapsule(Generic[_T]): ...

# Construct a more compact `PyCapsule` alias; `Tuple` used herein to introduce 2 parameters 
# (there may be more appropriate types that can fulfill this functionality)
_Shape = TypeVar("_Shape", bound=Any)  # TODO: Wait for PEP 646's TypeVarTuple
_DType = TypeVar("_DType", bound=np.dtype[Any])
DLPackCapsule = PyCapsule[Tuple[_Shape, _DType]]

# A practical example
def from_dlpack(__x: DLPackCapsule[_Shape, _DType]) -> np.ndarray[_Shape, _DType]: ...

Contributor Author


I'll consider this as out of scope of this PR for now, but will leave the conversation unresolved for visibility.

@hameerabbasi hameerabbasi force-pushed the dlpack branch 2 times, most recently from a5d1adf to becdf4d on May 25, 2021 11:24
@hameerabbasi
Contributor Author

hameerabbasi commented May 25, 2021

This name should be some standardized DLPack name? Or is this intentional for now to keep it effectively private right now?

Well, we have to be able to delete it to free the memory, but maybe we need some state as well, to ensure that we don't delete it twice if the consumer already deletes it without deleting the capsule (might be good for non-reference counted Python also?).

I thought I read that there was some capsule renaming involved (to signal that it's "invalid" after deletion). I guess setting the data to NULL or so would work just as well. (If this is standardized, do that. If not, set data to NULL to guard against deleting twice: once from Python, once from the consumer.)

There's a standardized name, see data-apis/array-api#186. I have modified the PR accordingly.
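
The standardized names can be observed from Python. As a sketch (using `ctypes` to call the CPython capsule API, and assuming NumPy follows the data-apis convention): a fresh capsule is named "dltensor", and a consumer renames it to "used_dltensor" after taking ownership, which is exactly the double-free guard discussed above.

```python
import ctypes
import numpy as np

# PyCapsule_GetName returns the capsule's current name as a C string.
ctypes.pythonapi.PyCapsule_GetName.restype = ctypes.c_char_p
ctypes.pythonapi.PyCapsule_GetName.argtypes = [ctypes.py_object]

capsule = np.arange(3).__dlpack__()
name = ctypes.pythonapi.PyCapsule_GetName(capsule)
print(name)  # b'dltensor' -- renamed to b'used_dltensor' once consumed
```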

We need to also check for the DType being native. I assume unaligned is OK?

Unaligned is OK, see #19013 (comment). How does one check for native dtypes?

Maybe we should just explicitly list them? We cannot include longdouble here, since the C99 standard doesn't specify it (I am not even sure IBM double-double is strictly standard conform). And I assume Float in DLPack means IEEE compatible float format of defined size. float16, float32, and float64 are fine, for the others maybe DLPack should add an extended float field or so, if we want it?

I just don't see longdouble being much good, considering it's already system-dependent on the CPU alone...

Can we verify this?

@hameerabbasi hameerabbasi force-pushed the dlpack branch 4 times, most recently from 835c638 to e6d2195 on May 25, 2021 11:41
@hameerabbasi
Contributor Author

hameerabbasi commented May 25, 2021

Can we verify this?

Ignore, it does mean IEEE float. Does np.float80 or np.float128 not respect the IEEE convention? In that case, it makes sense to limit both complex and floats. Otherwise, the receiving library can deny the incoming datatype by reading the bits field.

@hameerabbasi hameerabbasi requested a review from seberg May 25, 2021 12:33
@eric-wieser
Member

np.float80 etc are all just aliases for np.longdouble based on what C produces for sizeof(long double)

@seberg
Member

seberg commented May 25, 2021

I guess if longdouble is 64-bit, you can be sure it's just double. That we could check for if we want. There are a couple of macros in numpy/core/src/common/npy_fpmath.h such as HAVE_LDOUBLE_IEEE_QUAD_BE. I don't know how well they work/are tested, but I guess technically you could check for that to catch the rare cases where we have proper quad precision.

Technically, most current hardware uses something like 80-bit (IEEE?) extended precision stored as 96 or 128 bits. But DLPack can't describe it (it's an old thing; our float128 should really be float128_80 or something other than float128...). Something that will eventually be important if quad precision becomes a real thing.
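
This can be checked from Python (a hedged sketch; the exact values are platform-dependent, which is the whole point): `np.finfo(np.longdouble).nmant` reports the mantissa bits, and on x86 it is typically 63 (80-bit extended precision padded out to 12 or 16 bytes), not the 112 bits a genuine IEEE binary128 would have.

```python
import numpy as np

# Platform-dependent: on x86 this typically prints nmant=63 with an
# itemsize of 12 or 16 bytes -- padding, not real IEEE binary128.
ld = np.finfo(np.longdouble)
print(ld.nmant, np.dtype(np.longdouble).itemsize)

# For comparison, float64 is genuine IEEE binary64 with 52 mantissa bits:
print(np.finfo(np.float64).nmant)  # 52
```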

@seberg
Member

seberg commented May 25, 2021

How does one check for native dtypes?

You can use PyDataType_ISNOTSWAPPED.
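
A Python-level analogue of the C-side PyDataType_ISNOTSWAPPED check (a sketch, not the PR's actual C code) is `dtype.isnative`, which is True for native-endian or byte-order-irrelevant dtypes: the condition an exporter needs before handing out raw memory.

```python
import numpy as np

native = np.dtype('i4')
swapped = native.newbyteorder('S')  # 'S' forces the opposite byte order

# A DLPack exporter must reject the swapped dtype, since the consumer
# would otherwise misinterpret the raw bytes.
print(native.isnative, swapped.isnative)  # True False
```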

If `byte_offset = 0` is forced anyway, there is no point in trying
to preserve a previous `data` information from the capsule.
(And probably it should have used a base array also, and not just
a base DLPack capsule, anyway.)
@seberg
Member

seberg commented Nov 9, 2021

I will put this in once the tests pass, but will then open a few issues about dlpack, with pretty harmless bugs or general followups. The release notes get reviewed before the release anyway.

Thanks everyone and especially Matti for doing the follow-ups.

@seberg seberg merged commit 0de29c5 into numpy:main Nov 9, 2021
@rgommers
Member

rgommers commented Nov 9, 2021

🎉 thanks @mattip, @hameerabbasi, @seberg, @leofang & all other reviewers!

@leofang
Contributor

leofang commented Jan 3, 2022

Silly question: Can someone remind me why we renamed it to np._from_dlpack() (prefixed with an underscore)?

@seberg
Member

seberg commented Jan 4, 2022

This was pushed in about a week before branching, with things in DLPack itself not quite settled (as I remember). Adding the underscore was a way to avoid having to discuss all of that in a short time frame.

@leofang
Contributor

leofang commented Jan 4, 2022

I see, thanks for the context @seberg. Sounds like a decision made offline (which I am fine with). Is there any plan to make it a public API (by removing the underscore)?

@seberg
Member

seberg commented Jan 4, 2022

Yes, we can discuss this obviously. It may have been discussed in a meeting, in which case it is publicly noted in the numpy/archive repo. I feel the PR had the underscore initially because the first step was to make it available for the _array_api namespace.

@mattip
Member

mattip commented Jan 4, 2022

The only mention of dlpack in the docs is in the release notes, so maybe we could discuss making it public together with some documentation.

@leofang
Contributor

leofang commented Jan 5, 2022

Thanks, guys. I created an issue to track this change: #20743.

@vadimkantorov

Does numpy._from_dlpack support construction from a raw capsule?

E.g. PyTorch supports both raw capsule input and an object having __dlpack__ instance method:
https://pytorch.org/docs/stable/generated/torch.from_dlpack.html:
If ext_tensor is a tensor (or ndarray) object, it must support the __dlpack__ protocol (i.e., have a ext_tensor.__dlpack__ method). Otherwise ext_tensor may be a DLPack capsule, which is an opaque PyCapsule instance, typically produced by a to_dlpack function or method.



@hameerabbasi
Contributor Author

@vadimkantorov If I understand correctly, direct capsules are tricky and shouldn't be passed around by library consumers.

The current recommendation is that libraries should only support other array objects in from_dlpack.

@vadimkantorov

vadimkantorov commented Mar 11, 2025

@hameerabbasi PyTorch's https://pytorch.org/docs/stable/dlpack.html#torch.utils.dlpack.to_dlpack actually returns an opaque PyCapsule object... What would be nice is an example of producing a DLPack struct from a C function (via ctypes) or a PyCapsule from PyTorch, and then consuming it from NumPy, to showcase these interop situations.

I did hacks for some usecases like this in https://github.com/vadimkantorov/pydlpack and https://github.com/vadimkantorov/readaudio

@rgommers
Member

@vadimkantorov torch.utils.dlpack.to_dlpack is legacy, just don't use it. PyTorch tensors have a __dlpack__ method, so you can simply use np.from_dlpack(a_tensor).

@vadimkantorov

The use case where DLPack is used for marshalling an array from C land (a function can be bound with DLPack) into a NumPy array is also very useful: plain C functions with DLPack structures are much simpler to compile and manage than building torch/numpy extensions.

In my example I used DLPack-producing C functions doing audio decoding, and then ingesting such a DLPack ctypes-bound structure in NumPy. Does NumPy also provide such ctypes binding types for DLPack structures?

@rgommers
Member

It doesn't. But if you already have a raw capsule, it should be quite easy to wrap it in a thin pure Python class with __dlpack__ and __dlpack_device__ methods - that should be all you need.
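
The thin wrapper @rgommers describes can be sketched in a few lines (a hypothetical adapter, assuming a CPU capsule, so `(1, 0)` for kDLCPU, and a NumPy with public `np.from_dlpack`; the class name `DLPackWrapper` is made up here):

```python
import numpy as np

class DLPackWrapper:
    """Adapter exposing a raw DLPack capsule via the protocol methods."""

    def __init__(self, capsule, device=(1, 0)):  # (kDLCPU, device_id) assumed
        self._capsule = capsule
        self._device = device

    def __dlpack__(self, stream=None):
        # A capsule is one-shot: the consumer takes ownership and renames it.
        return self._capsule

    def __dlpack_device__(self):
        return self._device

# Round-trip: export an ndarray to a capsule, wrap it, consume it again.
x = np.arange(6.0)
y = np.from_dlpack(DLPackWrapper(x.__dlpack__()))
print(np.shares_memory(x, y))  # True -- DLPack exchange is zero-copy
```

Note the one-shot caveat: once `np.from_dlpack` consumes the capsule, the wrapper cannot be consumed a second time.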

@vadimkantorov

vadimkantorov commented Mar 19, 2025

For some reason (maybe inspired with original PyTorch support), NVidia's Triton Inference Server API for to_dlpack also returns capsule objects:
https://docs.nvidia.com/deeplearning/triton-inference-server/user-guide/docs/python_backend/README.html#interoperability-and-gpu-support


Labels

01 - Enhancement · component: numpy._core · triage review (Issue/PR to be discussed at the next triage meeting)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

DLPack support for NumPy

9 participants