Add support for PyTorch and TensorFlow in expressions #509

@FrancescAlted

Right now, passing a PyTorch tensor as an operand to blosc2.lazyexpr does not work:

import blosc2
import numpy as np
import torch

N = 10
shape_a = (N, N, N)
matrix_numpy = np.ones(N ** 3).reshape(shape_a)
matrix_a = blosc2.asarray(matrix_numpy, urlpath="a.b2nd", mode="w")
matrix_b = blosc2.asarray(matrix_numpy, urlpath="b.b2nd", mode="w")
matrix_c = torch.ones(shape_a)

# Create a lazy expression object
sexpr = "matrix_a.sum() + matrix_b"  # this works
# lexpr = blosc2.lazyexpr(sexpr)  # this works
lexpr = blosc2.lazyexpr(sexpr, operands={"matrix_a": matrix_a, "matrix_b": matrix_c})  # this does not
print(lexpr[:])

This fails with:

Traceback (most recent call last):
  File "/Users/faltet/blosc/python-blosc2/torch-failure.py", line 15, in <module>
    lexpr = blosc2.lazyexpr(sexpr, operands={"matrix_a": matrix_a, "matrix_b": matrix_c})  # this does not
  File "/Users/faltet/blosc/python-blosc2/src/blosc2/lazyexpr.py", line 3477, in lazyexpr
    return LazyExpr._new_expr(expression, operands, guess=True, out=out, where=where, ne_args=ne_args)
           ~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/faltet/blosc/python-blosc2/src/blosc2/lazyexpr.py", line 3005, in _new_expr
    _operands[op] = blosc2.SimpleProxy(val)
                    ~~~~~~~~~~~~~~~~~~^^^^^
  File "/Users/faltet/blosc/python-blosc2/src/blosc2/proxy.py", line 613, in __init__
    self.chunks, self.blocks = blosc2.compute_chunks_blocks(
                               ~~~~~~~~~~~~~~~~~~~~~~~~~~~~^
        self.shape, chunks, blocks, self.dtype, **{"cparams": cparams}
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    )
    ^
  File "/Users/faltet/blosc/python-blosc2/src/blosc2/core.py", line 1512, in compute_chunks_blocks
    itemsize = cparams["typesize"] = np.dtype(dtype).itemsize
                                     ~~~~~~~~^^^^^^^
TypeError: Cannot interpret 'torch.float32' as a data type

The same should also work for TensorFlow tensors.
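
The traceback ends at np.dtype(dtype) in compute_chunks_blocks, which cannot interpret torch.float32. One possible direction, sketched here under assumptions and not as the actual fix, is to normalize framework dtypes to NumPy dtypes before they reach that call. The helper name to_numpy_dtype and the FakeTorchDtype stand-in are hypothetical (the stand-in only lets the sketch run without PyTorch installed). The TensorFlow branch relies on the as_numpy_dtype attribute of tf.DType.

```python
import numpy as np


def to_numpy_dtype(dtype):
    """Best-effort conversion of a framework dtype to a NumPy dtype (hypothetical helper)."""
    try:
        # NumPy dtypes, Python scalar types and dtype strings pass through unchanged.
        return np.dtype(dtype)
    except TypeError:
        pass
    # TensorFlow's tf.DType exposes the matching NumPy type as `as_numpy_dtype`.
    np_type = getattr(dtype, "as_numpy_dtype", None)
    if np_type is not None:
        return np.dtype(np_type)
    # PyTorch dtypes print as "torch.float32"; keep the part after the last dot.
    # Dtypes with no NumPy equivalent (e.g. bfloat16) still raise TypeError here.
    return np.dtype(str(dtype).rsplit(".", 1)[-1])


class FakeTorchDtype:
    """Stand-in that prints like torch.float32, so no PyTorch import is needed."""

    def __repr__(self):
        return "torch.float32"


print(to_numpy_dtype(FakeTorchDtype()).itemsize)  # prints 4
```

With something along these lines, the typesize computation would get a valid itemsize for a torch or TensorFlow operand. Reading the tensor data itself would still need handling on top of this.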

Labels: enhancement (New feature or request)