NumPy dtype weirdness breaks np.copy for vlen datasets (was: 64-bit problem (3rd party libs)) #217
Description
Original author: andrew.c...@gmail.com (September 22, 2011 01:03:32)
Via email, referring to the libs at
http://www.lfd.uci.edu/~gohlke/pythonlibs/
The code to reproduce is below; the last two statements fail. Although it looks like a NumPy failure, I have tested multiple NumPy versions, and as soon as I use fixed-length strings as the datatype it works fine.
The unique statement fails with:
Traceback (most recent call last):
  File "...test2.py", line 23, in <module>
    print np.unique(dset['text'])
  File "...Python_x64\lib\site-packages\numpy\lib\arraysetops.py", line 160, in unique
    ar = ar.flatten()
ValueError: low level cast function is for unequal type numbers
The copy statement fails with:
Traceback (most recent call last):
  File "...test2.py", line 24, in <module>
    print np.copy(dset)
  File "...Python_x64\lib\site-packages\numpy\lib\function_base.py", line 818, in copy
    return array(a, copy=True)
ValueError: low level cast function is for unequal type numbers
Another issue I can reproduce: sparse selection on very large char datatypes causes the program to hang (see TEST_DTYPE_hangs below).
Thanks for any help with that! And by the way: a great package for hdf5!!!!
Chris
import h5py
import numpy as np

hfile = '...DB_test.hdf5'
fd = h5py.File(hfile, 'w')

# compound type with a variable-length string field (the failing case)
TEST_DTYPE = np.dtype([('text', h5py.special_dtype(vlen=str)),
                       ('id', '<i8')])
# same layout with a fixed-length string field (works)
TEST_DTYPE_working = np.dtype([('text', 'a10000'),
                               ('id', '<i8')])
# very large fixed-length string field (sparse selection hangs)
TEST_DTYPE_hangs = np.dtype([('text', 'a100000000'),
                             ('id', '<i8')])

testdata = np.array(
    [('Text a', 2),
     ('Text a', 1),
     ('Text b', 2)],
    dtype=TEST_DTYPE)

# create groups
grp = fd.create_group('/TEST')
# create dataset
dset = grp.create_dataset('testdata', dtype=TEST_DTYPE, maxshape=(None,), data=testdata, chunks=True)

print len(dset)
print np.unique(dset['text'])  # fails with the ValueError shown above
print np.copy(dset)            # fails with the ValueError shown above
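For anyone hitting the same error, here is a minimal sketch of the fixed-length workaround mentioned above. It is plain NumPy only and makes some assumptions: an ordinary object field stands in for h5py.special_dtype(vlen=str), and the names VLEN_LIKE and to_fixed_length are hypothetical helpers, not part of h5py or NumPy. It has not been tried against the affected 64-bit builds, so treat it as an illustration rather than a confirmed fix.

```python
import numpy as np

# Hypothetical stand-in for the vlen compound type in the report: a plain
# object field replaces h5py.special_dtype(vlen=str), so no h5py is needed.
VLEN_LIKE = np.dtype([('text', object), ('id', '<i8')])

records = np.array(
    [('Text a', 2),
     ('Text a', 1),
     ('Text b', 2)],
    dtype=VLEN_LIKE)


def to_fixed_length(rec):
    """Return a copy of rec whose 'text' field is a fixed-length byte string.

    The field width is the length of the longest encoded value
    (at least 1 byte, so an empty input still gives a valid dtype).
    """
    encoded = [t.encode('utf-8') for t in rec['text'].tolist()]
    width = max([len(b) for b in encoded] + [1])
    fixed_dtype = np.dtype([('text', 'S%d' % width), ('id', '<i8')])
    out = np.empty(rec.shape, dtype=fixed_dtype)
    out['text'] = encoded
    out['id'] = rec['id']
    return out


fixed = to_fixed_length(records)
print(np.unique(fixed['text']))  # distinct values of the fixed-length field
print(np.copy(fixed))            # full copy of the converted records
```

The cost is that every row stores the width of the longest string, so this only makes sense when the strings are of similar, moderate length.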
Original issue: http://code.google.com/p/h5py/issues/detail?id=217