Change in dtype returned in h5py v2.10.0 #1307

@ivirshup

On *nix operating systems with Python 3.6 and 3.7, the dtype reported for fixed-length ASCII strings stored in compound dtypes changed between h5py v2.9 and v2.10.

Here's a small example, writing the file with v2.9 (though it doesn't seem to matter which version the file is written with):

import h5py
import numpy as np
import pandas as pd

d = pd.DataFrame({"a": [b"abc", b"def"]})
with h5py.File("test.h5", "a") as f:
    f["x"] = d.to_records(column_dtypes={"a": "S3"})

Now reading the file back. Here's the code used:

import h5py
with h5py.File("test.h5", "r") as f:
    print(f["x"].dtype.descr)

Reading with h5py v2.9 prints:

[('index', '<i8'), ('a', '|S3')]

v2.10:

[('index', '<i8'), ('a', ('|S3', {'h5py_encoding': 'ascii'}))]
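
For anyone comparing `descr` output across versions, here is a rough sketch of one way to get the v2.9-style description back. It assumes the only difference is the metadata dict attached to the field's dtype, as the output above suggests. `strip_field_metadata` is a hypothetical helper written for this illustration, not part of h5py or NumPy, and the dtype is built by hand with NumPy so the snippet runs without a file.

```python
import numpy as np


def strip_field_metadata(dtype):
    """Hypothetical helper: rebuild `dtype` without per-field metadata.

    Sketch only: it ignores custom field offsets, field titles, and
    subarray fields, none of which appear in the example above.
    """
    if dtype.fields is None:
        # Plain dtype: rebuilding from the type string drops any metadata.
        return np.dtype(dtype.str)
    # Compound dtype: rebuild each field, recursing into nested compounds.
    return np.dtype(
        [(name, strip_field_metadata(dtype.fields[name][0])) for name in dtype.names]
    )


# Hand-built stand-in for the dtype that v2.10 reports for f["x"].
tagged = np.dtype(
    [("index", "<i8"), ("a", np.dtype("S3", metadata={"h5py_encoding": "ascii"}))]
)

plain = strip_field_metadata(tagged)
print(plain.descr)
```

The metadata itself is still reachable on the original dtype through `tagged.fields["a"][0].metadata`, so this only changes what `descr` shows.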

It's very possible this is intentional and was done for good reasons. It is a change in behavior, though, and I think it should at least be documented in the change notes.
