DataFrameSerializer doesn't support pandas NullableDataTypes #76

@AndyBryson

Description

It's nice to be able to keep a column as (e.g.) an integer while still allowing it to have missing values, which is what the pandas nullable dtypes provide.

from influxdb_client_3.write_client.client.write.dataframe_serializer import DataframeSerializer
from influxdb_client_3 import PointSettings
import pandas as pd

# make an example dataframe with nullable dtypes
df = pd.DataFrame({
    "bool": [True, False, None],
    "int": [1, 2, None],
    "float": [1.0, 2.0, None],
    "str": ["a", "b", None],
})
df['bool'] = df['bool'].astype(pd.BooleanDtype())
df['int'] = df['int'].astype(pd.Int64Dtype())
df['float'] = df['float'].astype(pd.Float64Dtype())
df['str'] = df['str'].astype(pd.StringDtype())

df.index = pd.to_datetime(["2021-01-01", "2021-01-02", "2021-01-03"])

ps = PointSettings()

serializer = DataframeSerializer(df, point_settings=ps, precision="ms", data_frame_measurement_name="test")

lines = serializer.serialize()

This raises TypeError: boolean value of NA is ambiguous, which is triggered by _not_nan. If we fix that, there are further issues in the string formatting code that must also be fixed.
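To illustrate the error independently of the serializer, here is a minimal sketch using only pandas. It assumes the failing check behaves like a self-equality test (`x == x`); `naive_not_nan` and `na_safe_not_missing` are hypothetical names for illustration, not the library's code. With `pd.NA`, the comparison itself returns `pd.NA`, and evaluating that in a boolean context is what raises the `TypeError`. A check based on `pd.isna` handles both `NaN` and `pd.NA`.

```python
import pandas as pd


def naive_not_nan(x):
    # Hypothetical stand-in for a self-equality NaN check.
    # Works for float NaN (NaN != NaN), but pd.NA == pd.NA yields pd.NA.
    return x == x


def na_safe_not_missing(x):
    # pd.isna returns True for None, float NaN and pd.NA alike.
    return not pd.isna(x)


def bool_of_na_raises():
    # Coercing the pd.NA result to bool reproduces the reported TypeError.
    try:
        bool(naive_not_nan(pd.NA))
    except TypeError:
        return True
    return False


# Filtering a nullable integer array with the NA-safe check.
values = pd.array([1, 2, None], dtype="Int64")
kept = [int(v) for v in values if na_safe_not_missing(v)]

print(bool_of_na_raises())
print(kept)
```

This only addresses the missing-value check; as noted above, the string formatting code would still need separate changes.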
