Closed
Labels
dataframe, good second issue (clearly described, educational, but less trivial than "good first issue")
Description
Converting a datetime64[ns] column to int64 with astype raises a FutureWarning in both Pandas and Dask. The warning says this cast is deprecated and recommends view instead.
However, view does not appear to be implemented in dask.dataframe.core, so the suggested replacement cannot be applied to a Dask Series.
This is the FutureWarning from Pandas:
FutureWarning: casting datetime64[ns] values to int64 with .astype(...) is deprecated and will raise in a future version. Use .view(...) instead.
df['TIMESTAMP_astype'] = df['TIMESTAMP'].astype('int64')
Setup:
import pandas as pd
import dask.dataframe as dd
data = {
'TIMESTAMP': [
'2021-11-27 00:05:02.175274',
'2021-11-27 00:05:05.205596',
'2021-11-27 00:05:29.212572',
'2021-11-27 00:05:25.708343',
'2021-11-27 00:05:47.714958',
]
}
df = pd.DataFrame(data)
df['TIMESTAMP'] = pd.to_datetime(df['TIMESTAMP'])
ddf = dd.from_pandas(df, npartitions=1)
ddf['TIMESTAMP'] = dd.to_datetime(ddf['TIMESTAMP'])
Pandas code:
df['TIMESTAMP_astype'] = df['TIMESTAMP'].astype('int64') # works - throws FutureWarning
df['TIMESTAMP_view'] = df['TIMESTAMP'].view('int64') # works
assert (df['TIMESTAMP_astype'] == df['TIMESTAMP_view']).all() # True
Dask code:
ddf['TIMESTAMP_astype'] = ddf['TIMESTAMP'].astype('int64') # works - throws FutureWarning
ddf['TIMESTAMP_view'] = ddf['TIMESTAMP'].view('int64') # error - below
view is not implemented in dask.dataframe.core:
AttributeError: 'Series' object has no attribute 'view'
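Possible workaround until view is available on a Dask Series: a minimal sketch, assuming the column is tz-naive and that reinterpreting the underlying NumPy array is acceptable. The helper name datetime_to_int64 is hypothetical and not part of Pandas or Dask; it relies only on numpy.ndarray.view, which reinterprets the 8-byte values as integers.

```python
import numpy as np
import pandas as pd


def datetime_to_int64(s: pd.Series) -> pd.Series:
    """Hypothetical helper: return int64 nanoseconds since the epoch.

    Assumes a tz-naive datetime Series.
    """
    # Normalize to nanosecond resolution so the integers have a fixed unit.
    values = s.astype("datetime64[ns]").to_numpy()
    # ndarray.view reinterprets the same bytes as int64 instead of casting.
    return pd.Series(values.view("int64"), index=s.index, name=s.name)


df = pd.DataFrame(
    {
        "TIMESTAMP": pd.to_datetime(
            [
                "2021-11-27 00:05:02.175274",
                "2021-11-27 00:05:05.205596",
                "2021-11-27 00:05:29.212572",
                "2021-11-27 00:05:25.708343",
                "2021-11-27 00:05:47.714958",
            ]
        )
    }
)
df["TIMESTAMP_view"] = datetime_to_int64(df["TIMESTAMP"])
```

On the Dask side, the same function could presumably be applied per partition with ddf['TIMESTAMP'].map_partitions(datetime_to_int64, meta=('TIMESTAMP', 'int64')). I have not checked whether this avoids the warning on every Pandas version.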