Closed
Labels
dataframe, good first issue
Description
I'm hoping to get an idea of how much memory a dask.dataframe will take up once I call .compute() on it.
My current approach is:
import dask.dataframe as dd
from dask.utils import format_bytes
ddf = dd.demo.make_timeseries(
start="2000-01-01",
end="2000-01-02",
dtypes={"x": float, "y": float, "id": int},
freq="10ms",
partition_freq="24h",
)
format_bytes(ddf.memory_usage(deep=True).sum().compute())
pandas has .info(), which reports this along with other information (and returns None).
I see that dask.dataframe has an info() method too, but I'm not sure it works as expected:
ddf.info()
ddf.info().compute()
AttributeError: 'NoneType' object has no attribute 'compute'
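The error is consistent with info() printing its summary and returning None, as it does in pandas, so there is nothing to call .compute() on. A minimal sketch using plain pandas shows the None return and how the printed report can be captured through the buf argument. The small frame and the variable names are made up for illustration, and whether dask's info() reports memory in the same way is not confirmed here.

```python
import io

import pandas as pd

# Hypothetical small frame standing in for a computed result.
df = pd.DataFrame({"x": [0.1, 0.2, 0.3], "y": [1.0, 2.0, 3.0], "id": [1, 2, 3]})

# info() writes its report to a buffer (stdout by default) and returns None.
buf = io.StringIO()
result = df.info(buf=buf, memory_usage="deep")
report = buf.getvalue()

# Total bytes, computed the same way as the memory_usage(deep=True).sum() call above.
total_bytes = int(df.memory_usage(deep=True).sum())

print(result)       # None, which is why chaining .compute() fails
print(report)       # the text that info() would otherwise print
print(total_bytes)
```

If only the number is needed, the memory_usage(deep=True).sum() approach already gives it directly; info() is mainly useful for the human-readable report.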

