Closed
Description
When executing a pipeline with analyze()->withColumnStatistics(), each column's statistics hold a reference to the first entry from the dataframe.
This is problematic because entries can change from row to row: for example, a column might be null in the first row and a date_time in the second. In that case the column statistics will hold a reference to a nullable StringEntry with FROM_NULL metadata.
The entry references should be removed; they do not serve any purpose there.
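
To make the problem concrete, here is a minimal, self-contained sketch. It does not use Flow's actual classes: SketchEntry, FirstEntryColumnStats, and TypeOnlyColumnStats are hypothetical names invented for illustration, and the type labels are plain strings. It only assumes the behavior described above, namely that the statistics keep whichever entry they see first for a column.

```php
<?php

declare(strict_types=1);

// Hypothetical stand-in for a dataframe entry; not Flow's actual Entry class.
final class SketchEntry
{
    public function __construct(
        public readonly string $name,
        public readonly string $type,
        public readonly mixed $value,
    ) {
    }
}

// Mirrors the reported behavior: the first entry seen for a column is kept
// and later rows never replace it.
final class FirstEntryColumnStats
{
    /** @var array<string, SketchEntry> */
    private array $entries = [];

    public function observe(SketchEntry $entry): void
    {
        $this->entries[$entry->name] ??= $entry;
    }

    public function typeOf(string $column): string
    {
        return $this->entries[$column]->type;
    }
}

// One possible alternative (an assumption, not the project's fix):
// hold no entry at all, only the column name and the types observed.
final class TypeOnlyColumnStats
{
    /** @var array<string, array<string, int>> */
    private array $types = [];

    public function observe(SketchEntry $entry): void
    {
        $seen = $this->types[$entry->name][$entry->type] ?? 0;
        $this->types[$entry->name][$entry->type] = $seen + 1;
    }

    /** @return list<string> */
    public function typesOf(string $column): array
    {
        return array_keys($this->types[$column] ?? []);
    }
}

// First row is null (inferred as a string-from-null entry), second is a date.
$rows = [
    new SketchEntry('created_at', 'string (from null)', null),
    new SketchEntry('created_at', 'datetime', new DateTimeImmutable('2024-01-01')),
];

$first = new FirstEntryColumnStats();
$typeOnly = new TypeOnlyColumnStats();

foreach ($rows as $entry) {
    $first->observe($entry);
    $typeOnly->observe($entry);
}
```

In this sketch, $first->typeOf('created_at') still reports the string-from-null type after the second row, while $typeOnly->typesOf('created_at') lists both observed types without retaining any entry object.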