Fork arrow2 and get rid of polars #4789
Closed
Labels: blocked, dependencies, enhancement, 🏹 arrow, 📉 performance
While we want to migrate from the `arrow2` crate to `arrow` (#3741), it is a big task that we would rather punt on for now. It is technical debt, but the debt is not going to grow significantly, and the gains don't justify the rabbit hole of pain it could turn into.

One of the major reasons to migrate away from `arrow2` is that `DataType` has a huge overhead, especially when cloned.

We have a PR to fix it (jorgecarleitao/arrow2#1469), but it is unmerged because `arrow2` is unmaintained.

So: we fork `arrow2` as `re_arrow2`, merge our PR, and solve our immediate memory issue.

Since `polars` requires `arrow2`, we need to stop using it. We only have it for a few tests.

We should revisit migrating away from `arrow2` when we start exposing arrow things to users (e.g. supporting data queries in the SDK) and/or when we want to interface with a data frame crate.
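
To illustrate the `DataType` cloning overhead mentioned above, here is a minimal sketch of why cloning a deeply nested type description can be costly, and how reference counting can make it cheap. This issue does not show what the linked PR changes, so this is not a description of it. `OwnedType`, `SharedType`, `sample_owned`, `sample_shared`, and `field_refs` are hypothetical names made up for this example and are not part of `arrow2`.

```rust
use std::sync::Arc;

// Hypothetical, simplified stand-in for a nested schema type.
// This is NOT arrow2's real `DataType`.
#[derive(Clone, Debug, PartialEq)]
enum OwnedType {
    Int64,
    Utf8,
    List(Box<OwnedType>),
    Struct(Vec<(String, OwnedType)>),
}

// Same shape, but nested parts sit behind `Arc`, so a clone only
// bumps a reference count instead of copying the whole tree.
#[derive(Clone, Debug, PartialEq)]
enum SharedType {
    Int64,
    Utf8,
    List(Arc<SharedType>),
    Struct(Arc<Vec<(String, SharedType)>>),
}

fn sample_owned() -> OwnedType {
    OwnedType::Struct(vec![
        ("id".to_owned(), OwnedType::Int64),
        ("tags".to_owned(), OwnedType::List(Box::new(OwnedType::Utf8))),
    ])
}

fn sample_shared() -> SharedType {
    SharedType::Struct(Arc::new(vec![
        ("id".to_owned(), SharedType::Int64),
        ("tags".to_owned(), SharedType::List(Arc::new(SharedType::Utf8))),
    ]))
}

// Number of handles currently sharing a struct's field list
// (0 for non-struct types).
fn field_refs(t: &SharedType) -> usize {
    match t {
        SharedType::Struct(fields) => Arc::strong_count(fields),
        _ => 0,
    }
}

fn main() {
    // Deep clone: allocates a new Vec, new Strings, and a new Box.
    let owned = sample_owned();
    let owned_copy = owned.clone();
    assert_eq!(owned, owned_copy);

    // Shallow clone: both values point at the same field list.
    let shared = sample_shared();
    let shared_copy = shared.clone();
    assert_eq!(shared, shared_copy);
    assert_eq!(field_refs(&shared), 2);
}
```

The trade-off in this sketch is that shared data is immutable unless it is copied first (for example with `Arc::make_mut`), which is usually acceptable for type descriptions that are created once and cloned many times.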