Yeah, the thing that struck me about Vortex when I first looked into it was that you have to convert it to a PyArrow Dataset to use it with a query engine. You can read it directly with the Vortex library, but to read it with anything else it basically just became Arrow for me. Maybe that's changed; I need to dig into it more.
Is DataFusion still a thing at this point?
DataFusion is starting to have a little renaissance in new DB tools.
It's the underlying engine for LakeSail, which is making decent progress as a Rust-based Spark successor: https://lakesail.com/