Package some form of "minimal" libarrow?

I just saw that pandas is [considering](https://github.com/pandas-dev/pandas/pull/52711) making pyarrow a hard dependency. This would substantially increase the footprint of every pandas install (which is pandas' choice to make), and even more so on conda-forge, because we package more bindings than the wheels do[^1].

This is not the first time this topic has come up; cf. the exchange quoted below:

[^1]: Given the difficulty of building arrow, any bindings we don't package are effectively uninstallable; hence the maximalist approach to packaging on this feedstock, and in conda-forge in general.

> > @h-vetinari: That said, there's a larger theme here that arrow keeps growing non-trivial dependencies. I guess we could introduce a separate output for a "minimal" arrow (`libarrow-core`?)
>
> @jorisvandenbossche: That's indeed something we have to look at long term, but probably good for a separate issue? (it seems to me that regardless of that, cudatoolkit should never be a dependency for the CPU version?) 
> libarrow itself actually already exists of multiple shared libraries (libarrow, libarrow_dataset, libarrow_flight, etc). So that could be a first way to split it into multiple packages that should be simpler (eg libarrow_flight has some extra dependencies that are not needed for libarrow itself, such as UCX). On the Arrow C++ side itself, we are also working on further splitting the core libarrow in more shared libraries.

_Originally posted by @jorisvandenbossche in https://github.com/conda-forge/arrow-cpp-feedstock/issues/962#issuecomment-1432924187_
            

Package some form of "minimal" libarrow? #1035