Listing all optional requirements of Dask #3456

Closed
jakirkham wants to merge 1 commit into dask:master from jakirkham:req_all

Conversation

@jakirkham
Member

Includes all requirements of Dask without version constraints to start. Would like to fill these in as well. This should help us better understand what Dask needs for a complete working environment.

Note: Some things are needed only at build time, like setuptools (i.e. to install dask), or only for updating the vendored versioning code, like versioneer. They are not needed outside of that, so they have been left out.
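
As a rough illustration of what "a complete working environment" could mean in practice, here is a minimal sketch that reports which optional modules cannot be found in the current environment. The `OPTIONAL_MODULES` list and the `find_missing` helper are hypothetical. The names are taken only from the file excerpts visible in this review, and import names are assumed to match the listed package names, which is not always true.

```python
import importlib.util

# Hypothetical list: names copied from the file excerpts shown in this review.
# The full 15-line file from the PR is not reproduced here.
OPTIONAL_MODULES = [
    "bcolz", "h5py", "matplotlib", "moto",
    "partd", "pytest", "requests", "s3fs",
]


def find_missing(modules):
    """Return the top-level module names that cannot be located."""
    missing = []
    for name in modules:
        # find_spec returns None when a top-level module is not installed.
        if importlib.util.find_spec(name) is None:
            missing.append(name)
    return missing


if __name__ == "__main__":
    for name in find_missing(OPTIONAL_MODULES):
        print(f"optional dependency not found: {name}")
```

This only checks that a module can be located. It does not import it or check versions, which fits the PR's starting point of listing requirements without version constraints.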

@jakirkham
Member Author

cc @mrocklin

@mrocklin
Member

mrocklin commented May 2, 2018

Thanks for doing this!

Can I suggest that we rename this to something like requirements-optional.txt? I feel like listing hard requirements in requirements.txt is enough of a pattern that people might incorrectly assume that sqlalchemy, tornado, etc. are hard requirements.

@jakirkham
Member Author

Sure. Had also thought about requirements-all.txt and requirements-full.txt. No strong preference on any of these.

@mrocklin
Member

mrocklin commented May 2, 2018

I like requirements-optional because I think it is the most clear that users don't need to install them. I'm mostly optimizing here to reduce the number of issues we receive from confused users.
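
For readers who want to consume such a file programmatically, here is a minimal sketch of reading requirements-style text into a list of names. The `parse_requirements` helper is hypothetical and deliberately simple: it assumes one bare package name per line, as in this PR, and does not handle pip options, URLs, or environment markers.

```python
def parse_requirements(text):
    """Return entries from requirements-style text, skipping blanks and comments."""
    names = []
    for line in text.splitlines():
        # Drop anything after '#'. This is a simplification that would
        # mangle URL requirements containing fragments.
        line = line.split("#", 1)[0].strip()
        if line:
            names.append(line)
    return names


# Excerpt only: these four names appear in the diff context quoted in this review.
sample = """\
# optional dependencies (excerpt)
bcolz
h5py
matplotlib
moto
"""

if __name__ == "__main__":
    print(parse_requirements(sample))
```

Keeping the optional list in its own file, separate from requirements.txt, means a tool like this never mixes hard and optional requirements by accident.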

bcolz
h5py
matplotlib
moto
Member

Do we want to include testing dependencies like moto?

Member Author

Debated it, but included them initially as that was easier. Wasn't obvious to me where to draw the line. Please feel free to advise.

partd
pytest
requests
s3fs
Member

Scipy and gcsfs are missing?

mrocklin mentioned this pull request May 2, 2018
mrocklin changed the title from "Listing of all requirements of Dask" to "Listing all optional requirements of Dask" May 2, 2018
moto
pandas_datareader
partd
pytest
Member

Also, what about optional dependencies of other projects that users might run into, e.g. pytables for dask.dataframe (an implicit dependency via pandas)?

Member Author

Personally don't want to go down this road. It's too hard to tell what those might be and this goes far afield fast.

Member

I agree that it becomes a slippery slope; however, I also suspect that pytables is more likely to be an issue for users than scipy is.

If the goal is to avoid packaging issues then it probably makes sense to also raise issues throughout the community? It seems odd to be the only ones doing this.

Member Author

Sorry, I disagree. As we are not using pytables directly, it should not be a dependency. If we are concerned about how dask affects pytables, we should consider testing against it. At that point it becomes a testing dependency and can be listed for that reason.

To put this more generally, the reasoning should not be "users may want to use this with Dask, so it should be a dependency." Instead, the reasoning should be "we aim to be compatible with this useful library through our testing, and signify this already-present commitment by listing it as a testing dependency." IOW dependency comes after usage, not before.

Member Author

> If the goal is to avoid packaging issues then it probably makes sense to also raise issues throughout the community? It seems odd to be the only ones doing this.

This happens often through engagement with packaging ecosystems like conda-forge and Linux distros. We also encourage projects to package licenses and changelogs, and point out other issues in a library's metadata or build system. In many cases, CMakeLists.txt files have been written for projects by Windows users/maintainers in conda-forge seeking Windows support in libraries that had none. IOW this sort of stuff does happen.

@@ -0,0 +1,15 @@
bcolz
h5py
Member Author

Also needs graphviz here.

@jakirkham
Member Author

Closing as I don't plan to work on this at present.

jakirkham closed this Jun 12, 2019

2 participants