CI: test pyarrow 2.0 and nightly #7030

Merged
jsignell merged 9 commits into dask:master from jorisvandenbossche:ci-pyarrow
Jan 6, 2021

@jorisvandenbossche
Member

jorisvandenbossche commented Jan 5, 2021

Alternative to #6936 (cc @jrbourbeau): since there is an upstream-dev build, we can actually install pyarrow nightly in there (instead of only doing it in a test PR as #6936 does).
Additionally, I removed the !=2.0.0 pin. This will require some skips in the tests (TODO), but I think it's still worth running the rest of the tests with the latest released pyarrow version.

  build:
  runs-on: "ubuntu-latest"
- if: "contains(github.event.head_commit.message, 'test-upstream') || (github.repository == 'dask/dask' && github.event_name != 'pull_request')"
+ # if: "contains(github.event.head_commit.message, 'test-upstream') || (github.repository == 'dask/dask' && github.event_name != 'pull_request')"
Member Author

TODO: undo this change at the end
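
For readers unfamiliar with the workflow expression being commented out above, here is a minimal Python sketch of the same boolean logic. The function name `should_run_upstream` is hypothetical and only mirrors the `if:` expression shown in the diff; it is not part of dask or GitHub Actions. It assumes that GitHub's `contains()` compares strings case-insensitively and that a missing commit message behaves like an empty string.

```python
def should_run_upstream(commit_message, repository, event_name):
    """Illustrative mirror of the workflow `if:` expression (not real CI code)."""
    # Assumption: contains() on strings ignores case; None is treated as "".
    has_marker = "test-upstream" in (commit_message or "").lower()
    # Runs on the main repository for anything that is not a pull request.
    main_repo_non_pr = repository == "dask/dask" and event_name != "pull_request"
    return has_marker or main_repo_non_pr


# A regular pull request against dask/dask does not trigger the upstream build.
print(should_run_upstream("fix docs", "dask/dask", "pull_request"))
```

With the condition active, a pull request only triggers the upstream build when its head commit message contains `test-upstream`, which is presumably why the condition is disabled temporarily while this PR is being iterated on.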

  runs-on: ${{ matrix.os }}
  strategy:
- fail-fast: true
+ fail-fast: false
Member Author

TODO: undo this change at the end

  - tiledb-py
  - sqlalchemy
- - pyarrow>=0.14.0,!=2.0.0
+ - pyarrow=0.14.*
Member Author

I pinned this to 0.14 because that is the version that gets used in practice on all platforms (and putting it in the env file makes this a bit easier to see).

  - fsspec>=0.6.0
  - sqlalchemy
- - pyarrow>=0.14.0,!=2.0.0
+ - pyarrow=1.0.*
Member Author

same here, this is the version that got used in practice

  # For the dependencies. These will be re-installed later.
  - numpy
- - pandas
+ - pandas=1.0.*
Member Author

Pinned to an older pandas here, since otherwise I get the failures that are being handled by #6996.

  - fsspec>=0.6.0
  - sqlalchemy
- - pyarrow>=0.14.0,!=2.0.0
+ - pyarrow>=0.14.0
Member Author

Here I just leave the "greater than or equal" constraint, to get the latest released version of pyarrow, which at the moment is 2.0.0 (after removing the !=2.0.0 pin).
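
To illustrate why dropping the `!=2.0.0` clause changes which release gets picked, here is a small self-contained Python sketch. It is a deliberately simplified stand-in for a real solver: the helpers `parse`, `satisfies`, and `newest_match` are hypothetical, only the `>=` and `!=` operators are handled, and the list of available versions is made up for the example. A real conda solve also weighs every other package in the environment.

```python
def parse(version):
    """Turn '1.0.1' into (1, 0, 1) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def satisfies(version, spec):
    """Check a version against a spec such as '>=0.14.0,!=2.0.0'."""
    candidate = parse(version)
    for clause in spec.split(","):
        clause = clause.strip()
        if clause.startswith(">="):
            if candidate < parse(clause[2:]):
                return False
        elif clause.startswith("!="):
            if candidate == parse(clause[2:]):
                return False
        else:
            raise ValueError(f"unsupported clause: {clause}")
    return True


def newest_match(available, spec):
    """Return the highest available version allowed by the spec, or None."""
    matches = [v for v in available if satisfies(v, spec)]
    return max(matches, key=parse) if matches else None


# Hypothetical list of available releases, for illustration only.
available = ["0.14.0", "0.17.1", "1.0.1", "2.0.0"]
old_pick = newest_match(available, ">=0.14.0,!=2.0.0")
new_pick = newest_match(available, ">=0.14.0")
print(old_pick, new_pick)
```

Under these assumptions the old spec stops at the newest release below 2.0.0, while the bare lower bound lets 2.0.0 through.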

setup.cfg Outdated
# error:::dask[.*]
# error:::pandas[.*]
# error:::numpy[.*]
# error:::distributed[.*]]
Member Author

TODO: undo this change in the end

@jorisvandenbossche
Member Author

cc @jsignell apart from the graphviz edit, and some edits I need to undo at the end (if all is OK otherwise), I think this should be ready

In the upstream build, remaining failure is a sparse related one

@jsignell
Member

jsignell commented Jan 6, 2021

Great!

In the upstream build, remaining failure is a sparse related one

Yeah, I think I'd done some fixes around that one in the past. Let me ~~open a fresh PR for that~~ rebase #6896.

@jorisvandenbossche
Member Author

OK, everything seems to be passing here (except the 1 sparse failure), will revert the temporary edits

@jsignell merged commit 310072c into dask:master Jan 6, 2021
Comment on lines +10 to +14
# conda uninstall --force numpy pandas
# python -m pip install --no-deps --pre \
# -i https://pypi.anaconda.org/scipy-wheels-nightly/simple \
# numpy
# python -m pip install --pre pandas==1.1.0rc0
Member

Apologies for not catching this earlier, but I think we want to keep this for testing nightly versions of numpy and pandas

Though I just noticed we're pinning pandas==1.1.0rc0 which might be an oversight. Instead I would have expected us to install both numpy and pandas from https://pypi.anaconda.org/scipy-wheels-nightly/simple
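
As a rough illustration of the suggestion above (installing both numpy and pandas from the nightly index rather than pinning a release candidate), here is a Python sketch that only builds the pip command line. The helper `nightly_install_command` is hypothetical; the flags and the index URL are the ones already shown in the commented-out script lines. Whether a given package is actually published on that index is not verified here.

```python
import sys

# Index URL taken from the script lines quoted in this review thread.
NIGHTLY_INDEX = "https://pypi.anaconda.org/scipy-wheels-nightly/simple"


def nightly_install_command(packages, index_url=NIGHTLY_INDEX):
    """Build (but do not run) a pip command that installs pre-release wheels."""
    return [
        sys.executable, "-m", "pip", "install",
        "--no-deps",      # leave the conda-installed dependencies untouched
        "--pre",          # allow pre-release (nightly) versions
        "-i", index_url,  # use the nightly index instead of PyPI
        *packages,
    ]


cmd = nightly_install_command(["numpy", "pandas"])
print(" ".join(cmd))
```

Passing `cmd` to `subprocess.run(cmd, check=True)` would execute it; the sketch stops at printing so that it has no side effects.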

Member Author

Yes, we certainly want to keep that, but those builds were failing a lot and thus were not very useful (it was hard to see which errors still need to be fixed for a given dependency). The pandas ones are being fixed in #6996 (and the pandas nightly install is added back there), and the numpy ones in #6896, I think.

Member

Great! Thanks for clarifying

@jorisvandenbossche deleted the ci-pyarrow branch January 6, 2021 20:49
abduhbm pushed a commit to abduhbm/dask that referenced this pull request Jan 19, 2021
* CI: test pyarrow 2.0 and nightly

* temp disable warning->error

* try to force install pyarrow nightly

* try conda update

* temp pin graphviz

* add skip for pyarrow test with pyarrow==2.0.0 on windows

* also skip pyarrow-dataset engine on windows

* clean-up / undo temp edits
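
The commit messages above mention skipping a pyarrow test when pyarrow==2.0.0 is installed on Windows. The actual condition used in the PR is not shown in this conversation, so the following is only a sketch of what such a check might look like; the function name `skip_pyarrow_2_on_windows` and the exact-match comparison are assumptions.

```python
import sys


def skip_pyarrow_2_on_windows(pyarrow_version, platform=None):
    """Return True when a test should be skipped (illustrative condition only)."""
    # Default to the running interpreter's platform; Windows reports "win32".
    platform = sys.platform if platform is None else platform
    return platform.startswith("win") and pyarrow_version == "2.0.0"


# In a pytest suite, a condition like this would typically be passed to
# pytest.mark.skipif(condition, reason="...") on the affected test.
print(skip_pyarrow_2_on_windows("2.0.0", platform="win32"))
```

Keeping the skip this narrow lets the same test still run with 2.0.0 on other platforms and with other pyarrow versions on Windows.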