Skip to content

Conversation

@fjetter
Copy link
Contributor

@fjetter fjetter commented Sep 3, 2017

This way, the ParquetDataset accepts both S3FileSystem and LocalFileSystem objects as they are used in dask. By using issubclass, external libraries may write their own FS wrappers by inheriting from the arrow FS.

I tested the integration with dask and this will fix the issue blocking dask/dask#2527

@wesm
Copy link
Member

wesm commented Sep 3, 2017

Great! I opened https://issues.apache.org/jira/browse/ARROW-1455 so we can set up some kind of testing arrangement so we can validate that we don't break our integrations with Dask before Arrow and Dask releases -- this would be a trunk vs trunk test so I'm not comfortable adding it directly to Arrow's CI in case there is transient instability that might cause broken builds. If we get really energetic we could set up an automated build to report on the state of the integration tests

@wesm
Copy link
Member

wesm commented Sep 3, 2017

I'll make sure the S3 tests pass locally before merging this (using the --s3 flag). We ought to run these tests in CI; a bit of work to do this though: https://issues.apache.org/jira/browse/ARROW-1456

@fjetter
Copy link
Contributor Author

fjetter commented Sep 3, 2017

Good idea to add these tests. Although, I would prefer to have them run automatically in the CI. We could add it to travis and allow build failures for these tests only, c.f. https://docs.travis-ci.com/user/customizing-the-build#Rows-that-are-Allowed-to-Fail

@wesm
Copy link
Member

wesm commented Sep 3, 2017

If we can add them to Travis and it does not inflate build times significantly then I am all for it. Ideally we would run trunk-to-trunk. We are doing that with parquet-cpp, which does introduce occasional brittleness (but it's definitely been worth it)

@wesm
Copy link
Member

wesm commented Sep 4, 2017

Rebased

Copy link
Member

@wesm wesm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1. I verified that the S3 tests still pass. Will merge once the builds complete

@asfgit asfgit closed this in ec32013 Sep 4, 2017
@wesm wesm deleted the ARROW-1417 branch September 4, 2017 23:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants