ARROW-2300: [C++/Python] Integration test for HDFS #1889
Conversation
@kszucs this is odd; it looks like there might be a conflict with the new Anaconda 5 compilers. Can you rebase?
Codecov Report
```diff
@@            Coverage Diff             @@
##           master    #1889      +/-   ##
==========================================
- Coverage   84.39%   84.38%   -0.02%
==========================================
  Files         293      293
  Lines       44789    44786       -3
==========================================
- Hits        37799    37791       -8
- Misses       6963     6964       +1
- Partials      27       31       +4
```
Continue to review full report at Codecov.
@wesm I could use some hints: with libhdfs3 it still segfaults when setting … In the ListDirectory test case the first iteration goes well, but afterwards … There must be something obvious I can't find. Maybe this bad_alloc handling causes it?
Oof, sorry that you've run into this. libhdfs3 2.3.x on conda-forge introduced an ABI incompatibility, see https://issues.apache.org/jira/browse/ARROW-1465. In the meantime, you can pin to version 2.2.31 to make the problem go away.
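For reference, the suggested pin could look like the fragment below. This is a hypothetical sketch of a conda environment file, not the actual test-image setup from this PR; the environment name and the rest of the dependency list are assumptions.

```yaml
# Hypothetical environment fragment pinning libhdfs3 to the last
# release before the ABI break tracked in ARROW-1465.
name: arrow-hdfs-test
channels:
  - conda-forge
dependencies:
  - libhdfs3=2.2.31
```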
The discussion of the underlying problem is in https://issues.apache.org/jira/browse/ARROW-1445
Hmm, sooo these are the symptoms of ABI incompatibility. Pinning it, thanks!
Another error happens, but this one seems fixable. I'll investigate it tomorrow.
Thank you @kszucs. I will take this for a spin locally and then merge.
Checked this out and got this error:
Something must have gone wrong during checkout, because this no longer uses the Impala image.
Ok, let me nuke my branch and start over.
If I recall correctly, a couple of tests are failing, but the setup works properly.
I've fixed the Python errors, but the previously pasted …
@wesm Should we wrap hdfsListDirectory with a closure or store the driver/adapter type in …

```cpp
if (result == nullptr && errno == 2 && this->hdfsExists(fs, path)) {
  // libhdfs reports errno 2 (no such file or directory),
  // but the path actually exists, so reset errno to zero
  errno = 0;
}
```

It's a super hacky solution, I don't like it, but I can't think of an alternative. Perhaps we should pin a minimal …
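The workaround's logic can be sketched in isolation, in Python rather than the C++ of the patch. The libhdfs calls are stubbed out here, so `raw_list`, `exists`, and the behavior of the stubs are assumptions for illustration; only the errno-2 double-check mirrors the snippet above.

```python
import errno

def list_directory(raw_list, exists, path):
    """Sketch of the workaround: libhdfs returns NULL with errno set to
    ENOENT (2) for an *empty* directory, so double-check that the path
    exists before treating the NULL result as an error."""
    result, err = raw_list(path)  # stands in for hdfsListDirectory
    if result is None and err == errno.ENOENT and exists(path):
        # Path exists: an empty directory, not a missing one.
        return [], 0
    return result, err

# Hypothetical stubs mimicking the libhdfs quirk: "/empty" exists
# but listing it reports ENOENT.
def fake_list(path):
    return None, errno.ENOENT

def fake_exists(path):
    return path == "/empty"

print(list_directory(fake_list, fake_exists, "/empty"))    # ([], 0)
print(list_directory(fake_list, fake_exists, "/missing"))  # (None, 2)
```

The closure question in the comment above is about where to attach this check; the sketch simply parameterizes the two calls instead.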
Let me have a closer look today -- I thought I had run this with Hadoop 2.6 before and didn't hit these issues, so …
@kszucs do you see this? https://github.com/kszucs/arrow/blob/ARROW-2300/python/testing/hdfs/Dockerfile#L19 The scope of the JIRA issue was to fix …
Ohh, sorry @wesm! I've ported the hdfs integration tests from … Please use the following command:
Sweet, thanks =)
After we have all of the integration tests under the same roof (dask, spark, hdfs), we can simply submit them as crossbow tasks (similarly to the build tasks, but defined in a separate …)
…unning hdfs tests Change-Id: I8e2908c9ad2d81596f427858f2af7e2d151bfb1c
Change-Id: I44f57bc5b3ea28966e1562e404bdce65afe0cfab
Change-Id: I1f9879ab7eb150f38c66d29cfe8b41792a7b5cf8
Change-Id: I5290d60c15d51271c51df7565eb5fb1cadd4ff5e
@kszucs I applied your workaround for the errno 2 thing. I fiddled with this a little bit (including setting …). +1. I will merge this once the build completes to make sure I didn't mess up the lint.
Thanks @kszucs!