Conversation
|
Can one of the admins verify this patch? |
|
The test script might be a little bit too clever (calling itself), have you considered doing We should also change the comment that contain Other than that it looks good to me! @raulchen Can you check if there is any impact on the streaming system? |
python/ray/__init__.py
Outdated
| import ctypes | ||
| from ctypes import CDLL | ||
| CDLL(so_path, ctypes.RTLD_GLOBAL) | ||
| CDLL(so_path, ctypes.RTLD_LOCAL) |
There was a problem hiding this comment.
Should probably leave a comment here pointing to the issue this fixed
There was a problem hiding this comment.
Definitely will add!
There was a problem hiding this comment.
@ijrsvt Could you give an example which symbols conflict? We used linker script ray_exported_symbols.lds and ray_version_script.lds to limit exported symbols. And using RTLD_GLOBAL by purpose so that _streaming.so can using symbols in _raylet.so
There was a problem hiding this comment.
The conflicting symbols were in the GRPC library. The problem only showed up on Linux, and when pyarrow was imported before ray. The specific symbol that was segfaulting was: google::protobuf::internal::AssignDescriptors.
There was a problem hiding this comment.
Can you give more complete detailed log for this symbol segfault. I tink we didn't expose protobuf symbols in ray_exported_symbols.lds/ray_version_script.lds. And this import order issue exists before RTLD_GLOBAL.
There was a problem hiding this comment.
@chaokunyang Can you look more into this and why the symbols are conflicting despite the version linker script? This is blocking the current release I believe.
There was a problem hiding this comment.
@pcmoritz I'm looking into it. There may be some symbols exported unexpectedly despite of the version linker script, like template. If that's the reason, maybe we can use __attribute__((visibility(....))). If not, we'll need more time for this.
|
Test FAILed. |
|
Test FAILed. |
|
Test FAILed. |
|
Test FAILed. |
|
Test FAILed. |
|
Test PASSed. |
|
Test FAILed. |
|
Hi @ijrsvt , I used Python 3.7.5, ray 0.8.2, pyarrow 0.16.0 in a Ubuntu 19.10 docker container. I can't reproduce your issue. I also searched symbols in _raylet.so using I test ray 0.82 with pyarrow 0.16.0/0.15.0. Both of all works well. But when I use pyarrow 0.14.0, here is a segment fault. I think it's a issue of pyarrow? |
|
@chaokunyang I can't reproduce it either (and nor can the original bug reporter). I reverted the |
|
Test PASSed. |
|
@ijrsvt After merging the PR it is causing an error, can you look at this? |
| """ | ||
| import subprocess | ||
|
|
||
| TESTED_LIBRARIES = ["pyarrow"] |
There was a problem hiding this comment.
I'm mildly surprised that this test passes in our CI. Is pyarrow installed in our CI?
There was a problem hiding this comment.
This doesn't fail if pyarrow is installed, only if there is some strange symbol collision. There was initially a fix to limit the scope of exported symbols, but that was removed after it became un-reproducable.
There was a problem hiding this comment.
Right, I agree it doesn't fail if pyarrow is installed. I just thought that pyarrow isn't installed in our Travis tests.
There was a problem hiding this comment.
Ohh okay! Let me check that.
There was a problem hiding this comment.
I added Pyarrow to the dependencies. It looks like the error is a segfault again :/
https://travis-ci.com/github/ray-project/ray/jobs/308220125
There was a problem hiding this comment.
Is this pyarrow version 0.14?
There was a problem hiding this comment.
No, it is 0.16. I added the version to the debug info, so it shows up in this test.
|
|
||
|
|
||
| def test_imports(): | ||
| def try_imports(library1, library2): |
There was a problem hiding this comment.
@ijrsvt slightly simpler way to do this (and I think will give a better error message)
subprocess.check_output(["python", "-c", "import {}; import {}".format(library1, library2)])
subprocess.check_output(["python", "-c", "import {}; import {}".format(library2, library1)]) There was a problem hiding this comment.
Ahh okay. I'll switch to this!
…ay-project#7805)" This reverts commit eb61036.

Why are these changes needed?
Ray will segfault if pyarrow is imported before it because exported symbol are colliding . This fixes that bug and adds a test to ensure that future changes do not reintroduce it.
Related issue number
Fixes Issue #7393
Checks
scripts/format.shto lint the changes in this PR.