Update arrow and parquet-cpp. #1875

Merged: pcmoritz merged 5 commits into ray-project:master from robertnishihara:updatearrow on Apr 12, 2018

@robertnishihara (Collaborator) commented Apr 11, 2018

This uses a more recent version of Arrow. The primary reason for doing this is to include

However, there are also some important Plasma client API changes from apache/arrow#1807.

Note that right now parquet doesn't work if you statically link boost (which we are doing). See the discussion in

Now that apache/parquet-cpp#452 is merged, a segfault I was seeing related to statically linking boost should be fixed.

Note that statically linking boost may not be supported in the future, so we may eventually need to link boost dynamically and bundle it with our wheels, or something similar.

@robertnishihara (Collaborator, Author)

Looks like I introduced a segfault somewhere when running `python -m pytest python/ray/tune/test/tune_server_test.py`. The backtrace is:

Thread 1 "python" received signal SIGSEGV, Segmentation fault.
0x00007fff856f191e in __gnu_cxx::new_allocator<plasma::UniqueID>::construct<plasma::UniqueID, plasma::UniqueID const&> (
    __p=0xffffffffffffffec, this=0x555558751bc8) at /usr/include/c++/5/ext/new_allocator.h:120
120		{ ::new((void *)__p) _Up(std::forward<_Args>(__args)...); }
(gdb) bt
#0  0x00007fff856f191e in __gnu_cxx::new_allocator<plasma::UniqueID>::construct<plasma::UniqueID, plasma::UniqueID const&>
    (__p=0xffffffffffffffec, this=0x555558751bc8) at /usr/include/c++/5/ext/new_allocator.h:120
#1  std::allocator_traits<std::allocator<plasma::UniqueID> >::construct<plasma::UniqueID, plasma::UniqueID const&> (
    __p=0xffffffffffffffec, __a=...) at /usr/include/c++/5/bits/alloc_traits.h:530
#2  std::deque<plasma::UniqueID, std::allocator<plasma::UniqueID> >::push_front (__x=..., this=0x555558751bc8)
    at /usr/include/c++/5/bits/stl_deque.h:1487
#3  plasma::PlasmaClient::Release (this=0x555558751b50, object_id=...)
    at /home/ubuntu/ray/thirdparty/build/arrow/cpp/src/plasma/client.cc:487
#4  0x00007fff856f65ff in plasma::PlasmaBuffer::~PlasmaBuffer (this=0x5555587562e0, __in_chrg=<optimized out>)
    at /home/ubuntu/ray/thirdparty/build/arrow/cpp/src/plasma/client.cc:94
#5  __gnu_cxx::new_allocator<plasma::PlasmaBuffer>::destroy<plasma::PlasmaBuffer> (this=<optimized out>, 
    __p=<optimized out>) at /usr/include/c++/5/ext/new_allocator.h:124
#6  std::allocator_traits<std::allocator<plasma::PlasmaBuffer> >::destroy<plasma::PlasmaBuffer> (__a=..., 
    __p=<optimized out>) at /usr/include/c++/5/bits/alloc_traits.h:542
#7  std::_Sp_counted_ptr_inplace<plasma::PlasmaBuffer, std::allocator<plasma::PlasmaBuffer>, (__gnu_cxx::_Lock_policy)2>::_M_dispose (this=0x5555587562d0) at /usr/include/c++/5/bits/shared_ptr_base.h:531
#8  0x00007fff859557e0 in std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release (this=0x5555587562d0)
    at /usr/include/c++/5/bits/shared_ptr_base.h:150
#9  0x00007fff85954a47 in std::__shared_count<(__gnu_cxx::_Lock_policy)2>::~__shared_count (this=0x55555877d478, 
    __in_chrg=<optimized out>) at /usr/include/c++/5/bits/shared_ptr_base.h:659
#10 0x00007fff859547ac in std::__shared_ptr<arrow::Buffer, (__gnu_cxx::_Lock_policy)2>::~__shared_ptr (
    this=0x55555877d470, __in_chrg=<optimized out>) at /usr/include/c++/5/bits/shared_ptr_base.h:925
#11 0x00007fff859547e4 in std::shared_ptr<arrow::Buffer>::~shared_ptr (this=0x55555877d470, __in_chrg=<optimized out>)
    at /usr/include/c++/5/bits/shared_ptr.h:93
#12 0x00007fff859548aa in arrow::Buffer::~Buffer (this=0x55555877d440, __in_chrg=<optimized out>)
    at /home/ubuntu/ray/thirdparty/pkg/arrow/cpp/build/cpp-install/include/arrow/buffer.h:73
#13 0x00007ffff221e2a2 in std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release (this=0x55555877d430)
    at /usr/include/c++/5/bits/shared_ptr_base.h:150
#14 0x00007ffff221a155 in std::__shared_count<(__gnu_cxx::_Lock_policy)2>::~__shared_count (this=0x7fff765dfb08, 
    __in_chrg=<optimized out>) at /usr/include/c++/5/bits/shared_ptr_base.h:659
#15 0x00007ffff2216c20 in std::__shared_ptr<arrow::Buffer, (__gnu_cxx::_Lock_policy)2>::~__shared_ptr (
    this=0x7fff765dfb00, __in_chrg=<optimized out>) at /usr/include/c++/5/bits/shared_ptr_base.h:925
#16 0x00007ffff2216c58 in std::shared_ptr<arrow::Buffer>::~shared_ptr (this=0x7fff765dfb00, __in_chrg=<optimized out>)
    at /usr/include/c++/5/bits/shared_ptr.h:93
#17 0x00007ffff221dfbf in __Pyx_call_destructor<std::shared_ptr<arrow::Buffer> > (x=...)
    at /home/ubuntu/ray/thirdparty/build/arrow/python/build/temp.linux-x86_64-3.6/lib.cxx:486
#18 0x00007ffff21c6d15 in __pyx_tp_dealloc_7pyarrow_3lib_Buffer (o=0x7fff765dfae8)
    at /home/ubuntu/ray/thirdparty/build/arrow/python/build/temp.linux-x86_64-3.6/lib.cxx:107704
#19 0x00007ffff0b5eeff in ?? ()
   from /home/ubuntu/miniconda3/lib/python3.6/site-packages/numpy/core/multiarray.cpython-36m-x86_64-linux-gnu.so
#20 0x00005555556430da in tupledealloc ()
#21 0x0000555555642b1f in list_dealloc ()
#22 0x00005555556bbace in collect ()
#23 0x000055555574785a in _PyGC_CollectNoFail ()
#24 0x00005555556fcbb3 in PyImport_Cleanup ()
#25 0x0000555555761ce1 in Py_FinalizeEx ()
#26 0x0000555555761e49 in Py_Exit ()
#27 0x0000555555761f38 in handle_system_exit ()
#28 0x0000555555761fa2 in PyErr_PrintEx ()
#29 0x0000555555762205 in RunModule ()
#30 0x000055555576c918 in Py_Main ()
#31 0x000055555563471e in main ()
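
For readers following the backtrace: frames #3 and #4 show a buffer destructor calling `PlasmaClient::Release`, and frames #22 to #25 show that this happens during garbage collection at interpreter shutdown. The sketch below is not a diagnosis of this segfault and does not use the Plasma API. It only illustrates, with hypothetical `Client` and `Buffer` classes, why a buffer that notifies its client on release is sensitive to teardown order, and one way to guard against it.

```python
from collections import deque


class Client:
    """Hypothetical stand-in for a store client; NOT the Plasma API."""

    def __init__(self):
        self.release_history = deque()
        self.closed = False

    def release(self, object_id):
        if self.closed:
            # In C++ this path could touch freed state; here we only report it.
            raise RuntimeError("release() called after client teardown")
        self.release_history.appendleft(object_id)

    def close(self):
        self.closed = True


class Buffer:
    """Hypothetical buffer that tells its client when it is released."""

    def __init__(self, client, object_id):
        self._client = client
        self._object_id = object_id
        self._released = False

    def release(self):
        """Return True if the client was notified, False otherwise."""
        if self._released:
            return False  # already released; do nothing twice
        self._released = True
        if self._client.closed:
            return False  # client already torn down; skip the callback
        self._client.release(self._object_id)
        return True


# Normal order: the buffer is released while the client is still alive.
client = Client()
buf = Buffer(client, "object-a")
notified_first = buf.release()
notified_again = buf.release()

# Shutdown order: the client is torn down before the buffer is released.
late_client = Client()
late_buf = Buffer(late_client, "object-b")
late_client.close()
notified_late = late_buf.release()
```

Without the `closed` check in `Buffer.release`, the last call would reach a client that has already been torn down, which is the kind of ordering hazard the shutdown frames above hint at.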

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/4802/

@AmplabJenkins

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/4823/

@AmplabJenkins

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/4831/

@robertnishihara (Collaborator, Author)

retest this please

@pcmoritz (Contributor) left a comment

LGTM

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/4838/

@pcmoritz merged commit d0fffec into ray-project:master on Apr 12, 2018
@pcmoritz deleted the updatearrow branch on April 12, 2018 23:17
royf added a commit to royf/ray that referenced this pull request Apr 22, 2018
* master: (56 commits)
  [xray] Turn on flushing to the GCS for the lineage cache (ray-project#1907)
  Single Big Object Parallel Transfer. (ray-project#1827)
  Remove num_threads as a parameter. (ray-project#1891)
  Adds Valgrind tests for multi-threaded object manager. (ray-project#1890)
  Pin cython version in docker base dependencies file. (ray-project#1898)
  Update arrow to efficiently serialize more types of numpy arrays. (ray-project#1889)
  updates (ray-project#1896)
  [DataFrame] Inherit documentation from Pandas (ray-project#1727)
  Update arrow and parquet-cpp. (ray-project#1875)
  raylet command line resource configuration plumbing (ray-project#1882)
  use raylet for remote ray nodes (ray-project#1880)
  [rllib] Propagate dim option to deepmind wrappers (ray-project#1876)
  [RLLib] DDPG (ray-project#1685)
  Lint Python files with Yapf (ray-project#1872)
  [DataFrame] Fixed repr, info, and memory_usage (ray-project#1874)
  Fix getattr compat (ray-project#1871)
  check if arrow build dir exists (ray-project#1863)
  [DataFrame] Encapsulate index and lengths into separate class (ray-project#1849)
  [DataFrame] Implemented __getattr__ (ray-project#1753)
  Add better analytics to docs (ray-project#1854)
  ...

# Conflicts:
#	python/ray/rllib/__init__.py
#	python/setup.py