[Bug] be crash in doris::vectorized::ColumnString::insert_from #28584

@trikker

Search before asking

  • I had searched in the issues and found no similar issues.

Version

2.0.2

What's Wrong?

I am importing a large amount of data (about 1 TB) into Doris (the cluster has only one FE and one BE) using insert ... values. The BE crashed with the error and stack trace below (taken from be.out), and it crashes again with the same error and stack trace every time I restart it.

start time: Mon Dec 18 19:32:51 CST 2023
INFO: java_cmd /home/olap/java/jdk1.8.0_152/bin/java
INFO: jdk_version 8
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/olap/app/apache-doris-2.0.2-bin-x64/be/lib/java_extensions/preload-extensions/preload-extensions-jar-with-dependencies.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/olap/app/apache-doris-2.0.2-bin-x64/be/lib/java_extensions/java-udf/java-udf-jar-with-dependencies.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/olap/app/apache-doris-2.0.2-bin-x64/be/lib/hadoop_hdfs/common/lib/slf4j-reload4j-1.7.36.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Reload4jLoggerFactory]
Java HotSpot(TM) 64-Bit Server VM warning: You have loaded library /home/olap/app/apache-doris-2.0.2-bin-x64/be/lib/hadoop_hdfs/native/libhadoop.so.1.0.0 which might have disabled stack guard. The VM will try to fix the stack guard now.
It's highly recommended that you fix the library with 'execstack -c <libfile>', or link it with '-z noexecstack'.
I0000 00:00:00.000000 881476 vlog_is_on.cc:197] RAW: Set VLOG level for "*" to 10
terminate called after throwing an instance of 'doris::Exception'
  what():  [E-3113] string column length is too large: total_length=4295295075, element_number=3642
0. /root/src/doris-2.0/be/src/common/stack_trace.cpp:302: StackTrace::tryCapture() @ 0x000000000ba1f197 in /home/olap/app/apache-doris-2.0.2-bin-x64/be/lib/doris_be
1. /root/src/doris-2.0/be/src/common/stack_trace.h:0: doris::get_stack_trace[abi:cxx11]() @ 0x000000000ba1d72d in /home/olap/app/apache-doris-2.0.2-bin-x64/be/lib/doris_be
2. /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/unique_ptr.h:173: doris::Exception::Exception(int, std::basic_string_view<char, std::char_traits<char> >) @ 0x000000000b4b996e in /home/olap/app/apache-doris-2.0.2-bin-x64/be/lib/doris_be
3. /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/basic_string.h:187: doris::Exception::Exception<unsigned long&, unsigned long&>(int, std::basic_string_view<char, std::char_traits<char> >, unsigned long&, unsigned long&) @ 0x000000000af7b858 in /home/olap/app/apache-doris-2.0.2-bin-x64/be/lib/doris_be
4. /root/src/doris-2.0/be/src/vec/columns/column_string.h:71: doris::vectorized::ColumnString::insert_from(doris::vectorized::IColumn const&, unsigned long) @ 0x000000000cd15099 in /home/olap/app/apache-doris-2.0.2-bin-x64/be/lib/doris_be
5. /root/src/doris-2.0/be/src/vec/olap/vertical_block_reader.cpp:527: doris::vectorized::VerticalBlockReader::_unique_key_next_block(doris::vectorized::Block*, bool*) @ 0x00000000125b9aa9 in /home/olap/app/apache-doris-2.0.2-bin-x64/be/lib/doris_be
6. /root/src/doris-2.0/be/src/common/status.h:432: doris::Merger::vertical_compact_one_group(std::shared_ptr<doris::Tablet>, doris::ReaderType, std::shared_ptr<doris::TabletSchema>, bool, std::vector<unsigned int, std::allocator<unsigned int> > const&, doris::vectorized::RowSourcesBuffer*, std::vector<std::shared_ptr<doris::RowsetReader>, std::allocator<std::shared_ptr<doris::RowsetReader> > > const&, doris::RowsetWriter*, long, doris::Merger::Statistics*) @ 0x000000000af4ea7c in /home/olap/app/apache-doris-2.0.2-bin-x64/be/lib/doris_be
7. /root/src/doris-2.0/be/src/olap/merger.cpp:351: doris::Merger::vertical_merge_rowsets(std::shared_ptr<doris::Tablet>, doris::ReaderType, std::shared_ptr<doris::TabletSchema>, std::vector<std::shared_ptr<doris::RowsetReader>, std::allocator<std::shared_ptr<doris::RowsetReader> > > const&, doris::RowsetWriter*, long, doris::Merger::Statistics*) @ 0x000000000af50096 in /home/olap/app/apache-doris-2.0.2-bin-x64/be/lib/doris_be
8. /root/src/doris-2.0/be/src/common/status.h:348: doris::Compaction::do_compaction_impl(long) @ 0x000000000af3f21a in /home/olap/app/apache-doris-2.0.2-bin-x64/be/lib/doris_be
9. /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/shared_ptr_base.h:1291: doris::Compaction::do_compaction(long) @ 0x000000000af3e66a in /home/olap/app/apache-doris-2.0.2-bin-x64/be/lib/doris_be
10. /root/src/doris-2.0/be/src/common/status.h:432: doris::BaseCompaction::execute_compact_impl() @ 0x000000000b6d2a3c in /home/olap/app/apache-doris-2.0.2-bin-x64/be/lib/doris_be
11. /root/src/doris-2.0/be/src/common/status.h:432: doris::Compaction::execute_compact() @ 0x000000000af3e343 in /home/olap/app/apache-doris-2.0.2-bin-x64/be/lib/doris_be
12. /root/src/doris-2.0/be/src/common/status.h:432: doris::Tablet::execute_compaction(doris::CompactionType) @ 0x000000000b710f8d in /home/olap/app/apache-doris-2.0.2-bin-x64/be/lib/doris_be
13. /root/src/doris-2.0/be/src/olap/olap_server.cpp:963: std::_Function_handler<void (), doris::StorageEngine::_submit_compaction_task(std::shared_ptr<doris::Tablet>, doris::CompactionType, bool)::$_0>::_M_invoke(std::_Any_data const&) @ 0x000000000aedf91e in /home/olap/app/apache-doris-2.0.2-bin-x64/be/lib/doris_be
14. /root/src/doris-2.0/be/src/util/threadpool.cpp:0: doris::ThreadPool::dispatch_thread() @ 0x000000000ba5bdaf in /home/olap/app/apache-doris-2.0.2-bin-x64/be/lib/doris_be
15. /var/local/ldb-toolchain/bin/../usr/include/pthread.h:562: doris::Thread::supervise_thread(void*) @ 0x000000000ba51d3c in /home/olap/app/apache-doris-2.0.2-bin-x64/be/lib/doris_be
16. start_thread @ 0x0000000000007ea5 in /usr/lib64/libpthread-2.17.so
17. __clone @ 0x00000000000feb0d in /usr/lib64/libc-2.17.so

*** Query id: 0-0 ***
*** tablet id: 34414 ***
*** Aborted at 1702899443 (unix time) try "date -d @1702899443" if you are using GNU date ***
*** Current BE git commitID: ae923f7 ***
*** SIGABRT unknown detail explain (@0x5168000d7344) received by PID 881476 (TID 882483 OR 0x7f06899f2700) from PID 881476; stack trace: ***
 0# doris::signal::(anonymous namespace)::FailureSignalHandler(int, siginfo_t*, void*) at /root/src/doris-2.0/be/src/common/signal_handler.h:417
 1# 0x00007F08BDDB6400 in /lib64/libc.so.6
 2# gsignal in /lib64/libc.so.6
 3# abort in /lib64/libc.so.6
 4# __gnu_cxx::__verbose_terminate_handler() [clone .cold] at ../../../../libstdc++-v3/libsupc++/vterminate.cc:75
 5# __cxxabiv1::__terminate(void (*)()) at ../../../../libstdc++-v3/libsupc++/eh_terminate.cc:48
 6# 0x000055E00F59B0B1 in /home/olap/app/doris/be/lib/doris_be
 7# 0x000055E00F59B204 in /home/olap/app/doris/be/lib/doris_be
 8# doris::vectorized::ColumnString::insert_from(doris::vectorized::IColumn const&, unsigned long) at /root/src/doris-2.0/be/src/vec/columns/column_string.h:173
 9# doris::vectorized::VerticalBlockReader::_unique_key_next_block(doris::vectorized::Block*, bool*) at /root/src/doris-2.0/be/src/vec/olap/vertical_block_reader.cpp:527
10# doris::Merger::vertical_compact_one_group(std::shared_ptr<doris::Tablet>, doris::ReaderType, std::shared_ptr<doris::TabletSchema>, bool, std::vector<unsigned int, std::allocator<unsigned int> > const&, doris::vectorized::RowSourcesBuffer*, std::vector<std::shared_ptr<doris::RowsetReader>, std::allocator<std::shared_ptr<doris::RowsetReader> > > const&, doris::RowsetWriter*, long, doris::Merger::Statistics*) at /root/src/doris-2.0/be/src/olap/merger.cpp:244
11# doris::Merger::vertical_merge_rowsets(std::shared_ptr<doris::Tablet>, doris::ReaderType, std::shared_ptr<doris::TabletSchema>, std::vector<std::shared_ptr<doris::RowsetReader>, std::allocator<std::shared_ptr<doris::RowsetReader> > > const&, doris::RowsetWriter*, long, doris::Merger::Statistics*) at /root/src/doris-2.0/be/src/olap/merger.cpp:351
12# doris::Compaction::do_compaction_impl(long) at /root/src/doris-2.0/be/src/olap/compaction.cpp:351
13# doris::Compaction::do_compaction(long) at /root/src/doris-2.0/be/src/olap/compaction.cpp:124
14# doris::BaseCompaction::execute_compact_impl() at /root/src/doris-2.0/be/src/olap/base_compaction.cpp:87
15# doris::Compaction::execute_compact() at /root/src/doris-2.0/be/src/olap/compaction.cpp:106
16# doris::Tablet::execute_compaction(doris::CompactionType) at /root/src/doris-2.0/be/src/olap/tablet.cpp:1883
17# std::_Function_handler<void (), doris::StorageEngine::_submit_compaction_task(std::shared_ptr<doris::Tablet>, doris::CompactionType, bool)::$_0>::_M_invoke(std::_Any_data const&) at /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/std_function.h:291
18# doris::ThreadPool::dispatch_thread() in /home/olap/app/doris/be/lib/doris_be
19# doris::Thread::supervise_thread(void*) at /root/src/doris-2.0/be/src/util/thread.cpp:470
20# start_thread in /lib64/libpthread.so.0
21# clone in /lib64/libc.so.6
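
A note on the numbers in the what() line: total_length=4295295075 is just above 2^32, which suggests the check in ColumnString::insert_from guards a 32-bit offset range. The following is a minimal sketch of that arithmetic, assuming 32-bit string offsets; the constant and function names are hypothetical and are not taken from the Doris source.

```cpp
#include <cstdint>
#include <limits>

// Assumption: string offsets are 32-bit, so one column can hold at most
// UINT32_MAX bytes of character data. All names here are hypothetical.
constexpr std::uint64_t kMaxColumnBytes = std::numeric_limits<std::uint32_t>::max();

// True when the accumulated byte count still fits in 32-bit offsets.
constexpr bool fits_in_column(std::uint64_t total_length) {
    return total_length <= kMaxColumnBytes;
}

// Bytes by which total_length exceeds the limit (0 if it fits).
constexpr std::uint64_t excess_bytes(std::uint64_t total_length) {
    return fits_in_column(total_length) ? 0 : total_length - kMaxColumnBytes;
}

// Average bytes per value, using integer division.
constexpr std::uint64_t average_value_bytes(std::uint64_t total_length,
                                            std::uint64_t element_number) {
    return element_number == 0 ? 0 : total_length / element_number;
}

// Values copied from the what() line in be.out.
constexpr std::uint64_t kTotalLength = 4295295075ULL;
constexpr std::uint64_t kElementNumber = 3642;

static_assert(!fits_in_column(kTotalLength),
              "reported total_length does not fit in 32-bit offsets");
```

With the reported values, excess_bytes(kTotalLength) is 327780 and average_value_bytes(kTotalLength, kElementNumber) is 1179378, so under this assumption the block being built during compaction held 3642 values of roughly 1.1 MiB each and went over the limit by about 320 KiB.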

What You Expected?

The BE should not crash.

How to Reproduce?

Because the data I am loading is too large, I cannot reproduce the crash with the exact insert statement. I do not think it is related to the insert statement itself; it looks like a backend issue. I do have the coredump file.

Anything Else?

(1) be.out file (renamed to be.out.log)

be.out.log

(2) I found a similar GitHub issue that has already been fixed, but I am not sure whether it has the same root cause.
#20698 fix fix bug of columnstring prefetch

Are you willing to submit PR?

  • Yes I am willing to submit a PR!
