RocksDB occasionally crashes with _ZN7rocksdb6DBImpl11NewIteratorERKNS_11ReadOptionsEPNS_18ColumnFamilyHandleE+0x14f #3024

@dlg99

BUG REPORT

Describe the bug

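RocksDB occasionally crashes the JVM in _ZN7rocksdb6DBImpl11NewIteratorERKNS_11ReadOptionsEPNS_18ColumnFamilyHandleE+0x14f, which demangles to rocksdb::DBImpl::NewIterator(rocksdb::ReadOptions const&, rocksdb::ColumnFamilyHandle*). I do not know how often it happens.

Tests have become flakier on CI. I tried to reproduce this locally; the tests passed on every local run, but I noticed the error log afterwards.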

To Reproduce

No reliable steps yet. I ran the tests for bookkeeper-server; the tests retry on failure.
When I checked later, the test run reported success, but I noticed the error dump (attached).

Expected behavior

No crash

Additional context

..
# Problematic frame:
# C  [librocksdbjni13858492392843593377.jnilib+0xd8fdf]  _ZN7rocksdb6DBImpl11NewIteratorERKNS_11ReadOptionsEPNS_18ColumnFamilyHandleE+0x14f
..
Current thread (0x00007fc0d1cc0000):  JavaThread "GarbageCollectorThread-328-1" [_thread_in_native, id=58223, stack(0x0000700012b9e000,0x0000700012c9e000)]

Stack: [0x0000700012b9e000,0x0000700012c9e000],  sp=0x0000700012c9d150,  free space=1020k
Native frames: (J=compiled Java code, A=aot compiled Java code, j=interpreted, Vv=VM code, C=native code)
C  [librocksdbjni13858492392843593377.jnilib+0xd8fdf]  _ZN7rocksdb6DBImpl11NewIteratorERKNS_11ReadOptionsEPNS_18ColumnFamilyHandleE+0x14f
C  [librocksdbjni13858492392843593377.jnilib+0x2421d]  Java_org_rocksdb_RocksDB_iterator__JJ+0xbd
j  org.rocksdb.RocksDB.iterator(JJ)J+0
j  org.rocksdb.RocksDB.newIterator(Lorg/rocksdb/ReadOptions;)Lorg/rocksdb/RocksIterator;+14
j  org.apache.bookkeeper.bookie.storage.ldb.KeyValueStorageRocksDB.iterator()Lorg/apache/bookkeeper/bookie/storage/ldb/KeyValueStorage$CloseableIterator;+8
j  org.apache.bookkeeper.bookie.storage.ldb.PersistentEntryLogMetadataMap.forEach(Ljava/util/function/BiConsumer;)V+4
j  org.apache.bookkeeper.bookie.GarbageCollectorThread.doGcEntryLogs()V+20
j  org.apache.bookkeeper.bookie.GarbageCollectorThread.runWithFlags(ZZZ)V+35
j  org.apache.bookkeeper.bookie.GarbageCollectorThread.safeRun()V+28
J 5952 c2 org.apache.bookkeeper.common.util.SafeRunnable.run()V (22 bytes) @ 0x000000011fc0ab7c [0x000000011fc0ab40+0x000000000000003c]
J 4184 c1 java.util.concurrent.Executors$RunnableAdapter.call()Ljava/lang/Object; java.base@11.0.11 (14 bytes) @ 0x0000000118c3ccd4 [0x0000000118c3cbc0+0x0000000000000114]
J 5726 c1 java.util.concurrent.FutureTask.runAndReset()Z java.base@11.0.11 (125 bytes) @ 0x00000001190bbb4c [0x00000001190bb480+0x00000000000006cc]
J 4362 c1 java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run()V java.base@11.0.11 (57 bytes) @ 0x0000000118cbd824 [0x0000000118cbd640+0x00000000000001e4]
J 4459 c1 java.util.concurrent.ThreadPoolExecutor.runWorker(Ljava/util/concurrent/ThreadPoolExecutor$Worker;)V java.base@11.0.11 (187 bytes) @ 0x0000000118d06d44 [0x0000000118d05ee0+0x0000000000000e64]
J 5300 c1 java.util.concurrent.ThreadPoolExecutor$Worker.run()V java.base@11.0.11 (9 bytes) @ 0x0000000118fc1fc4 [0x0000000118fc1f40+0x0000000000000084]
J 5042 c1 io.netty.util.concurrent.FastThreadLocalRunnable.run()V (22 bytes) @ 0x0000000118efba6c [0x0000000118efb960+0x000000000000010c]
J 4002 c1 java.lang.Thread.run()V java.base@11.0.11 (17 bytes) @ 0x0000000118bd7184 [0x0000000118bd7040+0x0000000000000144]

See the attached hs_err_pid87757.log.
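
The report does not identify a cause. One thing worth ruling out (an assumption, not something this trace confirms) is the GarbageCollectorThread asking for an iterator after the underlying store has already been closed during test teardown, since calling into a closed native handle can crash the JVM instead of throwing. Below is a minimal sketch of a guard against that race. `GuardedStore` and `snapshotKeys` are hypothetical names, and an in-memory map stands in for the native database; this is not BookKeeper's or RocksDB's actual code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentSkipListMap;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical wrapper: an in-memory map stands in for a native key-value handle. */
public class GuardedStore implements AutoCloseable {
    // Readers and writers of the data share the read lock; close() takes the write lock,
    // so close() waits for in-flight operations and later operations see closed == true.
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private final ConcurrentSkipListMap<String, String> data = new ConcurrentSkipListMap<>();
    private boolean closed = false; // only read or written while holding the lock

    public void put(String key, String value) {
        lock.readLock().lock();
        try {
            ensureOpen();
            data.put(key, value);
        } finally {
            lock.readLock().unlock();
        }
    }

    /** Plays the role of an iterator() call made from a background thread. */
    public List<String> snapshotKeys() {
        lock.readLock().lock();
        try {
            ensureOpen(); // fail with a Java exception instead of touching a freed handle
            return new ArrayList<>(data.keySet());
        } finally {
            lock.readLock().unlock();
        }
    }

    @Override
    public void close() {
        lock.writeLock().lock();
        try {
            closed = true;
            data.clear(); // stands in for releasing the native resources
        } finally {
            lock.writeLock().unlock();
        }
    }

    private void ensureOpen() {
        if (closed) {
            throw new IllegalStateException("store is closed");
        }
    }
}
```

With a guard like this, a background task that loses the race with shutdown gets an IllegalStateException it can log and ignore, rather than a native crash. Whether that race actually occurs in these tests would still need to be confirmed from the full hs_err log.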
