Bug #66575

closed

unittest_rocksdb_option / TestRocksdbOptionParse.cc - test failure due to race hazard

Added by Bill Scales over 1 year ago. Updated 8 months ago.

Status: Resolved
Priority: Normal
Assignee:
Category: -
Target version: -
% Done: 0%
Source:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):
Tags (freeform):
Fixed In: v20.3.0-830-g58e9d14cea
Released In:
Upkeep Timestamp: 2025-07-14T22:42:48+00:00

Description

The RocksDBOption test occasionally fails with output like this:

did not load config file, using default settings.
[==========] Running 2 tests from 1 test suite.
[----------] Global test environment set-up.
[----------] 2 tests from RocksDBOption
[ RUN      ] RocksDBOption.simple
[       OK ] RocksDBOption.simple (9 ms)
[ RUN      ] RocksDBOption.interpret
2024-06-20T00:17:00.483+0000 7efd14527380 -1 Errors while parsing config file!
2024-06-20T00:17:00.483+0000 7efd14527380 -1 can't open ceph.conf: (2) No such file or directory
2024-06-20T00:17:00.483+0000 7efd14527380 -1 Errors while parsing config file!
2024-06-20T00:17:00.483+0000 7efd14527380 -1 can't open ceph.conf: (2) No such file or directory
/home/jenkins-build/build/workspace/ceph-pull-requests/src/test/objectstore/TestRocksdbOptionParse.cc:73: Failure
Expected equality of these values:
  15u
    Which is: 15
  thread_list.size()
    Which is: 14
[  FAILED  ] RocksDBOption.interpret (107 ms)
[----------] 2 tests from RocksDBOption (116 ms total)

[----------] Global test environment tear-down
[==========] 2 tests from 1 test suite ran. (116 ms total)
[  PASSED  ] 1 test.
[  FAILED  ] 1 test, listed below:
[  FAILED  ] RocksDBOption.interpret

 1 FAILED TEST

The test checks that RocksDB has parsed the options and created the correct number of threads.

The problem is that RocksDB does not register its threads until they start running, but it returns from initialization as soon as they have been created. There is a race hazard: the scheduler might not have run all of the threads before the test queries how many threads are registered. In the failure above, only 14 of the expected 15 threads had registered.

The test case has a 100 ms sleep, but clearly this isn't always sufficient.
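
One common way to remove this kind of race is to replace the fixed sleep with a bounded poll that rechecks the condition until it holds or a deadline passes. The sketch below only illustrates that idea and is not necessarily what the linked pull request does. `wait_until` and `FakePool` are hypothetical names introduced here; `FakePool` merely stands in for a component whose worker threads register themselves once they start running, and it does not use any RocksDB API.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical helper: poll `pred` every `interval` until it returns true
// or `timeout` has elapsed. Returns the final value of the predicate.
inline bool wait_until(const std::function<bool()>& pred,
                       std::chrono::milliseconds timeout,
                       std::chrono::milliseconds interval =
                           std::chrono::milliseconds(10)) {
  assert(interval.count() > 0);
  const auto deadline = std::chrono::steady_clock::now() + timeout;
  while (!pred()) {
    if (std::chrono::steady_clock::now() >= deadline) {
      return pred();  // one last check after the deadline
    }
    std::this_thread::sleep_for(interval);
  }
  return true;
}

// Hypothetical stand-in for a thread pool whose workers only "register"
// (increment a counter) once the scheduler has actually run them.
struct FakePool {
  std::atomic<std::size_t> registered{0};
  std::atomic<bool> stop{false};
  std::vector<std::thread> workers;

  explicit FakePool(std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) {
      workers.emplace_back([this] {
        registered.fetch_add(1);  // registration happens on the worker
        while (!stop.load()) {
          std::this_thread::sleep_for(std::chrono::milliseconds(1));
        }
      });
    }
  }

  ~FakePool() {
    stop = true;
    for (auto& t : workers) t.join();
  }
};
```

In a test, the fixed sleep followed by an equality check would become something like `ASSERT_TRUE(wait_until([&] { return pool.registered.load() == 15u; }, std::chrono::seconds(5)));`. On a healthy machine the poll returns almost immediately, and the generous timeout is only spent when the threads really never appear.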

Actions #1

Updated by Bill Scales over 1 year ago

  • Pull request ID set to 58194
Actions #4

Updated by Kefu Chai 9 months ago

  • Status changed from New to Resolved
Actions #5

Updated by Upkeep Bot 8 months ago

  • Merge Commit set to 58e9d14ceac5d4897c61d5b879f3d60f113e3ce8
  • Fixed In set to v20.3.0-830-g58e9d14ceac
  • Upkeep Timestamp set to 2025-07-10T23:38:08+00:00
Actions #6

Updated by Upkeep Bot 8 months ago

  • Fixed In changed from v20.3.0-830-g58e9d14ceac to v20.3.0-830-g58e9d14cea
  • Upkeep Timestamp changed from 2025-07-10T23:38:08+00:00 to 2025-07-14T22:42:48+00:00