Intensify "xxx_one_in"'s default value in crash test #12127
hx235 wants to merge 1 commit into facebook:main
Conversation
@hx235 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
pdillinger left a comment
While I do generally believe this biases the bug finding toward places where bugs are most common, I believe it also leaves some areas that could go under-stressed. In particular:
- The high occurrence of IO-heavy or DB mutex-heavy operations reduces stress on the fast write and read paths. With HCC, both the write path and the read path have lock-free algorithms for which the stress test is our best defense against regression bugs.
- Consistently high flush rates could mask issues that only show up with long-lived memtables or large SST files, etc.
Perhaps one way you could think about it is that there could be a higher-level random decision about where each run should be on a spectrum between "let the read and write paths flow as freely as possible" and "throw as many wrenches and curveballs into smooth DB operation as possible" and you could derive these other parameters from that one.
In other words, there is risk in being too consistent with our randomness. So adding some higher-level randomness to the behavioral parameters could reproduce a larger suite of stress conditions. IMHO.
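The idea above could be sketched as follows. This is a hypothetical illustration, not actual `db_crashtest.py` code: a single per-run "chaos" level is drawn, and the individual `xxx_one_in` frequencies are derived from it (parameter names and base values are placeholders).

```python
import random

def derive_params(rng=random):
    """Hypothetical sketch: derive several "xxx_one_in" frequencies from
    one higher-level random decision about where this run sits on the
    spectrum between "let reads/writes flow freely" (chaos = 0.0) and
    "throw as many wrenches as possible" (chaos = 1.0)."""
    chaos = rng.random()

    def one_in(base):
        # Smaller "one_in" values mean the disruptive operation fires more
        # often; interpolate between a rare setting (base) and an intense
        # one (base // 100, i.e. two orders of magnitude more frequent).
        intense = max(1, base // 100)
        return int(base + (intense - base) * chaos)

    # Illustrative parameter names; not the authoritative defaults.
    return {
        "pause_background_one_in": one_in(1_000_000),
        "flush_one_in": one_in(1_000_000),
        "compact_range_one_in": one_in(1_000_000),
    }
```

With this shape, one draw per run correlates all the disruptive frequencies, so runs cluster at the "smooth operation" or "constant interference" ends of the spectrum instead of always averaging out.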
pdillinger left a comment
Still overall a better config than before IMHO
Yes - good idea. Let me do that as a follow-up. An immediate follow-up could be lambda: random.choice() between the original value and the intensified value, as @akankshamahajan15 suggested.
I meant lambda: random.choice([current_value, intensified_value]). That way it only selects either the current value or the intensified value.
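The suggested follow-up can be sketched as below. The parameter name and the two values are illustrative placeholders, not the actual RocksDB defaults; `db_crashtest.py`-style parameter tables use callables to get a fresh random choice per run.

```python
import random

# Placeholder values for illustration only.
original_value = 1_000_000     # e.g. an old, rarely-firing default
intensified_value = 10_000     # two orders of magnitude more frequent

# Each run evaluates the lambda, so it picks exactly one of the two
# values rather than anything in between.
params = {
    "flush_one_in": lambda: random.choice([original_value, intensified_value]),
}

chosen = params["flush_one_in"]()
assert chosen in (original_value, intensified_value)
```

This restores coverage of the original (less intense) configuration while still running the intensified one about half the time.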
Force-pushed from b65af52 to c20a9cb
@hx235 has updated the pull request. You must reimport the pull request before landing.
Yes - sorry - I mistyped it and meant to say original value. Edited it.
@hx235 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
A good suggestion to reduce loss of coverage on the "let the read and write paths flow as freely as possible" side of the spectrum! My speculation that there's value in covering even further on that side is... just speculation.
Summary:

**Context/Summary:** Continued from #12127, we can randomly reduce the max key count to coerce more operations on the same key. My experimental run shows it surfaced more issues than #12127 alone did. I also randomly reduce the related parameters, write buffer size and target file size base, to adapt to the randomly lower max key count. This creates 4 testing situations, 3 of which are new:

1. **High** max key count with **high** write buffer size and target file size base (existing)
2. **High** max key count with **low** write buffer size and target file size base (new; will go through some rehearsal testing to ensure we don't run out of space with many files)
3. **Low** max key count with **high** write buffer size and target file size base (new; keys will stay in memory longer)
4. **Low** max key count with **low** write buffer size and target file size base (new; experimental runs show it surfaced even more issues)

Pull Request resolved: #12148

Test Plan:
- [Ongoing] Rehearsal stress test
- Monitor production stress test

Reviewed By: jaykorean
Differential Revision: D52174980
Pulled By: hx235
fbshipit-source-id: bd5e11280826819ca9314c69bbbf05d481c6d105
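The four situations above amount to two independent high/low coin flips, with write buffer size and target file size base varied together as one pair. A minimal sketch (all concrete numbers are made up for illustration, not the actual `db_crashtest.py` defaults):

```python
import random

# Flip 1: high vs low max key count (low coerces more ops on the same key).
max_key = random.choice([100_000, 100])

# Flip 2: high vs low write buffer size; the target file size base is kept
# in step with it, since the PR varies these related parameters together.
write_buffer_size = random.choice([64 << 20, 1 << 20])
target_file_size_base = write_buffer_size

# Together the two independent choices yield the 4 situations described.
situation = ("high" if max_key == 100_000 else "low",
             "high" if write_buffer_size == 64 << 20 else "low")
```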
Context/Summary:
My experimental stress runs with more frequent "xxx_one_in" operations surfaced a couple of interesting bugs/issues with RocksDB and the crash test framework in the past. We now propose changing the default values so these operations run more frequently in the production testing environment.

Increase frequency by 2 orders of magnitude for most parameters, except for error-prone features (e.g., manual compaction and file ingestion, increased by 3 orders) and expensive features (e.g., checksum verification, increased by 1 order).
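The before/after relationship described above can be illustrated as follows. The parameter names and base values are hypothetical stand-ins, not the authoritative RocksDB defaults; the point is only the order-of-magnitude reduction of each "one_in" value (smaller means the operation fires more often).

```python
# Hypothetical before/after values illustrating the intensification.
defaults = {
    # Most parameters: 2 orders of magnitude more frequent.
    "flush_one_in": {"before": 1_000_000, "after": 10_000},
    # Error-prone features (e.g. manual compaction): 3 orders of magnitude.
    "compact_range_one_in": {"before": 1_000_000, "after": 1_000},
    # Expensive features (e.g. checksum verification): 1 order of magnitude.
    "verify_checksum_one_in": {"before": 1_000_000, "after": 100_000},
}

for name, v in defaults.items():
    # A smaller "one_in" denominator means a higher per-operation chance.
    assert v["after"] < v["before"], name
```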
Test:
Monitor CI to see whether it surfaces more interesting bugs/issues. If not, we may consider intensifying even more.