Add missing range conflict check between file ingestion and RefitLevel() #10988
Add missing range conflict check between file ingestion and RefitLevel() #10988hx235 wants to merge 4 commits intofacebook:mainfrom
Conversation
18b54d1 to
f7e55d4
Compare
ef1aa23 to
324994d
Compare
324994d to
549d68d
Compare
db/compaction/compaction.cc
Outdated
There was a problem hiding this comment.
[Note] Logic in IsBottommostLevel() does not consider cases like RefitLevel() and file ingestion. So I explicitly excluded them to avoid touching its logic.
There was a problem hiding this comment.
It's still correct to keep the original logic right? I'm fine with either way, just checking.
There was a problem hiding this comment.
The original logic is correct to keep - it's just that the bottommost concept is not applied to RefitLevel (where we can output to L0 when L0 is empty) nor file ingestion (where the input files do not exists in LSM prior to the "file-ingestion compaction). I also didn't want to relax the logic in IsBottommostLevel for RefitLevel or file ingestion cuz they are really here for conflict checking, not for anything else. Updated CompactionReason::kExternalSstIngestion/kRefitLevel comments.
db/db_bloom_filter_test.cc
Outdated
There was a problem hiding this comment.
[Note] Due to some clean up in RefitLevel, this now returns OK. The result of this CompactRange() does not matter for this UT after inspection.
There was a problem hiding this comment.
I tried to figure out why it now returns OK here, it's due to the new conditional check where level is 8 and is empty:
if (vstorage->LevelFiles(level).empty()) {
return Status::OK();
}
it seems that we may incorrectly compute final_output_level:
rocksdb/db/db_impl/db_impl_compaction_flush.cc
Lines 1179 to 1180 in 98d5db5
when base level != last level, which in turn means that we may move the wrong files during
RefitLevel().
There was a problem hiding this comment.
It's due to the new conditional check where level is 8
Correct.
It seems that we may incorrectly compute final_output_level
Thanks! This is an accidental finding which I plan to address it in #11041. The test has been using the wrong final_output_level prior my PR anyway.
There was a problem hiding this comment.
[Note] This is a clean up decided to early return in empty source level case, to avoid below cfd->compaction_picker()->GetRange() called on empty input.
db/external_sst_file_basic_test.cc
Outdated
There was a problem hiding this comment.
[Note] With this PR, file ingestion results will depend on on-going compaction too. Disable auto ones so this test is remained untouched. Also auto-compaction is irrelevant to this test after inspection.
db/external_sst_file_ingestion_job.h
Outdated
There was a problem hiding this comment.
[Note] Wish I could have used std::unique_ptr<> for compaction_input_metdatas_ and file_ingesting_compactions_ and avoid explicit deletion in ~ExternalSstFileIngestionJob(). But this mysterious window failure https://app.circleci.com/pipelines/github/facebook/rocksdb/21691/workflows/80e4fb38-8369-4350-8e46-119878df1fb2/jobs/537236 stopped me from doing so.
db/external_sst_file_test.cc
Outdated
db/external_sst_file_test.cc
Outdated
There was a problem hiding this comment.
[Note] Similar to above, ingest file of overlapped ranges with ongoing compaction needs allow_global_seqno=true just like how it's needed for overlapping ranges with existing files. Same for below.
549d68d to
55ffcd0
Compare
|
@hx235 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
55ffcd0 to
0f366ef
Compare
|
@hx235 has updated the pull request. You must reimport the pull request before landing. |
|
@hx235 has updated the pull request. You must reimport the pull request before landing. |
|
@hx235 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
1 similar comment
|
@hx235 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
8a4e1bf to
8f09fb2
Compare
|
@hx235 has updated the pull request. You must reimport the pull request before landing. |
|
@hx235 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
8f09fb2 to
2aeda64
Compare
|
@hx235 has updated the pull request. You must reimport the pull request before landing. |
|
@hx235 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
2aeda64 to
ad53bce
Compare
|
@hx235 has updated the pull request. You must reimport the pull request before landing. |
|
@hx235 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
Summary: valgrind build for `ExternalSSTFileBasicTest/ExternalSSTFileBasicTest.IngestFileWithMixedValueType` and `ExternalSSTFileBasicTest/ExternalSSTFileBasicTest.IngestFileWithGlobalSeqnoPickedSeqno` started failing (see error message in T141554665). I could not repro but I suspect it is due to file ingestion range overlapping with ongoing compaction, which caused a new global seqno being assigned after #10988. Pull Request resolved: #11070 Test Plan: monitor future valgrind tests result. Reviewed By: hx235 Differential Revision: D42319056 Pulled By: cbi42 fbshipit-source-id: acbcd841a2a15e36b278f39ba514f4b9a6ee43ca
…efitLevel() (facebook#10988)" This reverts commit 9502856.
…efitLevel() (facebook#10988)" This reverts commit 9502856.
…efitLevel() (facebook#10988)" This reverts commit 9502856.
…efitLevel() (facebook#10988)" This reverts commit 9502856.
Context:
File ingestion never checks whether the key range it acts on overlaps with an ongoing RefitLevel() (used in
CompactRange()withchange_level=true). That's because RefitLevel() doesn't register and make its key range known to file ingestion. Though it checks overlapping with other compactions by https://github.com/facebook/rocksdb/blob/7.8.fb/db/external_sst_file_ingestion_job.cc#L998.RefitLevel() (used in
CompactRange()withchange_level=true) doesn't check whether the key range it acts on overlaps with an ongoing file ingestion. That's because file ingestion does not register and make its key range known to other compactions.WaitForIngestFilewas called by background and foreground compactions. They were introduced in 0f88160, 5c64fb6 and 87dfc1d.Above are bugs resulting in two bad consequences:
force_consistency_checks=trueTherefore we decide to introduce range conflict check for file ingestion and RefitLevel() inspired from the existing range conflict check among compactions.
Summary:
Compactionof new compaction reasons:CompactionReason::kExternalSstIngestionandCompactionReason::kRefitLeveland register/unregister them. File ingestion is treated as compaction from L0 to different levels and RefitLevel() as compaction from source level to target level.RangeOverlapWithCompactionwith other ongoing compactions,RegisterCompaction()on this "compaction" before changing the LSM state inVersionStorageInfo, andUnregisterCompaction()after changing.Test:
DBCompactionTestWithOngoingFileIngestionParam*that failed pre-fix and passed afterwards.