Fix FIFO causing overlapping seqnos in L0 files due to overlapped seqnos between ingested files and memtable's#10777
Conversation
|
@hx235 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
bd0a275 to
e3c745f
Compare
|
@hx235 has updated the pull request. You must reimport the pull request before landing. |
e3c745f to
c0ed7f2
Compare
|
@hx235 has updated the pull request. You must reimport the pull request before landing. |
c0ed7f2 to
c1121da
Compare
|
@hx235 has updated the pull request. You must reimport the pull request before landing. |
c1121da to
df03338
Compare
|
@hx235 has updated the pull request. You must reimport the pull request before landing. |
|
@hx235 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
df03338 to
cded3e7
Compare
|
@hx235 has updated the pull request. You must reimport the pull request before landing. |
|
@hx235 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
ajkr
left a comment
There was a problem hiding this comment.
LGTM! Please mention both bug fixes in HISTORY.md and remove unrelated info from the description
There was a problem hiding this comment.
Should it check prev_file's largest_seqno?
cded3e7 to
e78552c
Compare
|
@hx235 has updated the pull request. You must reimport the pull request before landing. |
e78552c to
b119fa0
Compare
|
@hx235 has updated the pull request. You must reimport the pull request before landing. |
|
Update: fix a corner case where we didn't pass in earliest_memtable_seqno in the path of FIFO's CompactRange |
b119fa0 to
9c21e01
Compare
|
@hx235 has updated the pull request. You must reimport the pull request before landing. |
|
@hx235 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
9c21e01 to
694697c
Compare
|
@hx235 has updated the pull request. You must reimport the pull request before landing. |
694697c to
d90f3c1
Compare
|
@hx235 has updated the pull request. You must reimport the pull request before landing. |
1 similar comment
|
@hx235 has updated the pull request. You must reimport the pull request before landing. |
8fca84b to
94d2ae1
Compare
|
@hx235 has updated the pull request. You must reimport the pull request before landing. |
94d2ae1 to
8cb4715
Compare
|
@hx235 has updated the pull request. You must reimport the pull request before landing. |
8cb4715 to
cfdb48e
Compare
|
@hx235 has updated the pull request. You must reimport the pull request before landing. |
|
@hx235 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
|
@ajkr - added UT. I noticed the merged PR e267909#diff-d8fb3d50749aa69b378de447e3d9cf2f48abe0281437f010b5d61365a7b813fd. I took a look and don't think it will impact our PR but will still stress-run over the weekend before landing, just in case. |
cfdb48e to
ebbe9a0
Compare
|
@hx235 has updated the pull request. You must reimport the pull request before landing. |
|
@hx235 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
ebbe9a0 to
8c65521
Compare
|
@hx235 has updated the pull request. You must reimport the pull request before landing. |
|
@hx235 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
| // And in this case, Flush() will return Status::Corruption() caught by | ||
| // `force_consistency_checks=1` | ||
| EXPECT_EQ(3, NumTableFilesAtLevel(0)); | ||
| EXPECT_OK(Flush()); |
There was a problem hiding this comment.
Whether this should fail or not without the fix is an open question (see my other complaints about the existence of the smallest_seqno validation). It might be interesting to check Get(Key(1)) returned wrong result before the fix (without the smallest_seqno validation) and the correct result after the fix.
There was a problem hiding this comment.
oh right - good idea. Will fix this.
| "PickCompactionToWarm", | ||
| "CompactRange", "CompactFile")); | ||
|
|
||
| TEST_P(DBCompactionTestFIFOCheckConsistencyWithParam, |
Summary: **Context/Summary:** #10777 mistakenly added a history entry under 7.8 release but the PR is not included in 7.8. This mistake was due to rebase and merge didn't realize it was a conflict when "## Unreleased" was changed to "## 7.8.0 (10/22/2022)". Pull Request resolved: #10898 Test Plan: Make check Reviewed By: akankshamahajan15 Differential Revision: D40861001 Pulled By: hx235 fbshipit-source-id: b2310c95490f6ebb90834a210c965a74c9560b51
…pped seqnos between ingested files and memtable's (facebook#10777)" This reverts commit fc74abb.
…rash test (#11063) Summary: **Context/Summary:** #10777 was reverted (#10999) due to internal blocker and replaced with a better fix #10922. However, the revert also reverted the `Options::CompactionOptionsFIFO::allow_compaction` stress/crash coverage added by the PR. It's an useful coverage cuz setting `Options::CompactionOptionsFIFO::allow_compaction=true` will [increase](https://github.com/facebook/rocksdb/blob/7.8.fb/db/version_set.cc#L3255) the compaction score of L0 files for FIFO and then trigger more FIFO compaction. This speed up discovery of bug related to FIFO compaction like #10955. To see the speedup, compare the failure occurrence in following commands with `Options::CompactionOptionsFIFO::allow_compaction=true/false` ``` --fifo_allow_compaction=1 --acquire_snapshot_one_in=10000 --adaptive_readahead=0 --allow_concurrent_memtable_write=0 --allow_data_in_errors=True --async_io=1 --avoid_flush_during_recovery=0 --avoid_unnecessary_blocking_io=1 --backup_max_size=104857600 --backup_one_in=100000 --batch_protection_bytes_per_key=0 --block_size=16384 --bloom_bits=8.869062094789008 --bottommost_compression_type=none --bytes_per_sync=0 --cache_index_and_filter_blocks=1 --cache_size=8388608 --cache_type=lru_cache --charge_compression_dictionary_building_buffer=1 --charge_file_metadata=1 --charge_filter_construction=0 --charge_table_reader=1 --checkpoint_one_in=1000000 --checksum_type=kxxHash --clear_column_family_one_in=0 --column_families=1 --compact_files_one_in=1000000 --compact_range_one_in=1000000 --compaction_pri=3 --compaction_style=2 --compaction_ttl=0 --compression_max_dict_buffer_bytes=8589934591 --compression_max_dict_bytes=16384 --compression_parallel_threads=1 --compression_type=xpress --compression_use_zstd_dict_trainer=1 --compression_zstd_max_train_bytes=0 --continuous_verification_interval=0 --data_block_index_type=0 --db=/dev/shm/rocksdb_test/rocksdb_crashtest_whitebox --db_write_buffer_size=1048576 --delpercent=4 --delrangepercent=1 --destroy_db_initially=1 --detect_filter_construct_corruption=0 --disable_wal=0 --enable_compaction_filter=0 --enable_pipelined_write=1 --expected_values_dir=/dev/shm/rocksdb_test/rocksdb_crashtest_expected --fail_if_options_file_error=1 --file_checksum_impl=xxh64 --flush_one_in=1000000 --format_version=4 --get_current_wal_file_one_in=0 --get_live_files_one_in=1000000 --get_property_one_in=1000000 --get_sorted_wal_files_one_in=0 --index_block_restart_interval=10 --index_type=2 --ingest_external_file_one_in=100 --initial_auto_readahead_size=16384 --iterpercent=10 --key_len_percent_dist=1,30,69 --level_compaction_dynamic_level_bytes=False --log2_keys_per_lock=10 --long_running_snapshots=0 --manual_wal_flush_one_in=0 --mark_for_compaction_one_file_in=10 --max_auto_readahead_size=524288 --max_background_compactions=1 --max_bytes_for_level_base=67108864 --max_key=25000000 --max_key_len=3 --max_manifest_file_size=1073741824 --max_write_batch_group_size_bytes=1048576 --max_write_buffer_number=3 --max_write_buffer_size_to_maintain=0 --memtable_prefix_bloom_size_ratio=0.01 --memtable_protection_bytes_per_key=1 --memtable_whole_key_filtering=1 --memtablerep=skip_list --min_write_buffer_number_to_merge=2 --mmap_read=0 --mock_direct_io=True --nooverwritepercent=1 --num_file_reads_for_auto_readahead=2 --open_files=-1 --open_metadata_write_fault_one_in=0 --open_read_fault_one_in=0 --open_write_fault_one_in=0 --ops_per_thread=40000 --optimize_filters_for_memory=0 --paranoid_file_checks=1 --partition_filters=0 --partition_pinning=3 --pause_background_one_in=1000000 --periodic_compaction_seconds=0 --prefix_size=7 --prefixpercent=5 --prepopulate_block_cache=0 --preserve_internal_time_seconds=3600 --progress_reports=0 --read_fault_one_in=1000 --readahead_size=0 --readpercent=15 --recycle_log_file_num=1 --reopen=0 --ribbon_starting_level=999 --secondary_cache_fault_one_in=0 --snapshot_hold_ops=100000 --sst_file_manager_bytes_per_sec=0 --sst_file_manager_bytes_per_truncate=0 --stats_dump_period_sec=0 --subcompactions=2 --sync=0 --sync_fault_injection=0 --target_file_size_base=16777216 --target_file_size_multiplier=1 --test_batches_snapshots=0 --top_level_index_pinning=1 --unpartitioned_pinning=1 --use_direct_io_for_flush_and_compaction=1 --use_direct_reads=1 --use_full_merge_v1=1 --use_merge=0 --use_multiget=0 --use_put_entity_one_in=0 --user_timestamp_size=0 --value_size_mult=32 --verify_checksum=1 --verify_checksum_one_in=1000000 --verify_db_one_in=100000 --verify_iterator_with_expected_state_one_in=5 --verify_sst_unique_id_in_manifest=1 --wal_bytes_per_sync=0 --wal_compression=none --write_buffer_size=33554432 --write_dbid_to_manifest=1 --writepercent=65 ``` Therefore this PR is adding it back to stress/crash test. Pull Request resolved: #11063 Test Plan: Rehearsal stress test to make sure stress/crash test is stable Reviewed By: ajkr Differential Revision: D42283650 Pulled By: hx235 fbshipit-source-id: 132e6396ab6e24d8dcb8fe51c62dd5211cdf53ef
Context:
Same as #5958 (comment) but apply the fix to FIFO Compaction case
Repro:
Summary:
FIFO only does intra-L0 compaction in the following four cases. For other cases, FIFO drops data instead of compacting on data, which is irrelevant to the overlapping seqno issue we are solving.
total size < compaction_options_fifo.max_table_files_sizeandcompaction_options_fifo.allow_compaction == trueFindIntraL0Compactionhttps://github.com/facebook/rocksdb/pull/5958/files#diff-c261f77d6dd2134333c4a955c311cf4a196a08d3c2bb6ce24fd6801407877c89R56fifo.allow_compactionin stress test to surface the overlapping seqno issue we are fixing here.compaction_options_fifo.age_for_warm > 0earliest_mem_seqnoage_for_warmoption worths a separate PR to deal with db stress compatibility. Therefore we manually tested this path for this PRSanitizeCompactionInputFiles()will be called beforeCompactionPicker::CompactFiles, we simply replicate the idea in Fix corruption with intra-L0 on ingested files #5958 (comment) inSanitizeCompactionInputFiles(). To simplify implementation, we returnStats::Abort()on encountering seqno-overlapped file when doing compaction to L0 instead of skipping the file and proceed with the compaction.Some additional clean-up included in this PR:
earliest_memtable_seqnotoearliest_mem_seqnofor consistent namingearliest_memtable_seqnoin related APIsearliest_memtable_seqnoconstant and requiredTest:
TEST_P(DBCompactionTestFIFOCheckConsistencyWithParam, FlushAfterIntraL0CompactionWithIngestedFile)corresponding to the above 4 cases, which will fail accordingly without the fix