Skip to content

Fix stress test failure in ReadAsync.#9824

Closed
akankshamahajan15 wants to merge 1 commit intofacebook:mainfrom
akankshamahajan15:fix_dbstress
Closed

Fix stress test failure in ReadAsync.#9824
akankshamahajan15 wants to merge 1 commit intofacebook:mainfrom
akankshamahajan15:fix_dbstress

Conversation

@akankshamahajan15
Copy link
Copy Markdown
Contributor

@akankshamahajan15 akankshamahajan15 commented Apr 8, 2022

Summary: Fix stress test failure in ReadAsync by ignoring errors
injected during async read by FaultInjectionFS.
Failure:

 WARNING: prefix_size is non-zero but memtablerep != prefix_hash
Didn't get expected error from MultiGet. 
num_keys 14 Expected 1 errors, seen 0
Callstack that injected the fault
Injected error type = 32538
Message: error; 
#0   ./db_stress() [0x6f7dd4] rocksdb::port::SaveStack(int*, int)	/data/sandcastle/boxes/trunk-hg-fbcode-fbsource/fbcode/internal_repo_rocksdb/repo/port/stack_trace.cc:152	
#1   ./db_stress() [0x7f2bda] rocksdb::FaultInjectionTestFS::InjectThreadSpecificReadError(rocksdb::FaultInjectionTestFS::ErrorOperation, rocksdb::Slice*, bool, char*, bool, bool*)	/data/sandcastle/boxes/trunk-hg-fbcode-fbsource/fbcode/internal_repo_rocksdb/repo/utilities/fault_injection_fs.cc:891	
#2   ./db_stress() [0x7f2e78] rocksdb::TestFSRandomAccessFile::Read(unsigned long, unsigned long, rocksdb::IOOptions const&, rocksdb::Slice*, char*, rocksdb::IODebugContext*) const	/data/sandcastle/boxes/trunk-hg-fbcode-fbsource/fbcode/internal_repo_rocksdb/repo/utilities/fault_injection_fs.cc:367	
#3   ./db_stress() [0x6483d7] rocksdb::(anonymous namespace)::CompositeRandomAccessFileWrapper::Read(unsigned long, unsigned long, rocksdb::Slice*, char*) const	/data/sandcastle/boxes/trunk-hg-fbcode-fbsource/fbcode/internal_repo_rocksdb/repo/env/composite_env.cc:61	
#4   ./db_stress() [0x654564] rocksdb::(anonymous namespace)::LegacyRandomAccessFileWrapper::Read(unsigned long, unsigned long, rocksdb::IOOptions const&, rocksdb::Slice*, char*, rocksdb::IODebugContext*) const	/data/sandcastle/boxes/trunk-hg-fbcode-fbsource/fbcode/internal_repo_rocksdb/repo/env/env.cc:152	
#5   ./db_stress() [0x659b3b] rocksdb::FSRandomAccessFile::ReadAsync(rocksdb::FSReadRequest&, rocksdb::IOOptions const&, std::function<void (rocksdb::FSReadRequest const&, void*)>, void*, void**, std::function<void (void*)>*, rocksdb::IODebugContext*)	/data/sandcastle/boxes/trunk-hg-fbcode-fbsource/fbcode/internal_repo_rocksdb/repo/./include/rocksdb/file_system.h:896	
#6   ./db_stress() [0x8b8bab] rocksdb::RandomAccessFileReader::ReadAsync(rocksdb::FSReadRequest&, rocksdb::IOOptions const&, std::function<void (rocksdb::FSReadRequest const&, void*)>, void*, void**, std::function<void (void*)>*, rocksdb::Env::IOPriority)	/data/sandcastle/boxes/trunk-hg-fbcode-fbsource/fbcode/internal_repo_rocksdb/repo/file/random_access_file_reader.cc:459	
#7   ./db_stress() [0x8b501f] rocksdb::FilePrefetchBuffer::ReadAsync(rocksdb::IOOptions const&, rocksdb::RandomAccessFileReader*, rocksdb::Env::IOPriority, unsigned long, unsigned long, unsigned long, unsigned int)	/data/sandcastle/boxes/trunk-hg-fbcode-fbsource/fbcode/internal_repo_rocksdb/repo/file/file_prefetch_buffer.cc:124	
#8   ./db_stress() [0x8b55fc] rocksdb::FilePrefetchBuffer::PrefetchAsync(rocksdb::IOOptions const&, rocksdb::RandomAccessFileReader*, unsigned long, unsigned long, unsigned long, rocksdb::Env::IOPriority, bool&)	/data/sandcastle/boxes/trunk-hg-fbcode-fbsource/fbcode/internal_repo_rocksdb/repo/file/file_prefetch_buffer.cc:363	
#9   ./db_stress() [0x8b61f8] rocksdb::FilePrefetchBuffer::TryReadFromCacheAsync(rocksdb::IOOptions const&, rocksdb::RandomAccessFileReader*, unsigned long, unsigned long, rocksdb::Slice*, rocksdb::Status*, rocksdb::Env::IOPriority, bool)	/data/sandcastle/boxes/trunk-hg-fbcode-fbsource/fbcode/internal_repo_rocksdb/repo/file/file_prefetch_buffer.cc:482	
#10  ./db_stress() [0x745e04] rocksdb::BlockFetcher::TryGetFromPrefetchBuffer()	/data/sandcastle/boxes/trunk-hg-fbcode-fbsource/fbcode/internal_repo_rocksdb/repo/table/block_fetcher.cc:76

Test Plan:

./db_stress --acquire_snapshot_one_in=10000 --adaptive_readahead=1 --allow_concurrent_memtable_write=0 --async_io=1 --atomic_flush=1 --avoid_flush_during_recovery=0 --avoid_unnecessary_blocking_io=0 -- backup_max_size=104857600 --backup_one_in=100000 --batch_protection_bytes_per_key=0 --block_size=16384 --bloom_bits=5.037629726741734 --bottommost_compression_type=lz4hc --cache_index_and_filter_blocks=0 --cache_size=8388608 --checkpoint_one_in=1000000 --checksum_type=kxxHash --clear_column_family_one_in=0 --column_families=1 --compact_files_one_in=1000000 --compact_range_one_in=1000000 --compaction_ttl=100 --compression_max_dict_buffer_bytes=1073741823 --compression_max_dict_bytes=16384 --compression_parallel_threads=1 --compression_type=zstd --compression_zstd_max_train_bytes=0 --continuous_verification_interval=0 --db=/home/akankshamahajan/dev/shm/rocksdb/rocksdb_crashtest_blackbox --db_write_buffer_size=8388608 --delpercent=0 --delrangepercent=0 --destroy_db_initially=0 - detect_filter_construct_corruption=1 --disable_wal=1 --enable_compaction_filter=0 --enable_pipelined_write=0 --expected_values_dir=/home/akankshamahajan/dev/shm/rocksdb/rocksdb_crashtest_expected --experimental_mempurge_threshold=8.772789063014715 --fail_if_options_file_error=0 --file_checksum_impl=crc32c --flush_one_in=1000000 --format_version=3 --get_current_wal_file_one_in=0 --get_live_files_one_in=1000000 --get_property_one_in=1000000 --get_sorted_wal_files_one_in=0 --index_block_restart_interval=15 --index_type=3 --iterpercent=0 --key_len_percent_dist=1,30,69 --level_compaction_dynamic_level_bytes=False --long_running_snapshots=0 --mark_for_compaction_one_file_in=0 --max_background_compactions=1 --max_bytes_for_level_base=67108864 --max_key=25000000 --max_key_len=3 --max_manifest_file_size=1073741824 --max_write_batch_group_size_bytes=16777216 --max_write_buffer_number=3 --max_write_buffer_size_to_maintain=2097152 --memtable_prefix_bloom_size_ratio=0.001 --memtable_whole_key_filtering=1 --memtablerep=skip_list --mmap_read=0 --mock_direct_io=True --nooverwritepercent=1 --open_files=-1 --open_metadata_write_fault_one_in=0 --open_read_fault_one_in=0 --open_write_fault_one_in=0 --ops_per_thread=100000000 --optimize_filters_for_memory=0 --paranoid_file_checks=1 --partition_filters=0 --partition_pinning=2 --pause_background_one_in=1000000 --periodic_compaction_seconds=1000 --prefix_size=-1 --prefixpercent=0 --prepopulate_block_cache=0 --progress_reports=0 --read_fault_one_in=32 --readpercent=100 --recycle_log_file_num=1 --reopen=0 --reserve_table_reader_memory=1 --ribbon_starting_level=999 --secondary_cache_fault_one_in=0 --set_options_one_in=0 --snapshot_hold_ops=100000 --sst_file_manager_bytes_per_sec=0 --sst_file_manager_bytes_per_truncate=0 --subcompactions=2 --sync=0 --sync_fault_injection=False --target_file_size_base=16777216 --target_file_size_multiplier=1 --test_batches_snapshots=0 --top_level_index_pinning=3 --unpartitioned_pinning=2 --use_block_based_filter=0 --use_clock_cache=0 --use_direct_io_for_flush_and_compaction=1 --use_direct_reads=0 --use_full_merge_v1=0 --use_merge=1 --use_multiget=1 --user_timestamp_size=0 --value_size_mult=32 --verify_checksum=1 --verify_checksum_one_in=1000000 --verify_db_one_in=100000 --wal_compression=none --write_buffer_size=33554432 --write_dbid_to_manifest=1 --write_fault_one_in=0 --writepercent=0

Reviewers:

Subscribers:

Tasks:

Tags:

Summary: Fix stress test failure in ReadAsync by ignoring errors
injected during async read by FaultInjectionFS.

Test Plan:  ./db_stress --acquire_snapshot_one_in=10000 --adaptive_readahead=1 --allow_concurrent_memtable_write=0 --async_io=1
--atomic_flush=1 --avoid_flush_during_recovery=0 --avoid_unnecessary_blocking_io=0 --backup_max_size=104857600
--backup_one_in=100000 --batch_protection_bytes_per_key=0 --block_size=16384 --bloom_bits=5.037629726741734 --bottommost_compression_type=lz4hc --cache_index_and_filter_blocks=0
--cache_size=8388608 --checkpoint_one_in=1000000 --checksum_type=kxxHash
--clear_column_family_one_in=0 --column_families=1
--compact_files_one_in=1000000 --compact_range_one_in=1000000
--compaction_ttl=100 --compression_max_dict_buffer_bytes=1073741823
--compression_max_dict_bytes=16384 --compression_parallel_threads=1
--compression_type=zstd --compression_zstd_max_train_bytes=0 --continuous_verification_interval=0 --db=/home/akankshamahajan/dev/shm/rocksdb/rocksdb_crashtest_blackbox
--db_write_buffer_size=8388608 --delpercent=0 --delrangepercent=0
--destroy_db_initially=0 --detect_filter_construct_corruption=1
--disable_wal=1 --enable_compaction_filter=0 --enable_pipelined_write=0
--expected_values_dir=/home/akankshamahajan/dev/shm/rocksdb/rocksdb_crashtest_expected
--experimental_mempurge_threshold=8.772789063014715
--fail_if_options_file_error=0 --file_checksum_impl=crc32c
--flush_one_in=1000000 --format_version=3
--get_current_wal_file_one_in=0 --get_live_files_one_in=1000000
--get_property_one_in=1000000 --get_sorted_wal_files_one_in=0
--index_block_restart_interval=15 --index_type=3 --iterpercent=0
--key_len_percent_dist=1,30,69
--level_compaction_dynamic_level_bytes=False --long_running_snapshots=0
--mark_for_compaction_one_file_in=0 --max_background_compactions=1
--max_bytes_for_level_base=67108864 --max_key=25000000 --max_key_len=3
--max_manifest_file_size=1073741824
--max_write_batch_group_size_bytes=16777216 --max_write_buffer_number=3
--max_write_buffer_size_to_maintain=2097152
--memtable_prefix_bloom_size_ratio=0.001
--memtable_whole_key_filtering=1 --memtablerep=skip_list --mmap_read=0
--mock_direct_io=True --nooverwritepercent=1 --open_files=-1
--open_metadata_write_fault_one_in=0 --open_read_fault_one_in=0
--open_write_fault_one_in=0 --ops_per_thread=100000000
--optimize_filters_for_memory=0 --paranoid_file_checks=1
--partition_filters=0 --partition_pinning=2
--pause_background_one_in=1000000 --periodic_compaction_seconds=1000
--prefix_size=-1 --prefixpercent=0 --prepopulate_block_cache=0
--progress_reports=0 --read_fault_one_in=32 --readpercent=100
--recycle_log_file_num=1 --reopen=0 --reserve_table_reader_memory=1
--ribbon_starting_level=999 --secondary_cache_fault_one_in=0
--set_options_one_in=0 --snapshot_hold_ops=100000
--sst_file_manager_bytes_per_sec=0
--sst_file_manager_bytes_per_truncate=0 --subcompactions=2 --sync=0
--sync_fault_injection=False --target_file_size_base=16777216
--target_file_size_multiplier=1 --test_batches_snapshots=0
--top_level_index_pinning=3 --unpartitioned_pinning=2
--use_block_based_filter=0 --use_clock_cache=0
--use_direct_io_for_flush_and_compaction=1 --use_direct_reads=0
--use_full_merge_v1=0 --use_merge=1 --use_multiget=1
--user_timestamp_size=0 --value_size_mult=32 --verify_checksum=1
--verify_checksum_one_in=1000000 --verify_db_one_in=100000
--wal_compression=none --write_buffer_size=33554432 --write_dbid_to_manifest=1 --write_fault_one_in=0 --writepercent=0

Reviewers:

Subscribers:

Tasks:

Tags:
@facebook-github-bot
Copy link
Copy Markdown
Contributor

@akankshamahajan15 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

1 similar comment
@facebook-github-bot
Copy link
Copy Markdown
Contributor

@akankshamahajan15 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

pingyu pushed a commit to pingyu/rocksdb that referenced this pull request Oct 23, 2022
Summary:
Fix stress test failure in ReadAsync by ignoring errors
injected during async read by FaultInjectionFS.
Failure:
```
 WARNING: prefix_size is non-zero but memtablerep != prefix_hash
Didn't get expected error from MultiGet.
num_keys 14 Expected 1 errors, seen 0
Callstack that injected the fault
Injected error type = 32538
Message: error;
#0   ./db_stress() [0x6f7dd4] rocksdb::port::SaveStack(int*, int)	/data/sandcastle/boxes/trunk-hg-fbcode-fbsource/fbcode/internal_repo_rocksdb/repo/port/stack_trace.cc:152
facebook#1   ./db_stress() [0x7f2bda] rocksdb::FaultInjectionTestFS::InjectThreadSpecificReadError(rocksdb::FaultInjectionTestFS::ErrorOperation, rocksdb::Slice*, bool, char*, bool, bool*)	/data/sandcastle/boxes/trunk-hg-fbcode-fbsource/fbcode/internal_repo_rocksdb/repo/utilities/fault_injection_fs.cc:891
facebook#2   ./db_stress() [0x7f2e78] rocksdb::TestFSRandomAccessFile::Read(unsigned long, unsigned long, rocksdb::IOOptions const&, rocksdb::Slice*, char*, rocksdb::IODebugContext*) const	/data/sandcastle/boxes/trunk-hg-fbcode-fbsource/fbcode/internal_repo_rocksdb/repo/utilities/fault_injection_fs.cc:367
facebook#3   ./db_stress() [0x6483d7] rocksdb::(anonymous namespace)::CompositeRandomAccessFileWrapper::Read(unsigned long, unsigned long, rocksdb::Slice*, char*) const	/data/sandcastle/boxes/trunk-hg-fbcode-fbsource/fbcode/internal_repo_rocksdb/repo/env/composite_env.cc:61
facebook#4   ./db_stress() [0x654564] rocksdb::(anonymous namespace)::LegacyRandomAccessFileWrapper::Read(unsigned long, unsigned long, rocksdb::IOOptions const&, rocksdb::Slice*, char*, rocksdb::IODebugContext*) const	/data/sandcastle/boxes/trunk-hg-fbcode-fbsource/fbcode/internal_repo_rocksdb/repo/env/env.cc:152
facebook#5   ./db_stress() [0x659b3b] rocksdb::FSRandomAccessFile::ReadAsync(rocksdb::FSReadRequest&, rocksdb::IOOptions const&, std::function<void (rocksdb::FSReadRequest const&, void*)>, void*, void**, std::function<void (void*)>*, rocksdb::IODebugContext*)	/data/sandcastle/boxes/trunk-hg-fbcode-fbsource/fbcode/internal_repo_rocksdb/repo/./include/rocksdb/file_system.h:896
facebook#6   ./db_stress() [0x8b8bab] rocksdb::RandomAccessFileReader::ReadAsync(rocksdb::FSReadRequest&, rocksdb::IOOptions const&, std::function<void (rocksdb::FSReadRequest const&, void*)>, void*, void**, std::function<void (void*)>*, rocksdb::Env::IOPriority)	/data/sandcastle/boxes/trunk-hg-fbcode-fbsource/fbcode/internal_repo_rocksdb/repo/file/random_access_file_reader.cc:459
facebook#7   ./db_stress() [0x8b501f] rocksdb::FilePrefetchBuffer::ReadAsync(rocksdb::IOOptions const&, rocksdb::RandomAccessFileReader*, rocksdb::Env::IOPriority, unsigned long, unsigned long, unsigned long, unsigned int)	/data/sandcastle/boxes/trunk-hg-fbcode-fbsource/fbcode/internal_repo_rocksdb/repo/file/file_prefetch_buffer.cc:124
facebook#8   ./db_stress() [0x8b55fc] rocksdb::FilePrefetchBuffer::PrefetchAsync(rocksdb::IOOptions const&, rocksdb::RandomAccessFileReader*, unsigned long, unsigned long, unsigned long, rocksdb::Env::IOPriority, bool&)	/data/sandcastle/boxes/trunk-hg-fbcode-fbsource/fbcode/internal_repo_rocksdb/repo/file/file_prefetch_buffer.cc:363
facebook#9   ./db_stress() [0x8b61f8] rocksdb::FilePrefetchBuffer::TryReadFromCacheAsync(rocksdb::IOOptions const&, rocksdb::RandomAccessFileReader*, unsigned long, unsigned long, rocksdb::Slice*, rocksdb::Status*, rocksdb::Env::IOPriority, bool)	/data/sandcastle/boxes/trunk-hg-fbcode-fbsource/fbcode/internal_repo_rocksdb/repo/file/file_prefetch_buffer.cc:482
facebook#10  ./db_stress() [0x745e04] rocksdb::BlockFetcher::TryGetFromPrefetchBuffer()	/data/sandcastle/boxes/trunk-hg-fbcode-fbsource/fbcode/internal_repo_rocksdb/repo/table/block_fetcher.cc:76
```

Pull Request resolved: facebook#9824

Test Plan:
```
./db_stress --acquire_snapshot_one_in=10000 --adaptive_readahead=1 --allow_concurrent_memtable_write=0 --async_io=1 --atomic_flush=1 --avoid_flush_during_recovery=0 --avoid_unnecessary_blocking_io=0 -- backup_max_size=104857600 --backup_one_in=100000 --batch_protection_bytes_per_key=0 --block_size=16384 --bloom_bits=5.037629726741734 --bottommost_compression_type=lz4hc --cache_index_and_filter_blocks=0 --cache_size=8388608 --checkpoint_one_in=1000000 --checksum_type=kxxHash --clear_column_family_one_in=0 --column_families=1 --compact_files_one_in=1000000 --compact_range_one_in=1000000 --compaction_ttl=100 --compression_max_dict_buffer_bytes=1073741823 --compression_max_dict_bytes=16384 --compression_parallel_threads=1 --compression_type=zstd --compression_zstd_max_train_bytes=0 --continuous_verification_interval=0 --db=/home/akankshamahajan/dev/shm/rocksdb/rocksdb_crashtest_blackbox --db_write_buffer_size=8388608 --delpercent=0 --delrangepercent=0 --destroy_db_initially=0 - detect_filter_construct_corruption=1 --disable_wal=1 --enable_compaction_filter=0 --enable_pipelined_write=0 --expected_values_dir=/home/akankshamahajan/dev/shm/rocksdb/rocksdb_crashtest_expected --experimental_mempurge_threshold=8.772789063014715 --fail_if_options_file_error=0 --file_checksum_impl=crc32c --flush_one_in=1000000 --format_version=3 --get_current_wal_file_one_in=0 --get_live_files_one_in=1000000 --get_property_one_in=1000000 --get_sorted_wal_files_one_in=0 --index_block_restart_interval=15 --index_type=3 --iterpercent=0 --key_len_percent_dist=1,30,69 --level_compaction_dynamic_level_bytes=False --long_running_snapshots=0 --mark_for_compaction_one_file_in=0 --max_background_compactions=1 --max_bytes_for_level_base=67108864 --max_key=25000000 --max_key_len=3 --max_manifest_file_size=1073741824 --max_write_batch_group_size_bytes=16777216 --max_write_buffer_number=3 --max_write_buffer_size_to_maintain=2097152 --memtable_prefix_bloom_size_ratio=0.001 --memtable_whole_key_filtering=1 --memtablerep=skip_list --mmap_read=0 --mock_direct_io=True --nooverwritepercent=1 --open_files=-1 --open_metadata_write_fault_one_in=0 --open_read_fault_one_in=0 --open_write_fault_one_in=0 --ops_per_thread=100000000 --optimize_filters_for_memory=0 --paranoid_file_checks=1 --partition_filters=0 --partition_pinning=2 --pause_background_one_in=1000000 --periodic_compaction_seconds=1000 --prefix_size=-1 --prefixpercent=0 --prepopulate_block_cache=0 --progress_reports=0 --read_fault_one_in=32 --readpercent=100 --recycle_log_file_num=1 --reopen=0 --reserve_table_reader_memory=1 --ribbon_starting_level=999 --secondary_cache_fault_one_in=0 --set_options_one_in=0 --snapshot_hold_ops=100000 --sst_file_manager_bytes_per_sec=0 --sst_file_manager_bytes_per_truncate=0 --subcompactions=2 --sync=0 --sync_fault_injection=False --target_file_size_base=16777216 --target_file_size_multiplier=1 --test_batches_snapshots=0 --top_level_index_pinning=3 --unpartitioned_pinning=2 --use_block_based_filter=0 --use_clock_cache=0 --use_direct_io_for_flush_and_compaction=1 --use_direct_reads=0 --use_full_merge_v1=0 --use_merge=1 --use_multiget=1 --user_timestamp_size=0 --value_size_mult=32 --verify_checksum=1 --verify_checksum_one_in=1000000 --verify_db_one_in=100000 --wal_compression=none --write_buffer_size=33554432 --write_dbid_to_manifest=1 --write_fault_one_in=0 --writepercent=0
```

Reviewed By: anand1976

Differential Revision: D35514566

Pulled By: akankshamahajan15

fbshipit-source-id: e2a868fdd7422604774c1419738f9926a21e92a4
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants