pacific: revival and backport of fix for RocksDB optimized iterators #46096

Merged
yuriw merged 4 commits into ceph:pacific from aclamk:wip-aclamk-unbounded-wholespace-iterator-pacific on May 4, 2022

Conversation

@aclamk (Contributor) commented Apr 29, 2022

Revival of:
#45963
Backport of:
#46095

Backport trackers: https://tracker.ceph.com/issues/55518 and https://tracker.ceph.com/issues/55442


Limits RocksDB omap Seek operations to the relevant key range of the object's omap.
This prevents RocksDB from unnecessarily iterating over delete range tombstones in
irrelevant omap CF shards. Avoids extreme performance degradation commonly caused
by tombstones generated from RGW bucket resharding cleanup. Also prefer CFIteratorImpl
over ShardMergeIteratorImpl when we can determine that all keys within specified
IteratorBounds must be in a single CF.

Fixes: https://tracker.ceph.com/issues/55324
Signed-off-by: Cory Snyder <csnyder@iland.com>
(cherry picked from commit 850c16c)
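The effect of bounded iteration can be illustrated with a self-contained sketch. Here a `std::map` stands in for the RocksDB key space and `bounded_scan` is a hypothetical helper, not Ceph code; the actual change sets `iterate_lower_bound`/`iterate_upper_bound` on the RocksDB `ReadOptions` so a Seek never touches entries (including tombstones) outside the object's omap range:

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Sketch only: std::map stands in for the RocksDB key space.
// Collect keys in [lower, upper), counting how many entries are touched.
std::vector<std::string> bounded_scan(const std::map<std::string, std::string>& kv,
                                      const std::string& lower,
                                      const std::string& upper,
                                      std::size_t* steps) {
  std::vector<std::string> out;
  *steps = 0;
  // lower_bound() mirrors a bounded Seek: start inside the relevant range
  // rather than walking from the beginning of the key space.
  for (auto it = kv.lower_bound(lower); it != kv.end() && it->first < upper; ++it) {
    ++*steps;
    out.push_back(it->first);
  }
  return out;
}
```

With bounds in place, a scan over one object's omap never pays for tombstones left behind in neighboring key ranges.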
…isabled

Add osd_rocksdb_iterator_bounds_enabled config option to allow rocksdb iterator bounds to be disabled.
Also includes minor refactoring to shorten code associated with IteratorBounds initialization in bluestore.

Signed-off-by: Cory Snyder <csnyder@iland.com>
(cherry picked from commit ca3ccd9)

Conflicts:
	src/common/options/osd.yaml.in

Cherry-pick notes:
- Conflicts due to option definition in common/options.cc in Pacific vs. common/options/osd.yaml.in in later releases
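For operators who need to rule the new behavior out while debugging, the option named in this commit can be disabled. A hedged, illustrative snippet (only the option name comes from the commit message; check your release's defaults):

```ini
[osd]
# Added by this backport; set to false to disable RocksDB iterator bounds.
osd_rocksdb_iterator_bounds_enabled = false
```

The same setting can also be applied at runtime via `ceph config set osd osd_rocksdb_iterator_bounds_enabled false`.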
…rBounds)

Adds a precondition to RocksDBStore::get_cf_handle(string, IteratorBounds)
to avoid duplicating logic of the only caller (RocksDBStore::get_iterator).
Assertions will fail if preconditions are not met.

Signed-off-by: Cory Snyder <csnyder@iland.com>
(cherry picked from commit 55ef16f)
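The precondition pattern described here can be sketched in miniature. Everything below is illustrative, not Ceph's actual code: the struct fields echo the commit message, and the toy "shard by first byte" routing stands in for RocksDB's real column-family sharding. The point is that the callee asserts what its only caller (`get_iterator`) already guarantees, instead of duplicating the caller's checks:

```cpp
#include <cassert>
#include <optional>
#include <string>

// Illustrative only; names echo the commit message.
struct IteratorBounds {
  std::optional<std::string> lower_bound;
  std::optional<std::string> upper_bound;
};

// Toy sharding: route a key to a column family by its first byte.
inline int shard_of(const std::string& key, int num_shards) {
  return key.empty() ? 0 : (static_cast<unsigned char>(key[0]) % num_shards);
}

// The callee asserts its preconditions; if the caller ever violates the
// contract, the assertion fails loudly instead of silently misrouting.
int single_cf_for_bounds(const IteratorBounds& bounds, int num_shards) {
  assert(bounds.lower_bound && bounds.upper_bound);  // bounds fully specified
  int lo = shard_of(*bounds.lower_bound, num_shards);
  int hi = shard_of(*bounds.upper_bound, num_shards);
  assert(lo == hi);  // bounds must resolve to a single column family
  return lo;
}
```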
…ounded iterator

The iterator-bounding feature limits RocksDB iterators so that they are less likely to traverse tombstones. It is used when listing keys in a fixed range, for example the omap of a specific object.

Extending this logic to WholeSpaceIterator is problematic, since the prefix must be taken into account.

Fixes: https://tracker.ceph.com/issues/55444

Signed-off-by: Adam Kupczyk <akupczyk@redhat.com>
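Why the prefix matters can be shown with a small sketch. The encoding below (`"prefix\0key"`) is an assumption for illustration only, not necessarily Ceph's actual KeyValueDB key layout: because combined keys sort by prefix first, a bound expressed as a raw key would cut across prefixes, so whole-space bounds must be built in combined form:

```cpp
#include <string>
#include <utility>

// Assumed encoding for illustration: a whole-space key is "prefix\0key".
inline std::string combine(const std::string& prefix, const std::string& key) {
  return prefix + '\0' + key;
}

// Bounds for a whole-space scan must be expressed in combined form so the
// prefix participates in the comparison.
inline std::pair<std::string, std::string> whole_space_bounds(
    const std::string& prefix, const std::string& lower, const std::string& upper) {
  return {combine(prefix, lower), combine(prefix, upper)};
}

inline bool in_range(const std::string& k,
                     const std::pair<std::string, std::string>& b) {
  return k >= b.first && k < b.second;
}
```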
@aclamk force-pushed the wip-aclamk-unbounded-wholespace-iterator-pacific branch from f150a99 to 9cdb2c1 on April 29, 2022 22:56
@neha-ojha (Member) commented:

jenkins test make check

(1 similar comment)

@neha-ojha neha-ojha requested a review from ifed01 April 30, 2022 19:44
@idryomov (Contributor) left a comment:
The backport of #46095 fixes it for my TestClsJournal upgrade reproducer.

@BenoitKnecht (Contributor) commented:

This looks great! We're heavily impacted by https://tracker.ceph.com/issues/55324, so I really hope this PR can make it into 16.2.8! ❤️

@BenoitKnecht (Contributor) commented:

I've tested this PR (applied to 16.2.7) on one of our test clusters with a lot of OMAP data. Without this PR, any rebalancing on the OMAP OSDs would lead to terrible performance until we compacted RocksDB to get rid of the tombstones, but with this PR applied, we don't get any performance hit (or small enough that we don't even notice).

@neha-ojha (Member) commented:

> I've tested this PR (applied to 16.2.7) on one of our test clusters with a lot of OMAP data. Without this PR, any rebalancing on the OMAP OSDs would lead to terrible performance until we compacted RocksDB to get rid of the tombstones, but with this PR applied, we don't get any performance hit (or small enough that we don't even notice).

Thanks for testing this PR! It will be included in 16.2.8.

@neha-ojha (Member) left a comment.

@yuriw yuriw merged commit 73636a1 into ceph:pacific May 4, 2022
