Skip to content

osd, os: drop the OmapIterator in favor of omap_iterate. Remove ObjectMap#60890

Merged
rzarzynski merged 24 commits intoceph:mainfrom
rzarzynski:wip-os-fastomapiter-killomapiterator
Apr 7, 2025
Merged

osd, os: drop the OmapIterator in favor of omap_iterate. Remove ObjectMap#60890
rzarzynski merged 24 commits intoceph:mainfrom
rzarzynski:wip-os-fastomapiter-killomapiterator

Conversation

@rzarzynski
Copy link
Contributor

@rzarzynski rzarzynski commented Nov 29, 2024

This is a follow-up on #60278. In contrast to it, this PR is not supposed to be backported.

Contribution Guidelines

  • To sign and title your commits, please refer to Submitting Patches to Ceph.

  • If you are submitting a fix for a stable branch (e.g. "quincy"), please refer to Submitting Patches to Ceph - Backports for the proper workflow.

  • When filling out the below checklist, you may click boxes directly in the GitHub web UI. When entering or editing the entire PR message in the GitHub web UI editor, you may also select a checklist item by adding an x between the brackets: [x]. Spaces and capitalization matter when checking off items this way.

Checklist

  • Tracker (select at least one)
    • References tracker ticket
    • Very recent bug; references commit where it was introduced
    • New feature (ticket optional)
    • Doc update (no ticket needed)
    • Code cleanup (no ticket needed)
  • Component impact
    • Affects Dashboard, opened tracker ticket
    • Affects Orchestrator, opened tracker ticket
    • No impact that needs to be tracked
  • Documentation (select at least one)
    • Updates relevant documentation
    • No doc update is appropriate
  • Tests (select at least one)
Show available Jenkins commands
  • jenkins retest this please
  • jenkins test classic perf
  • jenkins test crimson perf
  • jenkins test signed
  • jenkins test make check
  • jenkins test make check arm64
  • jenkins test submodules
  • jenkins test dashboard
  • jenkins test dashboard cephadm
  • jenkins test api
  • jenkins test docs
  • jenkins render docs
  • jenkins test ceph-volume all
  • jenkins test ceph-volume tox
  • jenkins test windows
  • jenkins test rook e2e

@rzarzynski
Copy link
Contributor Author

It turned out that dropping the OmapIterator allowed to squeeze DBObjectMap and, finally, ObjectMap as well.

@rzarzynski rzarzynski changed the title osd, os: drop the OmapIterator in favor of omap_iterate osd, os: drop the OmapIterator in favor of omap_iterate. Remove ObjectMap Dec 11, 2024
#include <map>
#include <optional>
#include <string>
#include <string_view>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(a note that refers to the commit description)
Maybe add, after the perf data, a word about what it is replaced with?

std::string_view value_as_sv() override
{
if (smaller == on_main) {
return main->value_as_sv();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a comment line reminding the reader what the equality means? (I know it is explained in the way 'smaller' is set, but still...)

Copy link
Contributor

@ronen-fr ronen-fr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WIP - I'll continue tomorrow

mapping.snap < end &&
mapping.hoid.pool == pool);
// invalid
dout(10) << __func__ << " stray " << mapping.hoid
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this one really deserves an fmt::format()!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will send a follow up.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The commit is ready, will push after addressing other comments.

};

template <class ClockT = mono_clock>
class time_guard {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is 'guard' the correct name? it's just a collector, with no alarm, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's analogy to the scope_guard we already have.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No sure that's a good reason...

Copy link
Contributor Author

@rzarzynski rzarzynski Dec 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

<BETTER NAME NEED>!

}

// TODO: deduplicate the code, preferrably by removing the string variant
int RocksDBStore::split_key(rocksdb::Slice in, string_view *prefix, string_view *key)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd rather simply return an optional

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would need to be an optional of pair of two strings (and string_views) – there are callers who rely on the if (param != nullptr) behavior. Sounds complex.

Copy link
Contributor

@ronen-fr ronen-fr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM (and a nice improvement).
Some notes/questions etc.
I'd prefer that Matan also comments on the removal of the snap-mapper format repair code.

static uint32_t crc32_netstring(const uint32_t orig_crc, T data)
{
// this MUST be compliant marshalling format for bufferlist!
// otherwise with scrub during upgrade will happen.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does the 2nd comment line means?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is the assumption on the on-wire format of netstring. Reworking the comment. Thanks!

auto crc = ceph_crc32c(orig_crc, (unsigned char*)&len, sizeof(len));
crc = ceph_crc32c(crc, (unsigned char*)data.c_str(), data.length());

#ifndef _NDEBUG
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems that we rarely use _NDEBUG in Ceph code. Is this the right ifdef?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm pretty sure it is. My intention is to have this only in debug builds due to performance penalty.

std::string omap_pos;
int ret = 0;
ceph::buffer::hash data_hash, omap_hash; ///< accumulatinng hash value
ceph::buffer::hash data_hash; ///< accumulatinng hash value
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sp in the comment

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed; will push soon.

if (--max == 0) {
more = true;
pos.omap_pos = key;
return ObjectStore::omap_iter_ret_t::STOP;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If 'max' is 1 when entering, do we update the hashes?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Previously we were. Good catch, thanks!

next_lat_acc,
c->store->cct->_conf->bluestore_log_omap_iterator_age);
return 0;
return more;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd rather not return bool as int.

src/osd/PGLog.h Outdated
}
missing.add(oid, std::move(item));
} else if (p->key().substr(0, 4) == std::string("dup_")) {
} else if (key.substr(0, 4) == std::string("dup_")) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

starts_with ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Refactored altogether with missing case.

r = SnapMapper::convert_legacy(cct, store.get(), ch, hoid, max);
if (r < 0)
goto out;
derr << __func__ << " snap_mapper upgrade from pre-octopus"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Matan-B - any problem with this?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approved here - #60855

ceph::signedspan total_time;
string tail_key;
std::tie(coll, ghobj, first_seek_time, last_seek_time, total_time, tail_key) = *i;
std::tie(coll, ghobj, first_seek_time, last_seek_time, total_time) = *i;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

structured binding instead?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

@rzarzynski
Copy link
Contributor Author

Dropping the legacy key conversion of SnapMapper is sent separately for review: #60855.

@rzarzynski rzarzynski force-pushed the wip-os-fastomapiter-killomapiterator branch 2 times, most recently from 920b388 to 6cb432f Compare December 17, 2024 19:23
@github-actions
Copy link

This pull request can no longer be automatically merged: a rebase is needed and changes have to be manually resolved

When `get_omap_iterator()` is replaced with `omap_iterate()`,
the CRC calculations will be happening in-place, with 0copy.

Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
`tail_key`, and even presence of some key designated to
be a gaaurd, seems to an impementation detail of
particular `ObjectStore`.

I doubt this piece of information is useful enough to
introduce a getter for it to the `ObjectStore` interface.

Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
Please note this also drops a BlueStore's perf counter:
`l_bluestore_omap_seek_to_first_lat`.

Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
Please don't confuse with osdmaptool.

Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
@rzarzynski rzarzynski force-pushed the wip-os-fastomapiter-killomapiterator branch from e452018 to 65b67f4 Compare April 4, 2025 18:22
@rzarzynski
Copy link
Contributor Author

Squashed a fixup and rebased after trivial conflits due to the s/WITH_SEASTAR/WITH_CRIMSON renaming. No need to restart the ongoing QA, IMHO.

@ljflores
Copy link
Member

ljflores commented Apr 7, 2025

@rzarzynski rzarzynski merged commit 5a91525 into ceph:main Apr 7, 2025
12 checks passed
Matan-B added a commit to Matan-B/ceph that referenced this pull request May 26, 2025
Introduced recently in ceph#60890.
_NDEBUG is a typo for NDEBUG. Possibly confused with _DEBUG which is a
Visual Studio concept.

Signed-off-by: Matan Breizman <mbreizma@redhat.com>
@Matan-B Matan-B mentioned this pull request May 26, 2025
14 tasks
Matan-B added a commit to Matan-B/ceph that referenced this pull request May 26, 2025
Introduced recently in ceph#60890.
_NDEBUG is a typo for NDEBUG. Possibly confused with _DEBUG which is a
Visual Studio concept.

Signed-off-by: Matan Breizman <mbreizma@redhat.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants