Skip to content

cls/cmpomap: empty values are 0 in U64 comparisons#42740

Merged
cbodley merged 1 commit intoceph:masterfrom
cbodley:wip-52128
Aug 20, 2021
Merged

cls/cmpomap: empty values are 0 in U64 comparisons#42740
cbodley merged 1 commit intoceph:masterfrom
cbodley:wip-52128

Conversation

@cbodley
Copy link
Contributor

@cbodley cbodley commented Aug 10, 2021

previously, when trying to use cmpomap interfaces on an omap key with an empty value, U64 comparisons would fail to decode with -EIO. so cmp_set_vals() and cmp_rm_keys() are unable to update or remove such keys

for backward-compatibility with rgw's data sync error repo, where the keys used to have empty values, enable these comparisons by treating an empty value as 0

Fixes: https://tracker.ceph.com/issues/52128

Checklist

  • References tracker ticket
  • Updates documentation if necessary
  • Includes tests for new functionality or reproducer for bug

Show available Jenkins commands
  • jenkins retest this please
  • jenkins test classic perf
  • jenkins test crimson perf
  • jenkins test signed
  • jenkins test make check
  • jenkins test make check arm64
  • jenkins test submodules
  • jenkins test dashboard
  • jenkins test api
  • jenkins test docs
  • jenkins render docs
  • jenkins test ceph-volume all
  • jenkins test ceph-volume tox

@cbodley
Copy link
Contributor Author

cbodley commented Aug 10, 2021

tested with a two-zone configuration with mstart.sh. instead of testing a real upgrade, i manually injected an omap key with:

~/ceph/build $ bin/rados -c run/c2/ceph.conf -p na-2.rgw.log setomapval datalog.sync-status.shard.e4479890-0269-4dbb-9797-bb526937c491.0.retry fakebucketname ''

without this fix, radosgw-admin sync status would show recovering shards: [0] forever. it would retry sync on that key, then try to remove it with cmpomap.cmp_rm_keys. but the key comparison failed, so the key never gets removed

with this fix, this injected key is removed successfully and radosgw-admin sync status catches up as expected

Copy link
Contributor

@adamemerson adamemerson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let it be merged.

@yuvalif
Copy link
Contributor

yuvalif commented Aug 11, 2021

the code generic and may be used in other contexts. having "" == 0 might be considered wrong depending on the usecase.
IMO, would be better to always return "0" (=false) in case of an unknown value (assuming that would be ok for the syncing case)

previously, when trying to use cmpomap interfaces on an omap key with
an empty value, U64 comparisons would fail to decode with -EIO. so
cmp_set_vals() and cmp_rm_keys() are unable to update or remove such
keys

for backward-compatibility with rgw's data sync error repo, where the
keys used to have empty values, enable these comparisons by treating an
empty value as 0

Fixes: https://tracker.ceph.com/issues/52128

Signed-off-by: Casey Bodley <cbodley@redhat.com>
@cbodley
Copy link
Contributor Author

cbodley commented Aug 11, 2021

IMO, would be better to always return "0" (=false) in case of an unknown value (assuming that would be ok for the syncing case)

both cmp_set_vals() and cmp_rm_keys() already treat this -EIO error the same as 0, so you still wouldn't be able to update or remove empty values

    if (r == -EIO) {
      r = 0; // treat EIO as a failed comparison
    }

backward compat with multisite's error repo was a requirement for this feature, but i lost sight of it as we iterated on the design

i updated the client API comments to clarify this new behavior. does that work @yuvalif?

@mattbenjamin
Copy link
Contributor

IMO, would be better to always return "0" (=false) in case of an unknown value (assuming that would be ok for the syncing case)

both cmp_set_vals() and cmp_rm_keys() already treat this -EIO error the same as 0, so you still wouldn't be able to update or remove empty values

    if (r == -EIO) {
      r = 0; // treat EIO as a failed comparison
    }

backward compat with multisite's error repo was a requirement for this feature, but i lost sight of it as we iterated on the design

i updated the client API comments to clarify this new behavior. does that work @yuvalif?

Is there other risk of breaking compatibility with other users? Having an explicit argument to control the changed semantics could maybe be an option?

Matt

@cbodley
Copy link
Contributor Author

cbodley commented Aug 11, 2021

Is there other risk of breaking compatibility with other users?

no, the multisite error repo is the only user

Having an explicit argument to control the changed semantics could maybe be an option?

it could, but i'm not sure it's worth expanding the interfaces for it. these interfaces support both U64 and String comparisons, but this change in behavior is specific to U64. the string comparisons already work as expected against empty omap values (empty strings compare less than non-empty strings), so this is essentially changing the behavior of U64 comparisons against empty values to match

@mattbenjamin
Copy link
Contributor

Is there other risk of breaking compatibility with other users?

no, the multisite error repo is the only user

Having an explicit argument to control the changed semantics could maybe be an option?

it could, but i'm not sure it's worth expanding the interfaces for it. these interfaces support both U64 and String comparisons, but this change in behavior is specific to U64. the string comparisons already work as expected against empty omap values (empty strings compare less than non-empty strings), so this is essentially changing the behavior of U64 comparisons against empty values to match

if we're the only consumer so far, I agree, it's not needed

@cbodley
Copy link
Contributor Author

cbodley commented Aug 20, 2021

@cbodley cbodley merged commit 16b9538 into ceph:master Aug 20, 2021
@cbodley cbodley deleted the wip-52128 branch August 20, 2021 14:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants