Bug #67179

Make rm-pg-upmap-primary able to remove mappings by force

Added by Laura Flores over 1 year ago. Updated 5 months ago.

Status:
Resolved
Priority:
High
Assignee:
Category:
-
Target version:
-
% Done:

0%

Source:
Backport:
reef,squid
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
Pull request ID:
Tags (freeform):
backport_processed
Fixed In:
v20.0.0-557-g8d229a9eba
Released In:
v20.2.0~833
Upkeep Timestamp:
2025-11-01T01:32:43+00:00

Description

Although invalid pg-upmap-primary mappings should never be present in the osdmap, users affected by the bug tracked in https://tracker.ceph.com/issues/66867 may need to remove mappings for pgs that no longer exist. This can be achieved by adding a "--force" flag to the existing CLI command.
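The behavior described above can be modeled with a short sketch. This is illustrative Python, not the actual Ceph C++ code; the function name and dict-based map model are assumptions. A normal removal first checks that the pgid belongs to an existing pool (which fails for pools that were deleted), while a forced removal skips that check and only requires that a mapping entry exists:

```python
def rm_pg_upmap_primary(pg_upmap_primary, existing_pools, pgid, force=False):
    """Remove a pg_upmap_primary mapping keyed by pgid (e.g. '5.4').

    pg_upmap_primary: dict mapping pgid string -> primary osd id
    existing_pools:   set of pool ids that still exist
    """
    pool_id = int(pgid.split(".")[0])
    if not force and pool_id not in existing_pools:
        # Mirrors the "Error ENOENT: pgid '5.4' does not exist" behavior
        # seen in the transcript below.
        raise KeyError(f"pgid '{pgid}' does not exist")
    if pgid in pg_upmap_primary:
        del pg_upmap_primary[pgid]
        return True
    return False
```

With this model, a mapping for a deleted pool (e.g. pool 5) can only be removed when the force path bypasses the pool-existence check.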


Files

osd.map (28.3 KB) — Reily Siegel, 07/29/2024 09:15 PM

Related issues 5 (1 open, 4 closed)

Related to RADOS - Bug #66867: pg_upmap_primary items are retained in OSD map for a pool which is already deleted (Resolved, Laura Flores)
Related to Ceph - Bug #67265: Make sure *pgupmap* entries are removed when changing number of pgs per pool (Pending Backport, Laura Flores)
Has duplicate RADOS - Bug #69760: Monitors crash largely due to the structure of pg-upmap-primary (Duplicate, Laura Flores)
Copied to RADOS - Backport #70591: reef: Make rm-pg-upmap-primary able to remove mappings by force (Resolved, Laura Flores)
Copied to RADOS - Backport #70592: squid: Make rm-pg-upmap-primary able to remove mappings by force (Resolved, Laura Flores)
Actions #1

Updated by Laura Flores over 1 year ago

  • Tracker changed from Enhancement to Bug
  • Regression set to No
  • Severity set to 3 - minor
Actions #2

Updated by Laura Flores over 1 year ago

  • Priority changed from Normal to High
Actions #3

Updated by Laura Flores over 1 year ago

  • Related to Bug #66867: pg_upmap_primary items are retained in OSD map for a pool which is already deleted added
Actions #4

Updated by Laura Flores over 1 year ago

  • Assignee set to Laura Flores
Actions #5

Updated by Laura Flores over 1 year ago

  • Backport set to reef,squid
Actions #6

Updated by Radoslaw Zarzynski over 1 year ago

Do we have a similar thing for upmap-items & co.?

Actions #7

Updated by Reily Siegel over 1 year ago

I've encountered this issue on my cluster, version 18.2.1. As requested in Slack, here is my OSD map.

Actions #8

Updated by Josh Salomon over 1 year ago

Technically this is a very simple fix: the rm-pg-upmap-* commands should not check that the pg exists during command parsing, since they check later whether the pg exists in the pg-upmap maps. The problem should not happen in the first place, because the pg-upmap-* records for a pool are removed during pool deletion, but I wouldn't count on it never happening under race conditions, so fixing it makes sense. BTW, the same logic (removing the pg-upmap entries) should be applied when changing the number of pgs per pool; see tracker issue #67265.
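The cleanup described in the comment above can be sketched as follows. This is an illustrative Python model with assumed names (`purge_pool_upmaps`, a dict-based osdmap), not Ceph internals: when a pool is deleted, any pg_upmap_items or pg_upmap_primary entry whose pgid belongs to that pool should be dropped in the same step, so no stale records survive.

```python
def purge_pool_upmaps(osdmap, pool_id):
    """Drop pg_upmap_items and pg_upmap_primary entries for pool_id.

    osdmap is modeled as a dict with 'pg_upmap_items' and
    'pg_upmap_primary' sub-dicts keyed by pgid strings like '5.4'
    (pool id, a dot, then the pg number within the pool).
    """
    prefix = f"{pool_id}."
    for key in ("pg_upmap_items", "pg_upmap_primary"):
        osdmap[key] = {
            pgid: v for pgid, v in osdmap[key].items()
            if not pgid.startswith(prefix)
        }
    return osdmap
```

The `"5."` prefix match cannot accidentally hit pool 50, since `"50.x"` does not start with `"5."`; matching on the full `pool_id` plus the dot separator keeps the filter exact.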

Actions #9

Updated by Josh Salomon over 1 year ago

  • Related to Bug #67265: Make sure *pgupmap* entries are removed when changing number of pgs per pool added
Actions #10

Updated by Laura Flores over 1 year ago

  • Status changed from New to In Progress
  • Pull request ID set to 59331
Actions #11

Updated by Laura Flores about 1 year ago

  • Related to Bug #69760: Monitors crash largely due to the structure of pg-upmap-primary added
Actions #12

Updated by Laura Flores about 1 year ago · Edited

Manual testing of the fix:

# Existing pools

$ ./bin/ceph osd lspools
1 rbd
2 .mgr
3 cephfs.a.meta
4 cephfs.a.data
5 foo

# Applied pg-upmap-primary mappings for pools 4 and 5:

$ ./bin/ceph osd dump
epoch 63
fsid 853a8bf4-c46f-459d-8657-bd6f7371d106
created 2025-03-09T22:56:57.060547+0000
modified 2025-03-09T22:59:15.939769+0000
flags sortbitwise,recovery_deletes,purged_snapdirs,pglog_hardlimit
crush_version 8
full_ratio 0.99
backfillfull_ratio 0.99
nearfull_ratio 0.99
require_min_compat_client reef
min_compat_client reef
require_osd_release tentacle
stretch_mode_enabled false
pool 1 'rbd' replicated size 3 min_size 1 crush_rule 0 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change 31 lfor 0/0/27 flags hashpspool stripe_width 0 application rbd read_balance_score 1.63
pool 2 '.mgr' replicated size 3 min_size 1 crush_rule 0 object_hash rjenkins pg_num 1 pgp_num 1 autoscale_mode on last_change 16 flags hashpspool stripe_width 0 pg_num_max 32 pg_num_min 1 application mgr read_balance_score 4.00
pool 3 'cephfs.a.meta' replicated size 3 min_size 1 crush_rule 0 object_hash rjenkins pg_num 16 pgp_num 16 autoscale_mode on last_change 62 lfor 0/0/27 flags hashpspool stripe_width 0 pg_autoscale_bias 4 pg_num_min 16 recovery_priority 5 application cephfs read_balance_score 1.75
pool 4 'cephfs.a.data' replicated size 3 min_size 1 crush_rule 0 object_hash rjenkins pg_num 128 pgp_num 128 autoscale_mode on last_change 31 lfor 0/0/29 flags hashpspool,bulk stripe_width 0 application cephfs read_balance_score 1.16
pool 5 'foo' replicated size 3 min_size 1 crush_rule 0 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change 31 lfor 0/0/29 flags hashpspool stripe_width 0 read_balance_score 1.50
max_osd 4
osd.0 up   in  weight 1 up_from 8 up_thru 62 down_at 0 last_clean_interval [0,0) [v2:127.0.0.1:6802/5398261,v1:127.0.0.1:6803/5398261] [v2:127.0.0.1:6804/5398261,v1:127.0.0.1:6805/5398261] exists,up 20a2f865-80a3-4f4e-9b7d-02f4c01434ac
osd.1 up   in  weight 1 up_from 10 up_thru 31 down_at 0 last_clean_interval [0,0) [v2:127.0.0.1:6810/2721098807,v1:127.0.0.1:6811/2721098807] [v2:127.0.0.1:6812/2721098807,v1:127.0.0.1:6813/2721098807] exists,up b6d51662-6ecc-4598-a7f4-71a4f695fdb9
osd.2 up   in  weight 1 up_from 12 up_thru 60 down_at 0 last_clean_interval [0,0) [v2:127.0.0.1:6818/238372648,v1:127.0.0.1:6819/238372648] [v2:127.0.0.1:6820/238372648,v1:127.0.0.1:6821/238372648] exists,up d0919a5d-d439-40b2-b3fb-9251da6fac44
osd.3 up   in  weight 1 up_from 13 up_thru 58 down_at 0 last_clean_interval [0,0) [v2:127.0.0.1:6826/4144818599,v1:127.0.0.1:6827/4144818599] [v2:127.0.0.1:6828/4144818599,v1:127.0.0.1:6829/4144818599] exists,up f9103c11-d0fd-419e-8610-722f7db02144
pg_upmap_primary 4.4 2
pg_upmap_primary 5.1 2
pg_upmap_primary 5.2 2
pg_upmap_primary 5.4 2

# Deleted pool "foo" 

$ ./bin/ceph osd pool rm foo foo --yes-i-really-really-mean-it
pool 'foo' removed
$ ./bin/ceph osd lspools
1 rbd
2 .mgr
3 cephfs.a.meta
4 cephfs.a.data

# Mappings still exist for deleted pool "foo" (5) due to https://tracker.ceph.com/issues/66867. This is to simulate a user having a mix of valid mappings (for pool 4, which still exists) and invalid mappings (for pool 5, which no longer exists):

$ ./bin/ceph osd dump
*** DEVELOPER MODE: setting PATH, PYTHONPATH and LD_LIBRARY_PATH ***
2025-03-09T23:03:44.823+0000 7fcfcdbd0640 -1 WARNING: all dangerous and experimental features are enabled.
2025-03-09T23:03:44.832+0000 7fcfcdbd0640 -1 WARNING: all dangerous and experimental features are enabled.
epoch 66
fsid 853a8bf4-c46f-459d-8657-bd6f7371d106
created 2025-03-09T22:56:57.060547+0000
modified 2025-03-09T23:00:37.215183+0000
flags sortbitwise,recovery_deletes,purged_snapdirs,pglog_hardlimit
crush_version 8
full_ratio 0.99
backfillfull_ratio 0.99
nearfull_ratio 0.99
require_min_compat_client reef
min_compat_client reef
require_osd_release tentacle
stretch_mode_enabled false
pool 1 'rbd' replicated size 3 min_size 1 crush_rule 0 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change 31 lfor 0/0/27 flags hashpspool stripe_width 0 application rbd read_balance_score 1.63
pool 2 '.mgr' replicated size 3 min_size 1 crush_rule 0 object_hash rjenkins pg_num 1 pgp_num 1 autoscale_mode on last_change 16 flags hashpspool stripe_width 0 pg_num_max 32 pg_num_min 1 application mgr read_balance_score 4.00
pool 3 'cephfs.a.meta' replicated size 3 min_size 1 crush_rule 0 object_hash rjenkins pg_num 16 pgp_num 16 autoscale_mode on last_change 62 lfor 0/0/27 flags hashpspool stripe_width 0 pg_autoscale_bias 4 pg_num_min 16 recovery_priority 5 application cephfs read_balance_score 1.75
pool 4 'cephfs.a.data' replicated size 3 min_size 1 crush_rule 0 object_hash rjenkins pg_num 128 pgp_num 128 autoscale_mode on last_change 31 lfor 0/0/29 flags hashpspool,bulk stripe_width 0 application cephfs read_balance_score 1.16
max_osd 4
osd.0 up   in  weight 1 up_from 8 up_thru 64 down_at 0 last_clean_interval [0,0) [v2:127.0.0.1:6802/5398261,v1:127.0.0.1:6803/5398261] [v2:127.0.0.1:6804/5398261,v1:127.0.0.1:6805/5398261] exists,up 20a2f865-80a3-4f4e-9b7d-02f4c01434ac
osd.1 up   in  weight 1 up_from 10 up_thru 31 down_at 0 last_clean_interval [0,0) [v2:127.0.0.1:6810/2721098807,v1:127.0.0.1:6811/2721098807] [v2:127.0.0.1:6812/2721098807,v1:127.0.0.1:6813/2721098807] exists,up b6d51662-6ecc-4598-a7f4-71a4f695fdb9
osd.2 up   in  weight 1 up_from 12 up_thru 64 down_at 0 last_clean_interval [0,0) [v2:127.0.0.1:6818/238372648,v1:127.0.0.1:6819/238372648] [v2:127.0.0.1:6820/238372648,v1:127.0.0.1:6821/238372648] exists,up d0919a5d-d439-40b2-b3fb-9251da6fac44
osd.3 up   in  weight 1 up_from 13 up_thru 58 down_at 0 last_clean_interval [0,0) [v2:127.0.0.1:6826/4144818599,v1:127.0.0.1:6827/4144818599] [v2:127.0.0.1:6828/4144818599,v1:127.0.0.1:6829/4144818599] exists,up f9103c11-d0fd-419e-8610-722f7db02144
pg_upmap_items 4.b [3,2]
pg_upmap_items 4.14 [3,2]
pg_upmap_items 4.18 [3,0]
pg_upmap_items 4.30 [3,2]
pg_upmap_items 4.45 [3,2]
pg_upmap_items 4.63 [3,0]
pg_upmap_primary 4.4 2
pg_upmap_primary 5.1 2
pg_upmap_primary 5.2 2
pg_upmap_primary 5.4 2

# Try removing with the normal command (first a valid, then an invalid mapping). The valid mapping is cleared, but the invalid mapping cannot be removed:

$ ./bin/ceph osd rm-pg-upmap-primary 4.4
clear 4.4 pg_upmap_primary mapping
$ ./bin/ceph osd rm-pg-upmap-primary 5.4
Error ENOENT: pgid '5.4' does not exist

# Now apply the new rm-pg-upmap-primary-all command as a fail-safe to remove all mappings, valid and invalid:

$ ./bin/ceph osd rm-pg-upmap-primary-all
cleared all pg_upmap_primary mappings
$ ./bin/ceph osd dump
epoch 69
fsid 853a8bf4-c46f-459d-8657-bd6f7371d106
created 2025-03-09T22:56:57.060547+0000
modified 2025-03-09T23:07:29.346415+0000
flags sortbitwise,recovery_deletes,purged_snapdirs,pglog_hardlimit
crush_version 8
full_ratio 0.99
backfillfull_ratio 0.99
nearfull_ratio 0.99
require_min_compat_client reef
min_compat_client luminous
require_osd_release tentacle
stretch_mode_enabled false
pool 1 'rbd' replicated size 3 min_size 1 crush_rule 0 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change 31 lfor 0/0/27 flags hashpspool stripe_width 0 application rbd read_balance_score 1.63
pool 2 '.mgr' replicated size 3 min_size 1 crush_rule 0 object_hash rjenkins pg_num 1 pgp_num 1 autoscale_mode on last_change 16 flags hashpspool stripe_width 0 pg_num_max 32 pg_num_min 1 application mgr read_balance_score 4.00
pool 3 'cephfs.a.meta' replicated size 3 min_size 1 crush_rule 0 object_hash rjenkins pg_num 16 pgp_num 16 autoscale_mode on last_change 62 lfor 0/0/27 flags hashpspool stripe_width 0 pg_autoscale_bias 4 pg_num_min 16 recovery_priority 5 application cephfs read_balance_score 1.75
pool 4 'cephfs.a.data' replicated size 3 min_size 1 crush_rule 0 object_hash rjenkins pg_num 128 pgp_num 128 autoscale_mode on last_change 31 lfor 0/0/29 flags hashpspool,bulk stripe_width 0 application cephfs read_balance_score 1.13
max_osd 4
osd.0 up   in  weight 1 up_from 8 up_thru 64 down_at 0 last_clean_interval [0,0) [v2:127.0.0.1:6802/5398261,v1:127.0.0.1:6803/5398261] [v2:127.0.0.1:6804/5398261,v1:127.0.0.1:6805/5398261] exists,up 20a2f865-80a3-4f4e-9b7d-02f4c01434ac
osd.1 up   in  weight 1 up_from 10 up_thru 67 down_at 0 last_clean_interval [0,0) [v2:127.0.0.1:6810/2721098807,v1:127.0.0.1:6811/2721098807] [v2:127.0.0.1:6812/2721098807,v1:127.0.0.1:6813/2721098807] exists,up b6d51662-6ecc-4598-a7f4-71a4f695fdb9
osd.2 up   in  weight 1 up_from 12 up_thru 64 down_at 0 last_clean_interval [0,0) [v2:127.0.0.1:6818/238372648,v1:127.0.0.1:6819/238372648] [v2:127.0.0.1:6820/238372648,v1:127.0.0.1:6821/238372648] exists,up d0919a5d-d439-40b2-b3fb-9251da6fac44
osd.3 up   in  weight 1 up_from 13 up_thru 58 down_at 0 last_clean_interval [0,0) [v2:127.0.0.1:6826/4144818599,v1:127.0.0.1:6827/4144818599] [v2:127.0.0.1:6828/4144818599,v1:127.0.0.1:6829/4144818599] exists,up f9103c11-d0fd-419e-8610-722f7db02144
pg_upmap_items 4.b [3,2]
pg_upmap_items 4.14 [3,2]
pg_upmap_items 4.18 [3,0]
pg_upmap_items 4.30 [3,2]
pg_upmap_items 4.45 [3,2]
pg_upmap_items 4.63 [3,0]

# All pg_upmap_primary mappings (valid and invalid) are now cleared.
Actions #13

Updated by Laura Flores about 1 year ago

  • Status changed from In Progress to Fix Under Review
  • Pull request ID changed from 59331 to 62190
Actions #14

Updated by Laura Flores about 1 year ago

PR is under testing...

Actions #15

Updated by Laura Flores about 1 year ago

  • Status changed from Fix Under Review to Pending Backport
Actions #16

Updated by Upkeep Bot about 1 year ago

  • Copied to Backport #70591: reef: Make rm-pg-upmap-primary able to remove mappings by force added
Actions #17

Updated by Upkeep Bot about 1 year ago

  • Copied to Backport #70592: squid: Make rm-pg-upmap-primary able to remove mappings by force added
Actions #18

Updated by Upkeep Bot about 1 year ago

  • Tags (freeform) set to backport_processed
Actions #19

Updated by Laura Flores about 1 year ago

  • Related to deleted (Bug #69760: Monitors crash largely due to the structure of pg-upmap-primary)
Actions #20

Updated by Laura Flores about 1 year ago

  • Has duplicate Bug #69760: Monitors crash largely due to the structure of pg-upmap-primary added
Actions #21

Updated by Laura Flores about 1 year ago

Users affected by [1] (a bug in which some pg_upmap_primary mappings are unable to be removed after pool deletion) may use the new command, `ceph osd rm-pg-upmap-primary-all`, to remove all pg-upmap-primary mappings from the osdmap.

Note that both valid and invalid pg-upmap-primary mappings will be removed, which is acceptable since there should be no data movement involved, and it is better algorithmically to start with fresh mappings. After running the command, the user may then rerun the read balancer manually if on Reef or Squid [2], or let the balancing happen automatically via the mgr module if on Squid [3].

[1] https://tracker.ceph.com/issues/66867
[2] https://docs.ceph.com/en/reef/rados/operations/read-balancer/#offline-optimization
[3] https://docs.ceph.com/en/squid/rados/operations/read-balancer/#online-optimization
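The all-clear behavior described above can be modeled in a few lines. This is a sketch of the map-state effect of `ceph osd rm-pg-upmap-primary-all`, not the actual implementation; the function name and the valid/stale tally are assumptions added for illustration. Every pg_upmap_primary entry is dropped regardless of whether its pool still exists, and the balancer is then free to recompute fresh mappings:

```python
def rm_pg_upmap_primary_all(pg_upmap_primary, existing_pools):
    """Clear every pg_upmap_primary mapping; report valid vs stale counts.

    pg_upmap_primary: dict mapping pgid string -> primary osd id
    existing_pools:   set of pool ids that still exist
    """
    valid = stale = 0
    for pgid in list(pg_upmap_primary):
        pool_id = int(pgid.split(".")[0])
        if pool_id in existing_pools:
            valid += 1
        else:
            stale += 1
        del pg_upmap_primary[pgid]
    return valid, stale
```

Applied to the transcript's state (one valid mapping for pool 4, three stale mappings for deleted pool 5), this clears all four entries, matching the "cleared all pg_upmap_primary mappings" output.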

Actions #22

Updated by Laura Flores 12 months ago

  • Status changed from Pending Backport to Resolved
Actions #23

Updated by Upkeep Bot 8 months ago

  • Merge Commit set to 8d229a9ebab1f9c32eb248f2fba8964fc64676c1
  • Fixed In set to v20.0.0-557-g8d229a9ebab
  • Upkeep Timestamp set to 2025-07-10T17:40:52+00:00
Actions #24

Updated by Upkeep Bot 8 months ago

  • Fixed In changed from v20.0.0-557-g8d229a9ebab to v20.0.0-557-g8d229a9eba
  • Upkeep Timestamp changed from 2025-07-10T17:40:52+00:00 to 2025-07-14T22:41:52+00:00
Actions #25

Updated by Upkeep Bot 5 months ago

  • Released In set to v20.2.0~833
  • Upkeep Timestamp changed from 2025-07-14T22:41:52+00:00 to 2025-11-01T01:32:43+00:00