Bug #67179
Make rm-pg-upmap-primary able to remove mappings by force
Status: Closed
Description
Even though it should not be possible for invalid pg-upmap-primary mappings to be in the osdmap, users may need to remove mappings for pgs that no longer exist due to the bug tracked in https://tracker.ceph.com/issues/66867. This can be achieved by adding a "--force" flag to the existing CLI command.
Updated by Laura Flores over 1 year ago
- Tracker changed from Enhancement to Bug
- Regression set to No
- Severity set to 3 - minor
Updated by Laura Flores over 1 year ago
- Related to Bug #66867: pg_upmap_primary items are retained in OSD map for a pool which is already deleted added
Updated by Radoslaw Zarzynski over 1 year ago
Do we have a similar thing for upmap-items & co.?
Updated by Reily Siegel over 1 year ago
I've encountered this issue on my cluster, version 18.2.1. As requested in Slack, here is my OSD map.
Updated by Josh Salomon over 1 year ago
Technically this is a very simple fix: the rm-pg-upmap-* commands should not check that the pg exists during command parsing, since they already check later whether the pg exists in the pg-upmap maps. The problem should not happen in the first place, since the pg-upmap-* records for a pool are removed along with the pool, but I wouldn't count on that always holding under race conditions, so fixing it makes sense. BTW, the same logic (removing the pg-upmap entries) should be applied when changing the number of pgs per pool; see tracker issue #67265.
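The point about the parse-time check can be sketched with a toy model (hypothetical names and data structures, not the actual Ceph monitor code): if the CLI layer rejects any pgid whose pool no longer exists, a stale pg_upmap_primary entry can never even be named by the remove command, even though the later lookup in the upmap table would be sufficient on its own.

```python
# Toy model of the parse-time vs. apply-time existence check
# (hypothetical names, not the real monitor code).
pools = {4: "cephfs.a.data"}             # pool 5 "foo" was deleted
pg_upmap_primary = {"4.4": 2, "5.4": 2}  # stale entry left behind for pool 5

def rm_pg_upmap_primary(pgid, check_pool_exists=True):
    """Remove one mapping. With the parse-time pool check enabled,
    stale entries for deleted pools can never be removed."""
    pool_id = int(pgid.split(".", 1)[0])
    if check_pool_exists and pool_id not in pools:
        return "Error ENOENT: pgid '%s' does not exist" % pgid
    # The later check -- does the mapping itself exist? -- is sufficient.
    if pgid not in pg_upmap_primary:
        return "Error ENOENT: no pg_upmap_primary mapping for %s" % pgid
    del pg_upmap_primary[pgid]
    return "clear %s pg_upmap_primary mapping" % pgid

print(rm_pg_upmap_primary("4.4"))  # valid mapping: removed
print(rm_pg_upmap_primary("5.4"))  # stale mapping: blocked at "parse" time
# Skipping the parse-time check (the gist of the fix) makes it removable:
print(rm_pg_upmap_primary("5.4", check_pool_exists=False))
```

In the toy model, dropping the early pool check leaves the apply-time lookup as the only gatekeeper, which is exactly what makes stale entries reachable again.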
Updated by Josh Salomon over 1 year ago
- Related to Bug #67265: Make sure *pgupmap* entries are removed when changing number of pgs per pool added
Updated by Laura Flores over 1 year ago
- Status changed from New to In Progress
- Pull request ID set to 59331
Updated by Laura Flores about 1 year ago
- Related to Bug #69760: Monitors crash largely due to the structure of pg-upmap-primary added
Updated by Laura Flores about 1 year ago · Edited
Manual testing of the fix:
# Existing pools
$ ./bin/ceph osd lspools
1 rbd
2 .mgr
3 cephfs.a.meta
4 cephfs.a.data
5 foo

# Applied pg-upmap-primary mappings for pools 4 and 5:
$ ./bin/ceph osd dump
epoch 63
fsid 853a8bf4-c46f-459d-8657-bd6f7371d106
created 2025-03-09T22:56:57.060547+0000
modified 2025-03-09T22:59:15.939769+0000
flags sortbitwise,recovery_deletes,purged_snapdirs,pglog_hardlimit
crush_version 8
full_ratio 0.99
backfillfull_ratio 0.99
nearfull_ratio 0.99
require_min_compat_client reef
min_compat_client reef
require_osd_release tentacle
stretch_mode_enabled false
pool 1 'rbd' replicated size 3 min_size 1 crush_rule 0 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change 31 lfor 0/0/27 flags hashpspool stripe_width 0 application rbd read_balance_score 1.63
pool 2 '.mgr' replicated size 3 min_size 1 crush_rule 0 object_hash rjenkins pg_num 1 pgp_num 1 autoscale_mode on last_change 16 flags hashpspool stripe_width 0 pg_num_max 32 pg_num_min 1 application mgr read_balance_score 4.00
pool 3 'cephfs.a.meta' replicated size 3 min_size 1 crush_rule 0 object_hash rjenkins pg_num 16 pgp_num 16 autoscale_mode on last_change 62 lfor 0/0/27 flags hashpspool stripe_width 0 pg_autoscale_bias 4 pg_num_min 16 recovery_priority 5 application cephfs read_balance_score 1.75
pool 4 'cephfs.a.data' replicated size 3 min_size 1 crush_rule 0 object_hash rjenkins pg_num 128 pgp_num 128 autoscale_mode on last_change 31 lfor 0/0/29 flags hashpspool,bulk stripe_width 0 application cephfs read_balance_score 1.16
pool 5 'foo' replicated size 3 min_size 1 crush_rule 0 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change 31 lfor 0/0/29 flags hashpspool stripe_width 0 read_balance_score 1.50
max_osd 4
osd.0 up in weight 1 up_from 8 up_thru 62 down_at 0 last_clean_interval [0,0) [v2:127.0.0.1:6802/5398261,v1:127.0.0.1:6803/5398261] [v2:127.0.0.1:6804/5398261,v1:127.0.0.1:6805/5398261] exists,up 20a2f865-80a3-4f4e-9b7d-02f4c01434ac
osd.1 up in weight 1 up_from 10 up_thru 31 down_at 0 last_clean_interval [0,0) [v2:127.0.0.1:6810/2721098807,v1:127.0.0.1:6811/2721098807] [v2:127.0.0.1:6812/2721098807,v1:127.0.0.1:6813/2721098807] exists,up b6d51662-6ecc-4598-a7f4-71a4f695fdb9
osd.2 up in weight 1 up_from 12 up_thru 60 down_at 0 last_clean_interval [0,0) [v2:127.0.0.1:6818/238372648,v1:127.0.0.1:6819/238372648] [v2:127.0.0.1:6820/238372648,v1:127.0.0.1:6821/238372648] exists,up d0919a5d-d439-40b2-b3fb-9251da6fac44
osd.3 up in weight 1 up_from 13 up_thru 58 down_at 0 last_clean_interval [0,0) [v2:127.0.0.1:6826/4144818599,v1:127.0.0.1:6827/4144818599] [v2:127.0.0.1:6828/4144818599,v1:127.0.0.1:6829/4144818599] exists,up f9103c11-d0fd-419e-8610-722f7db02144
pg_upmap_primary 4.4 2
pg_upmap_primary 5.1 2
pg_upmap_primary 5.2 2
pg_upmap_primary 5.4 2

# Deleted pool "foo"
$ ./bin/ceph osd pool rm foo foo --yes-i-really-really-mean-it
pool 'foo' removed
$ ./bin/ceph osd lspools
1 rbd
2 .mgr
3 cephfs.a.meta
4 cephfs.a.data

# Mappings still exist for deleted pool "foo" (5) due to https://tracker.ceph.com/issues/66867.
# This simulates a user having a mix of valid mappings (for pool 4, which still exists)
# and invalid mappings (for pool 5, which no longer exists):
$ ./bin/ceph osd dump
*** DEVELOPER MODE: setting PATH, PYTHONPATH and LD_LIBRARY_PATH ***
2025-03-09T23:03:44.823+0000 7fcfcdbd0640 -1 WARNING: all dangerous and experimental features are enabled.
2025-03-09T23:03:44.832+0000 7fcfcdbd0640 -1 WARNING: all dangerous and experimental features are enabled.
epoch 66
fsid 853a8bf4-c46f-459d-8657-bd6f7371d106
created 2025-03-09T22:56:57.060547+0000
modified 2025-03-09T23:00:37.215183+0000
flags sortbitwise,recovery_deletes,purged_snapdirs,pglog_hardlimit
crush_version 8
full_ratio 0.99
backfillfull_ratio 0.99
nearfull_ratio 0.99
require_min_compat_client reef
min_compat_client reef
require_osd_release tentacle
stretch_mode_enabled false
pool 1 'rbd' replicated size 3 min_size 1 crush_rule 0 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change 31 lfor 0/0/27 flags hashpspool stripe_width 0 application rbd read_balance_score 1.63
pool 2 '.mgr' replicated size 3 min_size 1 crush_rule 0 object_hash rjenkins pg_num 1 pgp_num 1 autoscale_mode on last_change 16 flags hashpspool stripe_width 0 pg_num_max 32 pg_num_min 1 application mgr read_balance_score 4.00
pool 3 'cephfs.a.meta' replicated size 3 min_size 1 crush_rule 0 object_hash rjenkins pg_num 16 pgp_num 16 autoscale_mode on last_change 62 lfor 0/0/27 flags hashpspool stripe_width 0 pg_autoscale_bias 4 pg_num_min 16 recovery_priority 5 application cephfs read_balance_score 1.75
pool 4 'cephfs.a.data' replicated size 3 min_size 1 crush_rule 0 object_hash rjenkins pg_num 128 pgp_num 128 autoscale_mode on last_change 31 lfor 0/0/29 flags hashpspool,bulk stripe_width 0 application cephfs read_balance_score 1.16
max_osd 4
osd.0 up in weight 1 up_from 8 up_thru 64 down_at 0 last_clean_interval [0,0) [v2:127.0.0.1:6802/5398261,v1:127.0.0.1:6803/5398261] [v2:127.0.0.1:6804/5398261,v1:127.0.0.1:6805/5398261] exists,up 20a2f865-80a3-4f4e-9b7d-02f4c01434ac
osd.1 up in weight 1 up_from 10 up_thru 31 down_at 0 last_clean_interval [0,0) [v2:127.0.0.1:6810/2721098807,v1:127.0.0.1:6811/2721098807] [v2:127.0.0.1:6812/2721098807,v1:127.0.0.1:6813/2721098807] exists,up b6d51662-6ecc-4598-a7f4-71a4f695fdb9
osd.2 up in weight 1 up_from 12 up_thru 64 down_at 0 last_clean_interval [0,0) [v2:127.0.0.1:6818/238372648,v1:127.0.0.1:6819/238372648] [v2:127.0.0.1:6820/238372648,v1:127.0.0.1:6821/238372648] exists,up d0919a5d-d439-40b2-b3fb-9251da6fac44
osd.3 up in weight 1 up_from 13 up_thru 58 down_at 0 last_clean_interval [0,0) [v2:127.0.0.1:6826/4144818599,v1:127.0.0.1:6827/4144818599] [v2:127.0.0.1:6828/4144818599,v1:127.0.0.1:6829/4144818599] exists,up f9103c11-d0fd-419e-8610-722f7db02144
pg_upmap_items 4.b [3,2]
pg_upmap_items 4.14 [3,2]
pg_upmap_items 4.18 [3,0]
pg_upmap_items 4.30 [3,2]
pg_upmap_items 4.45 [3,2]
pg_upmap_items 4.63 [3,0]
pg_upmap_primary 4.4 2
pg_upmap_primary 5.1 2
pg_upmap_primary 5.2 2
pg_upmap_primary 5.4 2

# Try removing with the normal command (first valid, then invalid).
# The valid mapping is cleared, but the invalid mapping is impossible to remove:
$ ./bin/ceph osd rm-pg-upmap-primary 4.4
clear 4.4 pg_upmap_primary mapping
$ ./bin/ceph osd rm-pg-upmap-primary 5.4
Error ENOENT: pgid '5.4' does not exist

# Now apply the new rm-pg-upmap-primary-all as a fail-safe to remove all mappings, valid and invalid:
$ ./bin/ceph osd rm-pg-upmap-primary-all
cleared all pg_upmap_primary mappings
$ ./bin/ceph osd dump
epoch 69
fsid 853a8bf4-c46f-459d-8657-bd6f7371d106
created 2025-03-09T22:56:57.060547+0000
modified 2025-03-09T23:07:29.346415+0000
flags sortbitwise,recovery_deletes,purged_snapdirs,pglog_hardlimit
crush_version 8
full_ratio 0.99
backfillfull_ratio 0.99
nearfull_ratio 0.99
require_min_compat_client reef
min_compat_client luminous
require_osd_release tentacle
stretch_mode_enabled false
pool 1 'rbd' replicated size 3 min_size 1 crush_rule 0 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change 31 lfor 0/0/27 flags hashpspool stripe_width 0 application rbd read_balance_score 1.63
pool 2 '.mgr' replicated size 3 min_size 1 crush_rule 0 object_hash rjenkins pg_num 1 pgp_num 1 autoscale_mode on last_change 16 flags hashpspool stripe_width 0 pg_num_max 32 pg_num_min 1 application mgr read_balance_score 4.00
pool 3 'cephfs.a.meta' replicated size 3 min_size 1 crush_rule 0 object_hash rjenkins pg_num 16 pgp_num 16 autoscale_mode on last_change 62 lfor 0/0/27 flags hashpspool stripe_width 0 pg_autoscale_bias 4 pg_num_min 16 recovery_priority 5 application cephfs read_balance_score 1.75
pool 4 'cephfs.a.data' replicated size 3 min_size 1 crush_rule 0 object_hash rjenkins pg_num 128 pgp_num 128 autoscale_mode on last_change 31 lfor 0/0/29 flags hashpspool,bulk stripe_width 0 application cephfs read_balance_score 1.13
max_osd 4
osd.0 up in weight 1 up_from 8 up_thru 64 down_at 0 last_clean_interval [0,0) [v2:127.0.0.1:6802/5398261,v1:127.0.0.1:6803/5398261] [v2:127.0.0.1:6804/5398261,v1:127.0.0.1:6805/5398261] exists,up 20a2f865-80a3-4f4e-9b7d-02f4c01434ac
osd.1 up in weight 1 up_from 10 up_thru 67 down_at 0 last_clean_interval [0,0) [v2:127.0.0.1:6810/2721098807,v1:127.0.0.1:6811/2721098807] [v2:127.0.0.1:6812/2721098807,v1:127.0.0.1:6813/2721098807] exists,up b6d51662-6ecc-4598-a7f4-71a4f695fdb9
osd.2 up in weight 1 up_from 12 up_thru 64 down_at 0 last_clean_interval [0,0) [v2:127.0.0.1:6818/238372648,v1:127.0.0.1:6819/238372648] [v2:127.0.0.1:6820/238372648,v1:127.0.0.1:6821/238372648] exists,up d0919a5d-d439-40b2-b3fb-9251da6fac44
osd.3 up in weight 1 up_from 13 up_thru 58 down_at 0 last_clean_interval [0,0) [v2:127.0.0.1:6826/4144818599,v1:127.0.0.1:6827/4144818599] [v2:127.0.0.1:6828/4144818599,v1:127.0.0.1:6829/4144818599] exists,up f9103c11-d0fd-419e-8610-722f7db02144
pg_upmap_items 4.b [3,2]
pg_upmap_items 4.14 [3,2]
pg_upmap_items 4.18 [3,0]
pg_upmap_items 4.30 [3,2]
pg_upmap_items 4.45 [3,2]
pg_upmap_items 4.63 [3,0]

# All pg_upmap_primary mappings (valid and invalid) are now cleared.
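The stale entries shown in the session above can also be spotted mechanically: any `pg_upmap_primary` line in the `ceph osd dump` output whose pool id does not appear in `osd lspools` refers to a deleted pool. A minimal sketch, using sample lines copied from the dump above (the parsing approach is illustrative, not an official tool):

```python
# Identify pg_upmap_primary entries that reference deleted pools,
# given lines in the format printed by `ceph osd dump`.
dump_lines = """\
pg_upmap_primary 4.4 2
pg_upmap_primary 5.1 2
pg_upmap_primary 5.2 2
pg_upmap_primary 5.4 2
"""
existing_pools = {1, 2, 3, 4}  # per `osd lspools`; pool 5 "foo" was deleted

stale = []
for line in dump_lines.splitlines():
    tag, pgid, _primary_osd = line.split()
    # The pool id is the (decimal) part of the pgid before the dot.
    pool_id = int(pgid.split(".", 1)[0])
    if pool_id not in existing_pools:
        stale.append(pgid)

print(stale)  # → ['5.1', '5.2', '5.4'], the mappings for deleted pool 5
```

With the pre-fix CLI, every pgid in that `stale` list would be rejected by `rm-pg-upmap-primary`, which is what motivates the `rm-pg-upmap-primary-all` fail-safe.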
Updated by Laura Flores about 1 year ago
- Status changed from In Progress to Fix Under Review
- Pull request ID changed from 59331 to 62190
Updated by Laura Flores about 1 year ago
- Status changed from Fix Under Review to Pending Backport
Updated by Upkeep Bot about 1 year ago
- Copied to Backport #70591: reef: Make rm-pg-upmap-primary able to remove mappings by force added
Updated by Upkeep Bot about 1 year ago
- Copied to Backport #70592: squid: Make rm-pg-upmap-primary able to remove mappings by force added
Updated by Upkeep Bot about 1 year ago
- Tags (freeform) set to backport_processed
Updated by Laura Flores about 1 year ago
- Related to deleted (Bug #69760: Monitors crash largely due to the structure of pg-upmap-primary)
Updated by Laura Flores about 1 year ago
- Has duplicate Bug #69760: Monitors crash largely due to the structure of pg-upmap-primary added
Updated by Laura Flores about 1 year ago
Users affected by [1] (a bug in which some pg_upmap_primary mappings cannot be removed after pool deletion) may use the new command, `ceph osd rm-pg-upmap-primary-all`, to remove all pg-upmap-primary mappings from the osdmap.
Note that both valid and invalid pg-upmap-primary mappings will be removed. This is acceptable since no data movement is involved, and it is algorithmically better to start with fresh mappings. After running the command, the user may rerun the read balancer manually on Reef or Squid [2], or let the balancing happen automatically via the mgr module on Squid [3].
[1] https://tracker.ceph.com/issues/66867
[2] https://docs.ceph.com/en/reef/rados/operations/read-balancer/#offline-optimization
[3] https://docs.ceph.com/en/squid/rados/operations/read-balancer/#online-optimization
Updated by Laura Flores 12 months ago
- Status changed from Pending Backport to Resolved
Updated by Upkeep Bot 8 months ago
- Merge Commit set to 8d229a9ebab1f9c32eb248f2fba8964fc64676c1
- Fixed In set to v20.0.0-557-g8d229a9ebab
- Upkeep Timestamp set to 2025-07-10T17:40:52+00:00
Updated by Upkeep Bot 8 months ago
- Fixed In changed from v20.0.0-557-g8d229a9ebab to v20.0.0-557-g8d229a9eba
- Upkeep Timestamp changed from 2025-07-10T17:40:52+00:00 to 2025-07-14T22:41:52+00:00
Updated by Upkeep Bot 5 months ago
- Released In set to v20.2.0~833
- Upkeep Timestamp changed from 2025-07-14T22:41:52+00:00 to 2025-11-01T01:32:43+00:00