Bug #67179
Make rm-pg-upmap-primary able to remove mappings by force
Status: Closed
Description
Even though it should not be possible for invalid pg-upmap-primary mappings to be in the osdmap, users may need to remove mappings for pgs that no longer exist due to the bug tracked in https://tracker.ceph.com/issues/66867. This can be achieved by adding a "--force" flag to the existing CLI command.
Updated by Laura Flores over 1 year ago
- Tracker changed from Enhancement to Bug
- Regression set to No
- Severity set to 3 - minor
Updated by Laura Flores over 1 year ago
- Related to Bug #66867: pg_upmap_primary items are retained in OSD map for a pool which is already deleted added
Updated by Radoslaw Zarzynski over 1 year ago
Do we have a similar thing for upmap-items & co.?
Updated by Reily Siegel over 1 year ago
I've encountered this issue on my cluster, version 18.2.1. As requested in Slack, here is my OSD map.
Updated by Josh Salomon over 1 year ago
Technically this is a very simple fix: the rm-pg-upmap-* commands should not check that the pg exists during command parsing, since they already check later whether the pg exists in the pg-upmap maps. The problem should not happen in the first place, since the pg-upmap-* records for a pool are removed along with the pool, but I wouldn't count on that always holding under race conditions, so fixing it makes sense. BTW, the same logic (removing the pg-upmap entries) should be applied when changing the number of pgs per pool; see tracker issue #67265.
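The point about the parse-time check can be sketched with a toy model (hypothetical names and data structures, not the actual Ceph monitor code): if the CLI layer rejects any pgid whose pool no longer exists, a stale pg_upmap_primary entry can never even be named by the remove command, even though the later lookup in the upmap table would be sufficient on its own.

```python
# Toy model of the parse-time vs. apply-time existence check
# (hypothetical names, not the real monitor code).
pools = {4: "cephfs.a.data"}             # pool 5 "foo" was deleted
pg_upmap_primary = {"4.4": 2, "5.4": 2}  # stale entry left behind for pool 5

def rm_pg_upmap_primary(pgid, check_pool_exists=True):
    """Remove one mapping. With the parse-time pool check enabled,
    stale entries for deleted pools can never be removed."""
    pool_id = int(pgid.split(".", 1)[0])
    if check_pool_exists and pool_id not in pools:
        return "Error ENOENT: pgid '%s' does not exist" % pgid
    # The later check -- does the mapping itself exist? -- is sufficient.
    if pgid not in pg_upmap_primary:
        return "Error ENOENT: no pg_upmap_primary mapping for %s" % pgid
    del pg_upmap_primary[pgid]
    return "clear %s pg_upmap_primary mapping" % pgid

print(rm_pg_upmap_primary("4.4"))  # valid mapping: removed
print(rm_pg_upmap_primary("5.4"))  # stale mapping: blocked at "parse" time
# Skipping the parse-time check (the gist of the fix) makes it removable:
print(rm_pg_upmap_primary("5.4", check_pool_exists=False))
```

In the toy model, dropping the early pool check leaves the apply-time lookup as the only gatekeeper, which is exactly what makes stale entries reachable again.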
Updated by Josh Salomon over 1 year ago
- Related to Bug #67265: Make sure *pgupmap* entries are removed when changing number of pgs per pool added
Updated by Laura Flores over 1 year ago
- Status changed from New to In Progress
- Pull request ID set to 59331
Updated by Laura Flores about 1 year ago
- Related to Bug #69760: Monitors crash largely due to the structure of pg-upmap-primary added
Updated by Laura Flores about 1 year ago · Edited
Manual testing of the fix:
# Existing pools
$ ./bin/ceph osd lspools
1 rbd
2 .mgr
3 cephfs.a.meta
4 cephfs.a.data
5 foo

# Applied pg-upmap-primary mappings for pools 4 and 5:
$ ./bin/ceph osd dump
epoch 63
fsid 853a8bf4-c46f-459d-8657-bd6f7371d106
created 2025-03-09T22:56:57.060547+0000
modified 2025-03-09T22:59:15.939769+0000
flags sortbitwise,recovery_deletes,purged_snapdirs,pglog_hardlimit
crush_version 8
full_ratio 0.99
backfillfull_ratio 0.99
nearfull_ratio 0.99
require_min_compat_client reef
min_compat_client reef
require_osd_release tentacle
stretch_mode_enabled false
pool 1 'rbd' replicated size 3 min_size 1 crush_rule 0 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change 31 lfor 0/0/27 flags hashpspool stripe_width 0 application rbd read_balance_score 1.63
pool 2 '.mgr' replicated size 3 min_size 1 crush_rule 0 object_hash rjenkins pg_num 1 pgp_num 1 autoscale_mode on last_change 16 flags hashpspool stripe_width 0 pg_num_max 32 pg_num_min 1 application mgr read_balance_score 4.00
pool 3 'cephfs.a.meta' replicated size 3 min_size 1 crush_rule 0 object_hash rjenkins pg_num 16 pgp_num 16 autoscale_mode on last_change 62 lfor 0/0/27 flags hashpspool stripe_width 0 pg_autoscale_bias 4 pg_num_min 16 recovery_priority 5 application cephfs read_balance_score 1.75
pool 4 'cephfs.a.data' replicated size 3 min_size 1 crush_rule 0 object_hash rjenkins pg_num 128 pgp_num 128 autoscale_mode on last_change 31 lfor 0/0/29 flags hashpspool,bulk stripe_width 0 application cephfs read_balance_score 1.16
pool 5 'foo' replicated size 3 min_size 1 crush_rule 0 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change 31 lfor 0/0/29 flags hashpspool stripe_width 0 read_balance_score 1.50
max_osd 4
osd.0 up in weight 1 up_from 8 up_thru 62 down_at 0 last_clean_interval [0,0) [v2:127.0.0.1:6802/5398261,v1:127.0.0.1:6803/5398261] [v2:127.0.0.1:6804/5398261,v1:127.0.0.1:6805/5398261] exists,up 20a2f865-80a3-4f4e-9b7d-02f4c01434ac
osd.1 up in weight 1 up_from 10 up_thru 31 down_at 0 last_clean_interval [0,0) [v2:127.0.0.1:6810/2721098807,v1:127.0.0.1:6811/2721098807] [v2:127.0.0.1:6812/2721098807,v1:127.0.0.1:6813/2721098807] exists,up b6d51662-6ecc-4598-a7f4-71a4f695fdb9
osd.2 up in weight 1 up_from 12 up_thru 60 down_at 0 last_clean_interval [0,0) [v2:127.0.0.1:6818/238372648,v1:127.0.0.1:6819/238372648] [v2:127.0.0.1:6820/238372648,v1:127.0.0.1:6821/238372648] exists,up d0919a5d-d439-40b2-b3fb-9251da6fac44
osd.3 up in weight 1 up_from 13 up_thru 58 down_at 0 last_clean_interval [0,0) [v2:127.0.0.1:6826/4144818599,v1:127.0.0.1:6827/4144818599] [v2:127.0.0.1:6828/4144818599,v1:127.0.0.1:6829/4144818599] exists,up f9103c11-d0fd-419e-8610-722f7db02144
pg_upmap_primary 4.4 2
pg_upmap_primary 5.1 2
pg_upmap_primary 5.2 2
pg_upmap_primary 5.4 2

# Deleted pool "foo"
$ ./bin/ceph osd pool rm foo foo --yes-i-really-really-mean-it
pool 'foo' removed
$ ./bin/ceph osd lspools
1 rbd
2 .mgr
3 cephfs.a.meta
4 cephfs.a.data

# Mappings still exist for deleted pool "foo" (5) due to https://tracker.ceph.com/issues/66867.
# This simulates a user having a mix of valid mappings (for pool 4, which still exists)
# and invalid mappings (for pool 5, which no longer exists):
$ ./bin/ceph osd dump
*** DEVELOPER MODE: setting PATH, PYTHONPATH and LD_LIBRARY_PATH ***
2025-03-09T23:03:44.823+0000 7fcfcdbd0640 -1 WARNING: all dangerous and experimental features are enabled.
2025-03-09T23:03:44.832+0000 7fcfcdbd0640 -1 WARNING: all dangerous and experimental features are enabled.
epoch 66
fsid 853a8bf4-c46f-459d-8657-bd6f7371d106
created 2025-03-09T22:56:57.060547+0000
modified 2025-03-09T23:00:37.215183+0000
flags sortbitwise,recovery_deletes,purged_snapdirs,pglog_hardlimit
crush_version 8
full_ratio 0.99
backfillfull_ratio 0.99
nearfull_ratio 0.99
require_min_compat_client reef
min_compat_client reef
require_osd_release tentacle
stretch_mode_enabled false
pool 1 'rbd' replicated size 3 min_size 1 crush_rule 0 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change 31 lfor 0/0/27 flags hashpspool stripe_width 0 application rbd read_balance_score 1.63
pool 2 '.mgr' replicated size 3 min_size 1 crush_rule 0 object_hash rjenkins pg_num 1 pgp_num 1 autoscale_mode on last_change 16 flags hashpspool stripe_width 0 pg_num_max 32 pg_num_min 1 application mgr read_balance_score 4.00
pool 3 'cephfs.a.meta' replicated size 3 min_size 1 crush_rule 0 object_hash rjenkins pg_num 16 pgp_num 16 autoscale_mode on last_change 62 lfor 0/0/27 flags hashpspool stripe_width 0 pg_autoscale_bias 4 pg_num_min 16 recovery_priority 5 application cephfs read_balance_score 1.75
pool 4 'cephfs.a.data' replicated size 3 min_size 1 crush_rule 0 object_hash rjenkins pg_num 128 pgp_num 128 autoscale_mode on last_change 31 lfor 0/0/29 flags hashpspool,bulk stripe_width 0 application cephfs read_balance_score 1.16
max_osd 4
osd.0 up in weight 1 up_from 8 up_thru 64 down_at 0 last_clean_interval [0,0) [v2:127.0.0.1:6802/5398261,v1:127.0.0.1:6803/5398261] [v2:127.0.0.1:6804/5398261,v1:127.0.0.1:6805/5398261] exists,up 20a2f865-80a3-4f4e-9b7d-02f4c01434ac
osd.1 up in weight 1 up_from 10 up_thru 31 down_at 0 last_clean_interval [0,0) [v2:127.0.0.1:6810/2721098807,v1:127.0.0.1:6811/2721098807] [v2:127.0.0.1:6812/2721098807,v1:127.0.0.1:6813/2721098807] exists,up b6d51662-6ecc-4598-a7f4-71a4f695fdb9
osd.2 up in weight 1 up_from 12 up_thru 64 down_at 0 last_clean_interval [0,0) [v2:127.0.0.1:6818/238372648,v1:127.0.0.1:6819/238372648] [v2:127.0.0.1:6820/238372648,v1:127.0.0.1:6821/238372648] exists,up d0919a5d-d439-40b2-b3fb-9251da6fac44
osd.3 up in weight 1 up_from 13 up_thru 58 down_at 0 last_clean_interval [0,0) [v2:127.0.0.1:6826/4144818599,v1:127.0.0.1:6827/4144818599] [v2:127.0.0.1:6828/4144818599,v1:127.0.0.1:6829/4144818599] exists,up f9103c11-d0fd-419e-8610-722f7db02144
pg_upmap_items 4.b [3,2]
pg_upmap_items 4.14 [3,2]
pg_upmap_items 4.18 [3,0]
pg_upmap_items 4.30 [3,2]
pg_upmap_items 4.45 [3,2]
pg_upmap_items 4.63 [3,0]
pg_upmap_primary 4.4 2
pg_upmap_primary 5.1 2
pg_upmap_primary 5.2 2
pg_upmap_primary 5.4 2

# Try removing with the normal command (first valid, then invalid).
# The valid mapping is cleared, but the invalid mapping is impossible to remove:
$ ./bin/ceph osd rm-pg-upmap-primary 4.4
clear 4.4 pg_upmap_primary mapping
$ ./bin/ceph osd rm-pg-upmap-primary 5.4
Error ENOENT: pgid '5.4' does not exist

# Now apply the new rm-pg-upmap-primary-all as a fail-safe to remove all mappings, valid and invalid:
$ ./bin/ceph osd rm-pg-upmap-primary-all
cleared all pg_upmap_primary mappings
$ ./bin/ceph osd dump
epoch 69
fsid 853a8bf4-c46f-459d-8657-bd6f7371d106
created 2025-03-09T22:56:57.060547+0000
modified 2025-03-09T23:07:29.346415+0000
flags sortbitwise,recovery_deletes,purged_snapdirs,pglog_hardlimit
crush_version 8
full_ratio 0.99
backfillfull_ratio 0.99
nearfull_ratio 0.99
require_min_compat_client reef
min_compat_client luminous
require_osd_release tentacle
stretch_mode_enabled false
pool 1 'rbd' replicated size 3 min_size 1 crush_rule 0 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change 31 lfor 0/0/27 flags hashpspool stripe_width 0 application rbd read_balance_score 1.63
pool 2 '.mgr' replicated size 3 min_size 1 crush_rule 0 object_hash rjenkins pg_num 1 pgp_num 1 autoscale_mode on last_change 16 flags hashpspool stripe_width 0 pg_num_max 32 pg_num_min 1 application mgr read_balance_score 4.00
pool 3 'cephfs.a.meta' replicated size 3 min_size 1 crush_rule 0 object_hash rjenkins pg_num 16 pgp_num 16 autoscale_mode on last_change 62 lfor 0/0/27 flags hashpspool stripe_width 0 pg_autoscale_bias 4 pg_num_min 16 recovery_priority 5 application cephfs read_balance_score 1.75
pool 4 'cephfs.a.data' replicated size 3 min_size 1 crush_rule 0 object_hash rjenkins pg_num 128 pgp_num 128 autoscale_mode on last_change 31 lfor 0/0/29 flags hashpspool,bulk stripe_width 0 application cephfs read_balance_score 1.13
max_osd 4
osd.0 up in weight 1 up_from 8 up_thru 64 down_at 0 last_clean_interval [0,0) [v2:127.0.0.1:6802/5398261,v1:127.0.0.1:6803/5398261] [v2:127.0.0.1:6804/5398261,v1:127.0.0.1:6805/5398261] exists,up 20a2f865-80a3-4f4e-9b7d-02f4c01434ac
osd.1 up in weight 1 up_from 10 up_thru 67 down_at 0 last_clean_interval [0,0) [v2:127.0.0.1:6810/2721098807,v1:127.0.0.1:6811/2721098807] [v2:127.0.0.1:6812/2721098807,v1:127.0.0.1:6813/2721098807] exists,up b6d51662-6ecc-4598-a7f4-71a4f695fdb9
osd.2 up in weight 1 up_from 12 up_thru 64 down_at 0 last_clean_interval [0,0) [v2:127.0.0.1:6818/238372648,v1:127.0.0.1:6819/238372648] [v2:127.0.0.1:6820/238372648,v1:127.0.0.1:6821/238372648] exists,up d0919a5d-d439-40b2-b3fb-9251da6fac44
osd.3 up in weight 1 up_from 13 up_thru 58 down_at 0 last_clean_interval [0,0) [v2:127.0.0.1:6826/4144818599,v1:127.0.0.1:6827/4144818599] [v2:127.0.0.1:6828/4144818599,v1:127.0.0.1:6829/4144818599] exists,up f9103c11-d0fd-419e-8610-722f7db02144
pg_upmap_items 4.b [3,2]
pg_upmap_items 4.14 [3,2]
pg_upmap_items 4.18 [3,0]
pg_upmap_items 4.30 [3,2]
pg_upmap_items 4.45 [3,2]
pg_upmap_items 4.63 [3,0]

# All pg_upmap_primary mappings (valid and invalid) are now cleared.
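The stale entries shown in the session above can also be spotted mechanically: any `pg_upmap_primary` line in the `ceph osd dump` output whose pool id does not appear in `osd lspools` refers to a deleted pool. A minimal sketch, using sample lines copied from the dump above (the parsing approach is illustrative, not an official tool):

```python
# Identify pg_upmap_primary entries that reference deleted pools,
# given lines in the format printed by `ceph osd dump`.
dump_lines = """\
pg_upmap_primary 4.4 2
pg_upmap_primary 5.1 2
pg_upmap_primary 5.2 2
pg_upmap_primary 5.4 2
"""
existing_pools = {1, 2, 3, 4}  # per `osd lspools`; pool 5 "foo" was deleted

stale = []
for line in dump_lines.splitlines():
    tag, pgid, _primary_osd = line.split()
    # The pool id is the (decimal) part of the pgid before the dot.
    pool_id = int(pgid.split(".", 1)[0])
    if pool_id not in existing_pools:
        stale.append(pgid)

print(stale)  # → ['5.1', '5.2', '5.4'], the mappings for deleted pool 5
```

With the pre-fix CLI, every pgid in that `stale` list would be rejected by `rm-pg-upmap-primary`, which is what motivates the `rm-pg-upmap-primary-all` fail-safe.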
Updated by Laura Flores about 1 year ago
- Status changed from In Progress to Fix Under Review
- Pull request ID changed from 59331 to 62190
Updated by Laura Flores about 1 year ago
- Status changed from Fix Under Review to Pending Backport
Updated by Upkeep Bot about 1 year ago
- Copied to Backport #70591: reef: Make rm-pg-upmap-primary able to remove mappings by force added
Updated by Upkeep Bot about 1 year ago
- Copied to Backport #70592: squid: Make rm-pg-upmap-primary able to remove mappings by force added
Updated by Upkeep Bot about 1 year ago
- Tags (freeform) set to backport_processed
Updated by Laura Flores about 1 year ago
- Related to deleted (Bug #69760: Monitors crash largely due to the structure of pg-upmap-primary)
Updated by Laura Flores about 1 year ago
- Has duplicate Bug #69760: Monitors crash largely due to the structure of pg-upmap-primary added
Updated by Laura Flores about 1 year ago
Users affected by [1] (a bug in which some pg_upmap_primary mappings cannot be removed after pool deletion) may use the new command, `ceph osd rm-pg-upmap-primary-all`, to remove all pg-upmap-primary mappings from the osdmap.
Note that both valid and invalid pg-upmap-primary mappings will be removed. This is acceptable since no data movement is involved, and it is algorithmically better to start with fresh mappings. After running the command, the user may rerun the read balancer manually on Reef or Squid [2], or let the balancing happen automatically via the mgr module on Squid [3].
[1] https://tracker.ceph.com/issues/66867
[2] https://docs.ceph.com/en/reef/rados/operations/read-balancer/#offline-optimization
[3] https://docs.ceph.com/en/squid/rados/operations/read-balancer/#online-optimization
Updated by Laura Flores 12 months ago
- Status changed from Pending Backport to Resolved
Updated by Upkeep Bot 8 months ago
- Merge Commit set to 8d229a9ebab1f9c32eb248f2fba8964fc64676c1
- Fixed In set to v20.0.0-557-g8d229a9ebab
- Upkeep Timestamp set to 2025-07-10T17:40:52+00:00
Updated by Upkeep Bot 8 months ago
- Fixed In changed from v20.0.0-557-g8d229a9ebab to v20.0.0-557-g8d229a9eba
- Upkeep Timestamp changed from 2025-07-10T17:40:52+00:00 to 2025-07-14T22:41:52+00:00
Updated by Upkeep Bot 5 months ago
- Released In set to v20.2.0~833
- Upkeep Timestamp changed from 2025-07-14T22:41:52+00:00 to 2025-11-01T01:32:43+00:00