osd/scrub: use separate chunk size configuration for shallow scrubs #44749

Merged: ronen-fr merged 1 commit into ceph:main from ronen-fr:wip-rf-large-chunk on Feb 15, 2023
Conversation

@ronen-fr (Contributor)

Using the existing common default chunk size for scrubs that are
not deep scrubs is wasteful: it incurs a high rate of inter-OSD
messages per chunk, while the actual OSD work per chunk is minimal.

Signed-off-by: Ronen Friedman rfriedma@redhat.com
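
As a back-of-the-envelope illustration of the rationale (editor's sketch; the per-chunk protocol cost is simplified, and the object count is taken from the test data later in this thread):

```python
import math

def chunk_count(objects_in_pg: int, chunk_max: int) -> int:
    """Rough number of scrub chunks needed to cover a PG,
    assuming each chunk reaches the configured maximum size."""
    return math.ceil(objects_in_pg / chunk_max)

objects = 125_000  # roughly one PG's worth in the test pool below

old = chunk_count(objects, 25)   # old shared osd_scrub_chunk_max default
new = chunk_count(objects, 100)  # proposed osd_shallow_scrub_chunk_max
print(old, new, old / new)       # 5000 1250 4.0
```

Since every chunk carries a fixed inter-OSD messaging overhead, quadrupling the shallow-scrub chunk size cuts the number of those exchanges per PG by roughly 4x.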

@jdurgin (Member)

jdurgin commented Jan 24, 2022

Seems like a good idea; I would like to see perf tests to ensure the new defaults for shallow scrub don't impact availability.


github-actions bot commented May 2, 2022

This pull request can no longer be automatically merged: a rebase is needed and changes have to be manually resolved

@ronen-fr ronen-fr force-pushed the wip-rf-large-chunk branch from b31c28a to 3e92896 Compare May 4, 2022 11:21
@djgalloway djgalloway changed the base branch from master to main May 25, 2022 20:04
@github-actions github-actions bot added the stale label Aug 6, 2022
@ronen-fr ronen-fr removed the stale label Aug 7, 2022
@github-actions github-actions bot added the stale label Jan 12, 2023
@ronen-fr ronen-fr removed the stale label Jan 12, 2023
@ronen-fr ronen-fr force-pushed the wip-rf-large-chunk branch 2 times, most recently from 6ece222 to 47c78d0 Compare February 13, 2023 19:57
Using the existing common default chunk size for scrubs that are
not deep scrubs is wasteful: a high ratio of inter-OSD messages
per chunk, while the actual OSD work per chunk is minimal.

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
@mkogan1 (Contributor)

mkogan1 commented Feb 14, 2023

Scrub performance data was collected for various osd_shallow_scrub_chunk_[min/max] settings: [5/25] (the old default) vs. [25/100] (the values in this PR), and beyond that [50/200].
A client write load ran in parallel with the scrub, generated by S3 PUT of objects:
nice numactl -N 1 -m 1 -- ~/go/bin/hsbench ... -z 4K -d -1 -t $(( $(numactl -N 0 -- nproc) / 4 )) -b $(( $(numactl -N 0 -- nproc) * 2 )) -n 1000000 -m p ...
against 3 OSDs in a replica-3 pool with 8 PGs in the RGW data pool.

Please note the LAST_SCRUB_DURATION column

for I in {0..2}; do echo ${I} ; sudo ./bin/ceph daemon osd.${I} config set osd_max_scrubs 1 ; done
for I in {0..2}; do echo ${I} ; sudo ./bin/ceph daemon osd.${I} config set osd_shallow_scrub_chunk_min 5 ; done
for I in {0..2}; do echo ${I} ; sudo ./bin/ceph daemon osd.${I} config set osd_shallow_scrub_chunk_max 25 ; done
for I in {0..2}; do echo ${I} ; sudo ./bin/ceph daemon "osd.${I}" config show | egrep 'osd_shallow_scrub_chunk_min|osd_shallow_scrub_chunk_max|osd_max_scrubs' ; done

sudo ./bin/ceph osd pool scrub default.rgw.buckets.data

watch -cd "sudo ./bin/ceph pg dump pgs | grep -v '0  periodic'"
PG_STAT  OBJECTS  MISSING_ON_PRIMARY  DEGRADED  MISPLACED  UNFOUND  BYTES      OMAP_BYTES*  OMAP_KEYS*  LOG    LOG_DUPS  DISK_LOG  STATE         STATE_STAMP                      VERSION     REPORTED    UP       UP_PRIMARY  ACTING   ACTING_PRIMARY  LAST_SCRUB  SCRUB_STAMP                      LAST_DEEP_SCRUB  DEEP_SCRUB_STAMP                 SNAPTRIMQ_LEN  LAST_SCRUB_DURATION  SCRUB_SCHEDULING           OBJECTS_SCRUBBED  OBJECTS_TRIMMED
6.0       125365                   0         0          0        0  513495040            0           0  30003         0     30003  active+clean  2023-02-14T16:46:55.267651+0000   27'538993  27:1026991  [0,2,1]      0  [0,2,1]               0   27'493413  2023-02-14T16:46:55.267583+0000              0'0  2023-02-14T13:13:04.993756+0000              0                   72  periodic scrub scheduled @ 2023-02-16T03:51:50.451771+0000            125365                0
6.1       124927                   0         0          0        0  511700992            0           0  30008         0     30008  active+clean  2023-02-14T16:50:25.612933+0000   27'537368   27:994455  [1,2,0]      1  [1,2,0]               1   27'503229  2023-02-14T16:50:25.612862+0000              0'0  2023-02-14T13:13:04.993756+0000              0                   75  periodic scrub scheduled @ 2023-02-15T23:00:51.982660+0000            124927                0
6.6       125070                   0         0          0        0  512286720            0           0  30004         0     30004  active+clean  2023-02-14T16:58:46.068086+0000   27'540554  27:1003206  [1,2,0]      1  [1,2,0]               1   27'539805  2023-02-14T16:58:46.068021+0000              0'0  2023-02-14T13:13:04.993756+0000              0                  103  periodic scrub scheduled @ 2023-02-15T17:21:04.841086+0000            125070                0
6.7       125219                   0         0          0        0  512897024            0           0  30007         0     30007  active+clean  2023-02-14T16:53:26.453498+0000   27'538497  27:1013508  [0,2,1]      0  [0,2,1]               0   27'516880  2023-02-14T16:53:26.453427+0000              0'0  2023-02-14T13:13:04.993756+0000              0                   83  periodic scrub scheduled @ 2023-02-16T00:40:36.360688+0000            125219                0
6.5       124938                   0         0          0        0  511746048            0           0  30009         0     30009  active+clean  2023-02-14T16:52:02.514919+0000   27'538279  27:1010699  [0,1,2]      0  [0,1,2]               0   27'509720  2023-02-14T16:52:02.514849+0000              0'0  2023-02-14T13:13:04.993756+0000              0                   97  periodic scrub scheduled @ 2023-02-16T02:16:28.493837+0000            124938                0
6.4       124438                   0         0          0        0  509698048            0           0  30007         0     30007  active+clean  2023-02-14T16:56:52.242870+0000   27'537137   27:994171  [1,0,2]      1  [1,0,2]               1   27'527794  2023-02-14T16:56:52.242793+0000              0'0  2023-02-14T13:13:04.993756+0000              0                  202  periodic scrub scheduled @ 2023-02-16T00:20:05.662431+0000            124438                0
6.3       124877                   0         0          0        0  511496192            0           0  30007         0     30007  active+clean  2023-02-14T16:49:10.313375+0000   27'536767  27:1007671  [0,2,1]      0  [0,2,1]               0   27'498762  2023-02-14T16:49:10.313298+0000              0'0  2023-02-14T13:13:04.993756+0000              0                   61  periodic scrub scheduled @ 2023-02-16T00:11:47.500867+0000            124877                0
6.2       125166                   0         0          0        0  512679936            0           0  30004         0     30004  active+clean  2023-02-14T16:48:07.342298+0000   27'539134   27:994854  [1,0,2]      1  [1,0,2]               1   27'497952  2023-02-14T16:48:07.342206+0000              0'0  20
dumped pgs


for I in {0..2}; do echo ${I} ; sudo ./bin/ceph daemon osd.${I} config set osd_max_scrubs 1 ; done
for I in {0..2}; do echo ${I} ; sudo ./bin/ceph daemon osd.${I} config set osd_shallow_scrub_chunk_min 50 ; done
for I in {0..2}; do echo ${I} ; sudo ./bin/ceph daemon osd.${I} config set osd_shallow_scrub_chunk_max 100 ; done
for I in {0..2}; do echo ${I} ; sudo ./bin/ceph daemon "osd.${I}" config show | egrep 'osd_shallow_scrub_chunk_min|osd_shallow_scrub_chunk_max|osd_max_scrubs' ; done

sudo ./bin/ceph osd pool scrub default.rgw.buckets.data

watch -cd "sudo ./bin/ceph pg dump pgs | grep -v '0  periodic'"
PG_STAT  OBJECTS  MISSING_ON_PRIMARY  DEGRADED  MISPLACED  UNFOUND  BYTES      OMAP_BYTES*  OMAP_KEYS*  LOG    LOG_DUPS  DISK_LOG  STATE         STATE_STAMP                      VERSION     REPORTED    UP       UP_PRIMARY  ACTING   ACTING_PRIMARY  LAST_SCRUB  SCRUB_STAMP                      LAST_DEEP_SCRUB  DEEP_SCRUB_STAMP                 SNAPTRIMQ_LEN  LAST_SCRUB_DURATION  SCRUB_SCHEDULING           OBJECTS_SCRUBBED  OBJECTS_TRIMMED
6.0       125365                   0         0          0        0  513495040            0           0  30005         0     30005  active+clean  2023-02-14T16:34:40.144857+0000   27'462785   27:863246  [0,2,1]      0  [0,2,1]               0   27'455430  2023-02-14T16:34:40.144836+0000              0'0  2023-02-14T13:13:04.993756+0000              0                   36  periodic scrub scheduled @ 2023-02-16T00:01:44.884484+0000            125365                0
6.1       124927                   0         0          0        0  511700992            0           0  30007         0     30007  active+clean  2023-02-14T16:33:19.758199+0000   27'461367   27:831154  [1,2,0]      1  [1,2,0]               1   27'445687  2023-02-14T16:33:19.758176+0000              0'0  2023-02-14T13:13:04.993756+0000              0                   23  periodic scrub scheduled @ 2023-02-15T22:59:26.108009+0000            124927                0
6.6       125070                   0         0          0        0  512286720            0           0  30009         0     30009  active+clean  2023-02-14T16:34:03.734047+0000   27'463709   27:838202  [1,2,0]      1  [1,2,0]               1   27'452546  2023-02-14T16:34:03.734019+0000              0'0  2023-02-14T13:13:04.993756+0000              0                   25  periodic scrub scheduled @ 2023-02-15T23:35:06.345727+0000            125070                0
6.7       125219                   0         0          0        0  512897024            0           0  30003         0     30003  active+clean  2023-02-14T16:36:30.051385+0000   27'462373   27:849942  [0,2,1]      0  [0,2,1]               0   27'461153  2023-02-14T16:36:30.051361+0000              0'0  2023-02-14T13:13:04.993756+0000              0                   39  periodic scrub scheduled @ 2023-02-15T22:28:14.760414+0000            125219                0
6.5       124938                   0         0          0        0  511746048            0           0  30010         0     30010  active+clean  2023-02-14T16:35:50.424144+0000   27'461890   27:846621  [0,1,2]      0  [0,1,2]               0   27'458594  2023-02-14T16:35:50.424108+0000              0'0  2023-02-14T13:13:04.993756+0000              0                   37  periodic scrub scheduled @ 2023-02-15T21:51:49.890139+0000            124938                0
6.4       124438                   0         0          0        0  509698048            0           0  30005         0     30005  active+clean  2023-02-14T16:33:38.960958+0000   27'460775   27:830188  [1,0,2]      1  [1,0,2]               1   27'446657  2023-02-14T16:33:38.960932+0000              0'0  2023-02-14T13:13:04.993756+0000              0                   19  periodic scrub scheduled @ 2023-02-16T04:02:35.122801+0000            124438                0
6.3       124877                   0         0          0        0  511496192            0           0  30010         0     30010  active+clean  2023-02-14T16:35:11.979721+0000   27'460780   27:844413  [0,2,1]      0  [0,2,1]               0   27'455406  2023-02-14T16:35:11.979695+0000              0'0  2023-02-14T13:13:04.993756+0000              0                   28  periodic scrub scheduled @ 2023-02-16T01:18:22.959876+0000            124877                0
6.2       125166                   0         0          0        0  512679936            0           0  30006         0     30006  active+clean  2023-02-14T16:32:55.096158+0000   27'462876   27:831022  [1,0,2]      1  [1,0,2]               1   27'444630  2023-02-14T16:32:55.096134+0000              0'0  20
dumped pgs


for I in {0..2}; do echo ${I} ; sudo ./bin/ceph daemon osd.${I} config set osd_max_scrubs 2 ; done
for I in {0..2}; do echo ${I} ; sudo ./bin/ceph daemon osd.${I} config set osd_shallow_scrub_chunk_min 50 ; done
for I in {0..2}; do echo ${I} ; sudo ./bin/ceph daemon osd.${I} config set osd_shallow_scrub_chunk_max 200 ; done
for I in {0..2}; do echo ${I} ; sudo ./bin/ceph daemon "osd.${I}" config show | egrep 'osd_shallow_scrub_chunk_min|osd_shallow_scrub_chunk_max|osd_max_scrubs' ; done

sudo ./bin/ceph osd pool scrub default.rgw.buckets.data

watch -cd "sudo ./bin/ceph pg dump pgs | grep -v '0  periodic'"
PG_STAT  OBJECTS  MISSING_ON_PRIMARY  DEGRADED  MISPLACED  UNFOUND  BYTES      OMAP_BYTES*  OMAP_KEYS*  LOG    LOG_DUPS  DISK_LOG  STATE         STATE_STAMP                      VERSION     REPORTED    UP       UP_PRIMARY  ACTING   ACTING_PRIMARY  LAST_SCRUB  SCRUB_STAMP                      LAST_DEEP_SCRUB  DEEP_SCRUB_STAMP                 SNAPTRIMQ_LEN  LAST_SCRUB_DURATION  SCRUB_SCHEDULING           OBJECTS_SCRUBBED  OBJECTS_TRIMMED
6.0       125365                   0         0          0        0  513495040            0           0  30003         0     30003  active+clean  2023-02-14T16:39:55.061761+0000   27'476613   27:892174  [0,2,1]      0  [0,2,1]               0   27'468311  2023-02-14T16:39:55.061732+0000              0'0  2023-02-14T13:13:04.993756+0000              0                   32  periodic scrub scheduled @ 2023-02-15T22:35:14.943024+0000            125365                0
6.1       124927                   0         0          0        0  511700992            0           0  30005         0     30005  active+clean  2023-02-14T16:40:40.438066+0000   27'475275   27:860248  [1,2,0]      1  [1,2,0]               1   27'469087  2023-02-14T16:40:40.438050+0000              0'0  2023-02-14T13:13:04.993756+0000              0                   45  periodic scrub scheduled @ 2023-02-15T19:21:57.051531+0000            124927                0
6.6       125070                   0         0          0        0  512286720            0           0  30008         0     30008  active+clean  2023-02-14T16:41:31.324353+0000   27'477728   27:867514  [1,2,0]      1  [1,2,0]               1   27'474980  2023-02-14T16:41:31.324324+0000              0'0  2023-02-14T13:13:04.993756+0000              0                   39  periodic scrub scheduled @ 2023-02-15T21:01:37.541493+0000            125070                0
6.7       125219                   0         0          0        0  512897024            0           0  30009         0     30009  active+clean  2023-02-14T16:41:24.675289+0000   27'476249   27:878966  [0,2,1]      0  [0,2,1]               0   27'472970  2023-02-14T16:41:24.675265+0000              0'0  2023-02-14T13:13:04.993756+0000              0                   22  periodic scrub scheduled @ 2023-02-16T02:27:38.453356+0000            125219                0
6.5       124938                   0         0          0        0  511746048            0           0  30010         0     30010  active+clean  2023-02-14T16:41:02.556168+0000   27'475860   27:875836  [0,1,2]      0  [0,1,2]               0   27'471055  2023-02-14T16:41:02.556140+0000              0'0  2023-02-14T13:13:04.993756+0000              0                   21  periodic scrub scheduled @ 2023-02-16T01:45:56.507568+0000            124938                0
6.4       124438                   0         0          0        0  509698048            0           0  30004         0     30004  active+clean  2023-02-14T16:40:31.790887+0000   27'474764   27:859432  [1,0,2]      1  [1,0,2]               1   27'468105  2023-02-14T16:40:31.790861+0000              0'0  2023-02-14T13:13:04.993756+0000              0                   32  periodic scrub scheduled @ 2023-02-15T17:11:17.809012+0000            124438                0
6.3       124877                   0         0          0        0  511496192            0           0  30007         0     30007  active+clean  2023-02-14T16:40:52.861420+0000   27'474517   27:873153  [0,2,1]      0  [0,2,1]               0   27'469302  2023-02-14T16:40:52.861395+0000              0'0  2023-02-14T13:13:04.993756+0000              0                   21  periodic scrub scheduled @ 2023-02-16T02:10:42.991814+0000            124877                0
6.2       125166                   0         0          0        0  512679936            0           0  30009         0     30009  active+clean  2023-02-14T16:39:58.157495+0000   27'476999   27:860540  [1,0,2]      1  [1,0,2]               1   27'468836  2023-02-14T16:39:58.157469+0000              0'0  20
dumped pgs
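
As a quick summary of the three runs above, averaging the LAST_SCRUB_DURATION columns (editor's sketch; values are read off the dumps, single runs only, the truncated 6.2 rows excluded):

```python
# LAST_SCRUB_DURATION values (seconds) from the three `pg dump`
# snippets above, in PG order (6.0, 6.1, 6.6, 6.7, 6.5, 6.4, 6.3)
old_default_5_25 = [72, 75, 103, 83, 97, 202, 61]  # min/max = 5/25
chunk_50_100     = [36, 23, 25, 39, 37, 19, 28]    # min/max = 50/100
chunk_50_200_x2  = [32, 45, 39, 22, 21, 32, 21]    # min/max = 50/200, osd_max_scrubs = 2

def mean(xs):
    return sum(xs) / len(xs)

for name, xs in [("5/25", old_default_5_25), ("50/100", chunk_50_100),
                 ("50/200, max_scrubs=2", chunk_50_200_x2)]:
    print(f"{name}: mean scrub duration {mean(xs):.1f}s")
# the 5/25 run averages ~99s; both larger-chunk runs average ~30s
```

So under this client load, the larger shallow-scrub chunks cut the mean per-PG scrub duration by roughly 3x, while raising chunk_max further (200) gave no additional benefit.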

@ronen-fr (Contributor, Author)

ronen-fr commented Feb 14, 2023

Note that Mark has also tested how increasing the shallow-scrub chunk size affects scrub duration on an unloaded cluster (without client load): scrubs were faster by about 2.5x.

There was no adverse effect on client IOPS; it seems the new scheduler did a great job of prioritizing client work.

On a loaded system, using the suggested default values for the shallow chunks (100/50), shallow scrubs took half the time they took when using the existing shared chunk size.
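
For anyone wanting to apply these values cluster-wide, a minimal ceph.conf sketch (the option names are the ones this PR introduces; 50/100 reflects the values discussed above, not necessarily the final shipped defaults):

```ini
[osd]
# larger chunks for shallow scrubs only; deep scrubs keep using
# the existing osd_scrub_chunk_min / osd_scrub_chunk_max
osd_shallow_scrub_chunk_min = 50
osd_shallow_scrub_chunk_max = 100
```

The same values can be applied to running OSDs with `ceph config set osd <option> <value>`, as in the per-daemon commands shown earlier in the thread.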

@ronen-fr ronen-fr changed the title proposed: osd/scrub: create a separate chunk size conf for shallow scrubs osd/scrub: use separate chunk size configuration for shallow scrubs Feb 14, 2023
@ronen-fr ronen-fr marked this pull request as ready for review February 14, 2023 17:57
@ronen-fr ronen-fr requested review from a team as code owners February 14, 2023 17:57
@ronen-fr ronen-fr removed the request for review from a team February 14, 2023 17:57
@ceph ceph deleted a comment from github-actions bot Feb 14, 2023
@ceph ceph deleted a comment from github-actions bot Feb 14, 2023
@ceph ceph deleted a comment from github-actions bot Feb 14, 2023
@athanatos (Contributor) left a comment

Nice!

@ronen-fr (Contributor, Author)

Merging based on my Teuthology runs. All failures were verified to be a result of the 'publish-stats' bug.

@ronen-fr ronen-fr merged commit 4cc70b7 into ceph:main Feb 15, 2023