osd/scrub: use separate chunk size configuration for shallow scrubs #44749

Merged: ronen-fr merged 1 commit into ceph:main from ronen-fr:wip-rf-large-chunk on Feb 15, 2023
Conversation

@ronen-fr (Contributor)

Using the existing common default chunk size for scrubs that are
not deep scrubs is wasteful: it incurs a high rate of inter-OSD
messages per chunk, while the actual OSD work per chunk is minimal.

Signed-off-by: Ronen Friedman rfriedma@redhat.com
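
As a back-of-the-envelope illustration of the rationale (editor's sketch; the per-chunk protocol cost is simplified, and the object count is taken from the test data later in this thread):

```python
import math

def chunk_count(objects_in_pg: int, chunk_max: int) -> int:
    """Rough number of scrub chunks needed to cover a PG,
    assuming each chunk reaches the configured maximum size."""
    return math.ceil(objects_in_pg / chunk_max)

objects = 125_000  # roughly one PG's worth in the test pool below

old = chunk_count(objects, 25)   # old shared osd_scrub_chunk_max default
new = chunk_count(objects, 100)  # proposed osd_shallow_scrub_chunk_max
print(old, new, old / new)       # 5000 1250 4.0
```

Since every chunk carries a fixed inter-OSD messaging overhead, quadrupling the shallow-scrub chunk size cuts the number of those exchanges per PG by roughly 4x.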

@jdurgin (Member)

jdurgin commented Jan 24, 2022

Seems like a good idea; I would like to see perf tests to ensure the new defaults for shallow scrub don't impact availability.


github-actions bot commented May 2, 2022

This pull request can no longer be automatically merged: a rebase is needed and changes have to be manually resolved

@ronen-fr ronen-fr force-pushed the wip-rf-large-chunk branch from b31c28a to 3e92896 Compare May 4, 2022 11:21
@djgalloway djgalloway changed the base branch from master to main May 25, 2022 20:04
@github-actions github-actions bot added the stale label Aug 6, 2022
@ronen-fr ronen-fr removed the stale label Aug 7, 2022
@github-actions github-actions bot added the stale label Jan 12, 2023
@ronen-fr ronen-fr removed the stale label Jan 12, 2023
@ronen-fr ronen-fr force-pushed the wip-rf-large-chunk branch 2 times, most recently from 6ece222 to 47c78d0 Compare February 13, 2023 19:57
Using the existing common default chunk size for scrubs that are
not deep scrubs is wasteful: a high ratio of inter-OSD messages
per chunk, while the actual OSD work per chunk is minimal.

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
@mkogan1 (Contributor)

mkogan1 commented Feb 14, 2023

Scrub performance data was collected for various osd_shallow_scrub_chunk_[min/max] settings: [5/25] (the old default) vs. [25/100] (the values in this PR), and beyond that [50/200].
A client write load ran in parallel with the scrub, generated by S3 PUT of objects:
nice numactl -N 1 -m 1 -- ~/go/bin/hsbench ... -z 4K -d -1 -t $(( $(numactl -N 0 -- nproc) / 4 )) -b $(( $(numactl -N 0 -- nproc) * 2 )) -n 1000000 -m p ...
against 3 OSDs in a replica-3 pool with 8 PGs in the RGW data pool.

Please note the LAST_SCRUB_DURATION column

for I in {0..2}; do echo ${I} ; sudo ./bin/ceph daemon osd.${I} config set osd_max_scrubs 1 ; done
for I in {0..2}; do echo ${I} ; sudo ./bin/ceph daemon osd.${I} config set osd_shallow_scrub_chunk_min 5 ; done
for I in {0..2}; do echo ${I} ; sudo ./bin/ceph daemon osd.${I} config set osd_shallow_scrub_chunk_max 25 ; done
for I in {0..2}; do echo ${I} ; sudo ./bin/ceph daemon "osd.${I}" config show | egrep 'osd_shallow_scrub_chunk_min|osd_shallow_scrub_chunk_max|osd_max_scrubs' ; done

sudo ./bin/ceph osd pool scrub default.rgw.buckets.data

watch -cd "sudo ./bin/ceph pg dump pgs | grep -v '0  periodic'"
PG_STAT  OBJECTS  MISSING_ON_PRIMARY  DEGRADED  MISPLACED  UNFOUND  BYTES      OMAP_BYTES*  OMAP_KEYS*  LOG    LOG_DUPS  DISK_LOG  STATE         STATE_STAMP                      VERSION     REPORTED    UP       UP_PRIMARY  ACTING   ACTING_PRIMARY  LAST_SCRUB  SCRUB_STAMP                      LAST_DEEP_SCRUB  DEEP_SCRUB_STAMP                 SNAPTRIMQ_LEN  LAST_SCRUB_DURATION  SCRUB_SCHEDULING           OBJECTS_SCRUBBED  OBJECTS_TRIMMED
6.0       125365                   0         0          0        0  513495040            0           0  30003         0     30003  active+clean  2023-02-14T16:46:55.267651+0000   27'538993  27:1026991  [0,2,1]      0  [0,2,1]               0   27'493413  2023-02-14T16:46:55.267583+0000              0'0  2023-02-14T13:13:04.993756+0000              0                   72  periodic scrub scheduled @ 2023-02-16T03:51:50.451771+0000            125365                0
6.1       124927                   0         0          0        0  511700992            0           0  30008         0     30008  active+clean  2023-02-14T16:50:25.612933+0000   27'537368   27:994455  [1,2,0]      1  [1,2,0]               1   27'503229  2023-02-14T16:50:25.612862+0000              0'0  2023-02-14T13:13:04.993756+0000              0                   75  periodic scrub scheduled @ 2023-02-15T23:00:51.982660+0000            124927                0
6.6       125070                   0         0          0        0  512286720            0           0  30004         0     30004  active+clean  2023-02-14T16:58:46.068086+0000   27'540554  27:1003206  [1,2,0]      1  [1,2,0]               1   27'539805  2023-02-14T16:58:46.068021+0000              0'0  2023-02-14T13:13:04.993756+0000              0                  103  periodic scrub scheduled @ 2023-02-15T17:21:04.841086+0000            125070                0
6.7       125219                   0         0          0        0  512897024            0           0  30007         0     30007  active+clean  2023-02-14T16:53:26.453498+0000   27'538497  27:1013508  [0,2,1]      0  [0,2,1]               0   27'516880  2023-02-14T16:53:26.453427+0000              0'0  2023-02-14T13:13:04.993756+0000              0                   83  periodic scrub scheduled @ 2023-02-16T00:40:36.360688+0000            125219                0
6.5       124938                   0         0          0        0  511746048            0           0  30009         0     30009  active+clean  2023-02-14T16:52:02.514919+0000   27'538279  27:1010699  [0,1,2]      0  [0,1,2]               0   27'509720  2023-02-14T16:52:02.514849+0000              0'0  2023-02-14T13:13:04.993756+0000              0                   97  periodic scrub scheduled @ 2023-02-16T02:16:28.493837+0000            124938                0
6.4       124438                   0         0          0        0  509698048            0           0  30007         0     30007  active+clean  2023-02-14T16:56:52.242870+0000   27'537137   27:994171  [1,0,2]      1  [1,0,2]               1   27'527794  2023-02-14T16:56:52.242793+0000              0'0  2023-02-14T13:13:04.993756+0000              0                  202  periodic scrub scheduled @ 2023-02-16T00:20:05.662431+0000            124438                0
6.3       124877                   0         0          0        0  511496192            0           0  30007         0     30007  active+clean  2023-02-14T16:49:10.313375+0000   27'536767  27:1007671  [0,2,1]      0  [0,2,1]               0   27'498762  2023-02-14T16:49:10.313298+0000              0'0  2023-02-14T13:13:04.993756+0000              0                   61  periodic scrub scheduled @ 2023-02-16T00:11:47.500867+0000            124877                0
6.2       125166                   0         0          0        0  512679936            0           0  30004         0     30004  active+clean  2023-02-14T16:48:07.342298+0000   27'539134   27:994854  [1,0,2]      1  [1,0,2]               1   27'497952  2023-02-14T16:48:07.342206+0000              0'0  20
dumped pgs


for I in {0..2}; do echo ${I} ; sudo ./bin/ceph daemon osd.${I} config set osd_max_scrubs 1 ; done
for I in {0..2}; do echo ${I} ; sudo ./bin/ceph daemon osd.${I} config set osd_shallow_scrub_chunk_min 50 ; done
for I in {0..2}; do echo ${I} ; sudo ./bin/ceph daemon osd.${I} config set osd_shallow_scrub_chunk_max 100 ; done
for I in {0..2}; do echo ${I} ; sudo ./bin/ceph daemon "osd.${I}" config show | egrep 'osd_shallow_scrub_chunk_min|osd_shallow_scrub_chunk_max|osd_max_scrubs' ; done

sudo ./bin/ceph osd pool scrub default.rgw.buckets.data

watch -cd "sudo ./bin/ceph pg dump pgs | grep -v '0  periodic'"
PG_STAT  OBJECTS  MISSING_ON_PRIMARY  DEGRADED  MISPLACED  UNFOUND  BYTES      OMAP_BYTES*  OMAP_KEYS*  LOG    LOG_DUPS  DISK_LOG  STATE         STATE_STAMP                      VERSION     REPORTED    UP       UP_PRIMARY  ACTING   ACTING_PRIMARY  LAST_SCRUB  SCRUB_STAMP                      LAST_DEEP_SCRUB  DEEP_SCRUB_STAMP                 SNAPTRIMQ_LEN  LAST_SCRUB_DURATION  SCRUB_SCHEDULING           OBJECTS_SCRUBBED  OBJECTS_TRIMMED
6.0       125365                   0         0          0        0  513495040            0           0  30005         0     30005  active+clean  2023-02-14T16:34:40.144857+0000   27'462785   27:863246  [0,2,1]      0  [0,2,1]               0   27'455430  2023-02-14T16:34:40.144836+0000              0'0  2023-02-14T13:13:04.993756+0000              0                   36  periodic scrub scheduled @ 2023-02-16T00:01:44.884484+0000            125365                0
6.1       124927                   0         0          0        0  511700992            0           0  30007         0     30007  active+clean  2023-02-14T16:33:19.758199+0000   27'461367   27:831154  [1,2,0]      1  [1,2,0]               1   27'445687  2023-02-14T16:33:19.758176+0000              0'0  2023-02-14T13:13:04.993756+0000              0                   23  periodic scrub scheduled @ 2023-02-15T22:59:26.108009+0000            124927                0
6.6       125070                   0         0          0        0  512286720            0           0  30009         0     30009  active+clean  2023-02-14T16:34:03.734047+0000   27'463709   27:838202  [1,2,0]      1  [1,2,0]               1   27'452546  2023-02-14T16:34:03.734019+0000              0'0  2023-02-14T13:13:04.993756+0000              0                   25  periodic scrub scheduled @ 2023-02-15T23:35:06.345727+0000            125070                0
6.7       125219                   0         0          0        0  512897024            0           0  30003         0     30003  active+clean  2023-02-14T16:36:30.051385+0000   27'462373   27:849942  [0,2,1]      0  [0,2,1]               0   27'461153  2023-02-14T16:36:30.051361+0000              0'0  2023-02-14T13:13:04.993756+0000              0                   39  periodic scrub scheduled @ 2023-02-15T22:28:14.760414+0000            125219                0
6.5       124938                   0         0          0        0  511746048            0           0  30010         0     30010  active+clean  2023-02-14T16:35:50.424144+0000   27'461890   27:846621  [0,1,2]      0  [0,1,2]               0   27'458594  2023-02-14T16:35:50.424108+0000              0'0  2023-02-14T13:13:04.993756+0000              0                   37  periodic scrub scheduled @ 2023-02-15T21:51:49.890139+0000            124938                0
6.4       124438                   0         0          0        0  509698048            0           0  30005         0     30005  active+clean  2023-02-14T16:33:38.960958+0000   27'460775   27:830188  [1,0,2]      1  [1,0,2]               1   27'446657  2023-02-14T16:33:38.960932+0000              0'0  2023-02-14T13:13:04.993756+0000              0                   19  periodic scrub scheduled @ 2023-02-16T04:02:35.122801+0000            124438                0
6.3       124877                   0         0          0        0  511496192            0           0  30010         0     30010  active+clean  2023-02-14T16:35:11.979721+0000   27'460780   27:844413  [0,2,1]      0  [0,2,1]               0   27'455406  2023-02-14T16:35:11.979695+0000              0'0  2023-02-14T13:13:04.993756+0000              0                   28  periodic scrub scheduled @ 2023-02-16T01:18:22.959876+0000            124877                0
6.2       125166                   0         0          0        0  512679936            0           0  30006         0     30006  active+clean  2023-02-14T16:32:55.096158+0000   27'462876   27:831022  [1,0,2]      1  [1,0,2]               1   27'444630  2023-02-14T16:32:55.096134+0000              0'0  20
dumped pgs


for I in {0..2}; do echo ${I} ; sudo ./bin/ceph daemon osd.${I} config set osd_max_scrubs 2 ; done
for I in {0..2}; do echo ${I} ; sudo ./bin/ceph daemon osd.${I} config set osd_shallow_scrub_chunk_min 50 ; done
for I in {0..2}; do echo ${I} ; sudo ./bin/ceph daemon osd.${I} config set osd_shallow_scrub_chunk_max 200 ; done
for I in {0..2}; do echo ${I} ; sudo ./bin/ceph daemon "osd.${I}" config show | egrep 'osd_shallow_scrub_chunk_min|osd_shallow_scrub_chunk_max|osd_max_scrubs' ; done

sudo ./bin/ceph osd pool scrub default.rgw.buckets.data

watch -cd "sudo ./bin/ceph pg dump pgs | grep -v '0  periodic'"
PG_STAT  OBJECTS  MISSING_ON_PRIMARY  DEGRADED  MISPLACED  UNFOUND  BYTES      OMAP_BYTES*  OMAP_KEYS*  LOG    LOG_DUPS  DISK_LOG  STATE         STATE_STAMP                      VERSION     REPORTED    UP       UP_PRIMARY  ACTING   ACTING_PRIMARY  LAST_SCRUB  SCRUB_STAMP                      LAST_DEEP_SCRUB  DEEP_SCRUB_STAMP                 SNAPTRIMQ_LEN  LAST_SCRUB_DURATION  SCRUB_SCHEDULING           OBJECTS_SCRUBBED  OBJECTS_TRIMMED
6.0       125365                   0         0          0        0  513495040            0           0  30003         0     30003  active+clean  2023-02-14T16:39:55.061761+0000   27'476613   27:892174  [0,2,1]      0  [0,2,1]               0   27'468311  2023-02-14T16:39:55.061732+0000              0'0  2023-02-14T13:13:04.993756+0000              0                   32  periodic scrub scheduled @ 2023-02-15T22:35:14.943024+0000            125365                0
6.1       124927                   0         0          0        0  511700992            0           0  30005         0     30005  active+clean  2023-02-14T16:40:40.438066+0000   27'475275   27:860248  [1,2,0]      1  [1,2,0]               1   27'469087  2023-02-14T16:40:40.438050+0000              0'0  2023-02-14T13:13:04.993756+0000              0                   45  periodic scrub scheduled @ 2023-02-15T19:21:57.051531+0000            124927                0
6.6       125070                   0         0          0        0  512286720            0           0  30008         0     30008  active+clean  2023-02-14T16:41:31.324353+0000   27'477728   27:867514  [1,2,0]      1  [1,2,0]               1   27'474980  2023-02-14T16:41:31.324324+0000              0'0  2023-02-14T13:13:04.993756+0000              0                   39  periodic scrub scheduled @ 2023-02-15T21:01:37.541493+0000            125070                0
6.7       125219                   0         0          0        0  512897024            0           0  30009         0     30009  active+clean  2023-02-14T16:41:24.675289+0000   27'476249   27:878966  [0,2,1]      0  [0,2,1]               0   27'472970  2023-02-14T16:41:24.675265+0000              0'0  2023-02-14T13:13:04.993756+0000              0                   22  periodic scrub scheduled @ 2023-02-16T02:27:38.453356+0000            125219                0
6.5       124938                   0         0          0        0  511746048            0           0  30010         0     30010  active+clean  2023-02-14T16:41:02.556168+0000   27'475860   27:875836  [0,1,2]      0  [0,1,2]               0   27'471055  2023-02-14T16:41:02.556140+0000              0'0  2023-02-14T13:13:04.993756+0000              0                   21  periodic scrub scheduled @ 2023-02-16T01:45:56.507568+0000            124938                0
6.4       124438                   0         0          0        0  509698048            0           0  30004         0     30004  active+clean  2023-02-14T16:40:31.790887+0000   27'474764   27:859432  [1,0,2]      1  [1,0,2]               1   27'468105  2023-02-14T16:40:31.790861+0000              0'0  2023-02-14T13:13:04.993756+0000              0                   32  periodic scrub scheduled @ 2023-02-15T17:11:17.809012+0000            124438                0
6.3       124877                   0         0          0        0  511496192            0           0  30007         0     30007  active+clean  2023-02-14T16:40:52.861420+0000   27'474517   27:873153  [0,2,1]      0  [0,2,1]               0   27'469302  2023-02-14T16:40:52.861395+0000              0'0  2023-02-14T13:13:04.993756+0000              0                   21  periodic scrub scheduled @ 2023-02-16T02:10:42.991814+0000            124877                0
6.2       125166                   0         0          0        0  512679936            0           0  30009         0     30009  active+clean  2023-02-14T16:39:58.157495+0000   27'476999   27:860540  [1,0,2]      1  [1,0,2]               1   27'468836  2023-02-14T16:39:58.157469+0000              0'0  20
dumped pgs
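
As a quick summary of the three runs above, averaging the LAST_SCRUB_DURATION columns (editor's sketch; values are read off the dumps, single runs only, the truncated 6.2 rows excluded):

```python
# LAST_SCRUB_DURATION values (seconds) from the three `pg dump`
# snippets above, in PG order (6.0, 6.1, 6.6, 6.7, 6.5, 6.4, 6.3)
old_default_5_25 = [72, 75, 103, 83, 97, 202, 61]  # min/max = 5/25
chunk_50_100     = [36, 23, 25, 39, 37, 19, 28]    # min/max = 50/100
chunk_50_200_x2  = [32, 45, 39, 22, 21, 32, 21]    # min/max = 50/200, osd_max_scrubs = 2

def mean(xs):
    return sum(xs) / len(xs)

for name, xs in [("5/25", old_default_5_25), ("50/100", chunk_50_100),
                 ("50/200, max_scrubs=2", chunk_50_200_x2)]:
    print(f"{name}: mean scrub duration {mean(xs):.1f}s")
# the 5/25 run averages ~99s; both larger-chunk runs average ~30s
```

So under this client load, the larger shallow-scrub chunks cut the mean per-PG scrub duration by roughly 3x, while raising chunk_max further (200) gave no additional benefit.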

@ronen-fr (Contributor, Author)

ronen-fr commented Feb 14, 2023

Note that Mark has also tested how increasing the shallow-scrub chunk size affects scrub duration on an unloaded cluster (without client load): scrubs were faster by about 2.5x.

There was no adverse effect on client IOPS; it seems the new scheduler did a great job of prioritizing client work.

On a loaded system, using the suggested default values for the shallow chunks (100/50), shallow scrubs took half the time they took when using the existing shared chunk size.
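
For anyone wanting to apply these values cluster-wide, a minimal ceph.conf sketch (the option names are the ones this PR introduces; 50/100 reflects the values discussed above, not necessarily the final shipped defaults):

```ini
[osd]
# larger chunks for shallow scrubs only; deep scrubs keep using
# the existing osd_scrub_chunk_min / osd_scrub_chunk_max
osd_shallow_scrub_chunk_min = 50
osd_shallow_scrub_chunk_max = 100
```

The same values can be applied to running OSDs with `ceph config set osd <option> <value>`, as in the per-daemon commands shown earlier in the thread.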

@ronen-fr ronen-fr changed the title proposed: osd/scrub: create a separate chunk size conf for shallow scrubs osd/scrub: use separate chunk size configuration for shallow scrubs Feb 14, 2023
@ronen-fr ronen-fr marked this pull request as ready for review February 14, 2023 17:57
@ronen-fr ronen-fr requested review from a team as code owners February 14, 2023 17:57
@ronen-fr ronen-fr removed the request for review from a team February 14, 2023 17:57
@ceph ceph deleted a comment from github-actions bot Feb 14, 2023
@ceph ceph deleted a comment from github-actions bot Feb 14, 2023
@ceph ceph deleted a comment from github-actions bot Feb 14, 2023
@athanatos (Contributor) left a comment

Nice!

@ronen-fr (Contributor, Author)

Merging based on my Teuthology runs. All failures were verified to be a result of the 'publish-stats' bug.

@ronen-fr ronen-fr merged commit 4cc70b7 into ceph:main Feb 15, 2023