pacific: tools: ceph-objectstore-tool is able to trim solely pg log dups' entries #46631
This reverts commit d49ff13, which is the in-OSD part of the fix for the accumulation of `dup` entries in a PG log. Discussing it raised questions about the OSD's behaviour during an upgrade when the log holds a very large number of dups. Before bringing it back, we must double-check that the deletions are chunked properly so that they do not cause OOMs or stalls in, for example, RocksDB.

The backport ticket is: https://tracker.ceph.com/issues/55989

Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>

This reverts commit 1f3fede. Although the chunking in off-line `dups` trimming (via COT) seems fine, `ceph-objectstore-tool` is a client of `trim()` of `PGLog::IndexedLog`, which means that a partial revert is not possible without extensive changes. Moreover, trimming the pg log is not enough without modifying pg_info_t accordingly, which the reverted patch lacks.

The backport ticket is: https://tracker.ceph.com/issues/55989

Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
The main assumption is that trimming just dups doesn't need any update
to the corresponding pg_info_t.
Testing:
1. cluster without the autoscaler
```
rzarz@ubulap:~/dev/ceph/build$ MON=1 MGR=1 OSD=3 MGR=1 MDS=0 ../src/vstart.sh -l -b -n -o "osd_pg_log_dups_tracked=3000000" -o "osd_pool_default_pg_autoscale_mode=off"
```
2. 8 PGs in the testing pool.
```
rzarz@ubulap:~/dev/ceph/build$ bin/ceph osd pool create test-pool 8 8
```
3. Provisioning dups with rados bench
```
bin/rados bench -p test-pool 300 write -b 4096 --no-cleanup
...
Total time run: 300.034
Total writes made: 103413
Write size: 4096
Object size: 4096
Bandwidth (MB/sec): 1.34637
Stddev Bandwidth: 0.589071
Max bandwidth (MB/sec): 2.4375
Min bandwidth (MB/sec): 0.902344
Average IOPS: 344
Stddev IOPS: 150.802
Max IOPS: 624
Min IOPS: 231
Average Latency(s): 0.0464151
Stddev Latency(s): 0.0183627
Max latency(s): 0.0928424
Min latency(s): 0.0131932
```
4. Killing osd.0
```
rzarz@ubulap:~/dev/ceph/build$ kill 2572129 # pid of osd.0
```
5. Listing PGs on osd.0 and calculating number of pg log's entries and
dups:
```
rzarz@ubulap:~/dev/ceph/build$ bin/ceph-objectstore-tool --data-path dev/osd0 --op list-pgs --pgid 2.c > osd0_pgs.txt
rzarz@ubulap:~/dev/ceph/build$ for pgid in `cat osd0_pgs.txt`; do echo $pgid; bin/ceph-objectstore-tool --data-path dev/osd0 --op log --pgid $pgid | jq '(.pg_log_t.log|length),(.pg_log_t.dups|length)'; done
2.7
10020
3100
2.6
10100
3000
2.3
10012
2800
2.1
10049
2900
2.2
10057
2700
2.0
10027
2900
2.5
10077
2700
2.4
10072
2900
1.0
97
0
```
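The loop above prints three lines per PG: the PG id, the number of `log` entries, and the number of `dups`. A small sketch (Python, not part of this PR) that groups that output into records and totals it for pool 2; the helper name `parse_counts` is hypothetical:

```python
# Output of step 5, copied verbatim: pgid, log length, dups length per PG.
STEP5_OUTPUT = """\
2.7
10020
3100
2.6
10100
3000
2.3
10012
2800
2.1
10049
2900
2.2
10057
2700
2.0
10027
2900
2.5
10077
2700
2.4
10072
2900
1.0
97
0
"""

def parse_counts(text):
    """Group the pgid / log-length / dups-length triples into a dict."""
    lines = text.split()
    return {
        lines[i]: (int(lines[i + 1]), int(lines[i + 2]))
        for i in range(0, len(lines), 3)
    }

counts = parse_counts(STEP5_OUTPUT)
# Restrict to the test pool (pool id 2); 1.0 belongs to another pool.
pool2 = {pg: c for pg, c in counts.items() if pg.startswith("2.")}
total_log = sum(log for log, _ in pool2.values())
total_dups = sum(dups for _, dups in pool2.values())
print(total_log, total_dups, total_log + total_dups)  # 80414 23000 103414
```

The pool 2 total of 103414 entries is within one of the 103413 writes reported by `rados bench` in step 3.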
6. Trimming dups
```
rzarz@ubulap:~/dev/ceph/build$ CEPH_ARGS="--osd_pg_log_dups_tracked 2500 --osd_pg_log_trim_max=100" bin/ceph-objectstore-tool --data-path dev/osd0 --op trim-pg-log-dups --pgid 2.7
max_dup_entries=2500 max_chunk_size=100
Removing keys dup_0000000020.00000000000000000001 - dup_0000000020.00000000000000000100
Removing keys dup_0000000020.00000000000000000101 - dup_0000000020.00000000000000000200
Removing keys dup_0000000020.00000000000000000201 - dup_0000000020.00000000000000000300
Removing keys dup_0000000020.00000000000000000301 - dup_0000000020.00000000000000000400
Removing keys dup_0000000020.00000000000000000401 - dup_0000000020.00000000000000000500
Removing keys dup_0000000020.00000000000000000501 - dup_0000000020.00000000000000000600
Finished trimming, now compacting...
Finished trimming pg log dups
```
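The six `Removing keys` lines follow from the arithmetic: PG 2.7 held 3100 dups and the limit is 2500, so the 600 oldest entries go, 100 per chunk. A minimal sketch of that chunking, in Python rather than the tool's C++. It assumes the dups are versions 1..3100 in epoch 20 and a key layout of `dup_<10-digit epoch>.<20-digit version>`, both inferred from the output above; `dup_key` and `chunked_trim` are hypothetical names, not the tool's API:

```python
def dup_key(epoch, version):
    # Key layout inferred from the tool's output above (an assumption).
    return f"dup_{epoch:010d}.{version:020d}"

def chunked_trim(keys, max_dup_entries, max_chunk_size):
    """Yield (first, last) key pairs for each chunk of oldest dups to remove.

    `keys` must be sorted oldest-first. At most `max_chunk_size` keys are
    covered per chunk, so no single deletion batch grows unbounded.
    """
    excess = max(0, len(keys) - max_dup_entries)
    for start in range(0, excess, max_chunk_size):
        chunk = keys[start:min(start + max_chunk_size, excess)]
        yield chunk[0], chunk[-1]

# PG 2.7 before trimming: 3100 dups, assumed to be versions 1..3100.
keys = [dup_key(20, v) for v in range(1, 3101)]
chunks = list(chunked_trim(keys, max_dup_entries=2500, max_chunk_size=100))
for first, last in chunks:
    print(f"Removing keys {first} - {last}")
```

Under those assumptions the sketch prints the same six ranges as the tool, ending at version 600.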
7. Checking number of pg log's entries and dups
```
rzarz@ubulap:~/dev/ceph/build$ for pgid in `cat osd0_pgs.txt`; do echo $pgid; bin/ceph-objectstore-tool --data-path dev/osd0 --op log --pgid $pgid | jq '(.pg_log_t.log|length),(.pg_log_t.dups|length)'; done
2.7
10020
2500
2.6
10100
3000
2.3
10012
2800
2.1
10049
2900
2.2
10057
2700
2.0
10027
2900
2.5
10077
2700
2.4
10072
2900
1.0
97
0
```
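Comparing steps 5 and 7 shows the intended effect: only PG 2.7 changed, its `dups` dropped to the 2500 limit, and every `log` count is untouched. A sketch of that comparison with the counts copied from the two listings (Python; the variable names are illustrative):

```python
# (log entries, dups) per PG, copied from the listing in step 5.
before = {
    "2.7": (10020, 3100), "2.6": (10100, 3000), "2.3": (10012, 2800),
    "2.1": (10049, 2900), "2.2": (10057, 2700), "2.0": (10027, 2900),
    "2.5": (10077, 2700), "2.4": (10072, 2900), "1.0": (97, 0),
}
# (log entries, dups) per PG, copied from the listing in step 7.
after = {
    "2.7": (10020, 2500), "2.6": (10100, 3000), "2.3": (10012, 2800),
    "2.1": (10049, 2900), "2.2": (10057, 2700), "2.0": (10027, 2900),
    "2.5": (10077, 2700), "2.4": (10072, 2900), "1.0": (97, 0),
}

# PGs whose counts differ between the two listings.
changed = {pg: (before[pg], after[pg]) for pg in before if before[pg] != after[pg]}
# The log entry counts must be identical everywhere; only dups may shrink.
log_untouched = all(before[pg][0] == after[pg][0] for pg in before)

print(changed)        # {'2.7': ((10020, 3100), (10020, 2500))}
print(log_untouched)  # True
```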
Conflicts:
src/tools/ceph_objectstore_tool.cc -- undetected conflict
with d5445b8. Fixed by adapting the patch to not require
`unique_ptr<T>::get()`.
Fixes: https://tracker.ceph.com/issues/53729
Signed-off-by: Radosław Zarzyński <rzarzyns@redhat.com>
(cherry picked from commit a2190f9)
83de10c to 4393b74
Failures unrelated, tracked in: https://tracker.ceph.com/issues/56036 - timeout 3h /home/ubuntu/cephtest/clone.client.0/qa/workunits/cephadm/test_dashboard_e2e.sh
jenkins test api
@amathuria I filed a new tracker for the
… dups
This commit aggregates changes for multiple PR:
* Offline: ceph#46630
* Online: ceph#47046
* Offline fix: ceph#46706
* Online fix: ceph#47688
* Offline fix: ceph#46631
* Online fix: ceph#47701
Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
… dups
This commit aggregates changes for multiple PR:

main
----
* Offline: ceph#46630
* Online: ceph#47046

quincy
------
* Offline fix: ceph#46706
* Online fix: ceph#47688

pacific
-------
* Offline fix: ceph#46631
* Online fix: ceph#47701

Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
Backport of #46630.