
quincy: tools: ceph-objectstore-tool is able to trim pg log dups' entries.#46706

Merged
yuriw merged 1 commit into ceph:quincy from rzarzynski:wip-pglog-trim-dups-quincy
Jun 17, 2022

@rzarzynski
Contributor

@rzarzynski rzarzynski commented Jun 15, 2022

This is quincy's backport of #46630. The reverts are already merged.

The main assumption is that trimming only the dups does not require any
update to the corresponding pg_info_t.

Testing:

1. cluster without the autoscaler
```
rzarz@ubulap:~/dev/ceph/build$ MON=1 MGR=1 OSD=3 MGR=1 MDS=0 ../src/vstart.sh -l -b -n -o "osd_pg_log_dups_tracked=3000000" -o "osd_pool_default_pg_autoscale_mode=off"
```

2. 8 PGs in the testing pool.
```
rzarz@ubulap:~/dev/ceph/build$ bin/ceph osd pool create test-pool 8 8
```

3. Provisioning dups with rados bench
```
bin/rados bench -p test-pool 300 write -b 4096  --no-cleanup
...
Total time run:         300.034
Total writes made:      103413
Write size:             4096
Object size:            4096
Bandwidth (MB/sec):     1.34637
Stddev Bandwidth:       0.589071
Max bandwidth (MB/sec): 2.4375
Min bandwidth (MB/sec): 0.902344
Average IOPS:           344
Stddev IOPS:            150.802
Max IOPS:               624
Min IOPS:               231
Average Latency(s):     0.0464151
Stddev Latency(s):      0.0183627
Max latency(s):         0.0928424
Min latency(s):         0.0131932
```

4. Killing osd.0
```
rzarz@ubulap:~/dev/ceph/build$ kill 2572129 # pid of osd.0
```

5. Listing the PGs on osd.0 and counting the pg log entries and dups of each:

```
rzarz@ubulap:~/dev/ceph/build$ bin/ceph-objectstore-tool --data-path dev/osd0 --op list-pgs --pgid 2.c > osd0_pgs.txt
rzarz@ubulap:~/dev/ceph/build$ for pgid in `cat osd0_pgs.txt`; do echo $pgid; bin/ceph-objectstore-tool --data-path dev/osd0 --op log --pgid $pgid | jq '(.pg_log_t.log|length),(.pg_log_t.dups|length)'; done
2.7
10020
3100
2.6
10100
3000
2.3
10012
2800
2.1
10049
2900
2.2
10057
2700
2.0
10027
2900
2.5
10077
2700
2.4
10072
2900
1.0
97
0
```
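
The three-line groups printed by the loop above (PG id, log length, dups length) are easy to post-process. Below is a minimal sketch, assuming exactly that output layout; `pgs_over_limit` is a hypothetical helper written for this example, not part of Ceph.

```shell
# Hypothetical helper (not part of Ceph): reads "pgid / log length / dups length"
# groups from stdin and prints the PGs whose dups count exceeds the given limit.
pgs_over_limit() {
  local limit=$1 pgid log dups
  while read -r pgid && read -r log && read -r dups; do
    if [ "$dups" -gt "$limit" ]; then
      echo "$pgid $dups"
    fi
  done
}

# Example: feed it the output of the loop from step 5, saved to a file
# (the file name is an assumption):
#   pgs_over_limit 2500 < osd0_counts.txt
```

With a limit of 2500 and the listing above, every PG of pool 2 would be reported, while 1.0 (0 dups) would not.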

6. Trimming dups
```
rzarz@ubulap:~/dev/ceph/build$ CEPH_ARGS="--osd_pg_log_dups_tracked 2500 --osd_pg_log_trim_max=100" bin/ceph-objectstore-tool --data-path dev/osd0 --op trim-pg-log-dups --pgid 2.7
max_dup_entries=2500 max_chunk_size=100
Removing keys dup_0000000020.00000000000000000001 - dup_0000000020.00000000000000000100
Removing keys dup_0000000020.00000000000000000101 - dup_0000000020.00000000000000000200
Removing keys dup_0000000020.00000000000000000201 - dup_0000000020.00000000000000000300
Removing keys dup_0000000020.00000000000000000301 - dup_0000000020.00000000000000000400
Removing keys dup_0000000020.00000000000000000401 - dup_0000000020.00000000000000000500
Removing keys dup_0000000020.00000000000000000501 - dup_0000000020.00000000000000000600
Finished trimming, now compacting...
Finished trimming pg log dups
```
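
The chunking visible in this output follows from simple arithmetic: PG 2.7 held 3100 dups, the limit is 2500, so 600 entries are removed in 6 chunks of 100. The sketch below only reproduces that arithmetic and is not the tool's implementation; it assumes the dup versions are contiguous, start at 1, and share a single epoch (20 here), which happens to hold for this test but need not hold in general.

```shell
# Illustrative only: mimics the chunk arithmetic seen in the output above.
# Assumption: dup versions are contiguous, start at 1, and share one epoch.
trim_plan() {
  local total=$1 max_dups=$2 chunk=$3 epoch=$4
  local to_remove=0 start=1 end
  if [ "$total" -gt "$max_dups" ]; then
    to_remove=$(( total - max_dups ))
  fi
  while [ "$start" -le "$to_remove" ]; do
    end=$(( start + chunk - 1 ))
    if [ "$end" -gt "$to_remove" ]; then
      end=$to_remove
    fi
    # Key layout as printed by the tool: 10-digit epoch, 20-digit version.
    printf 'Removing keys dup_%010d.%020d - dup_%010d.%020d\n' \
      "$epoch" "$start" "$epoch" "$end"
    start=$(( end + 1 ))
  done
}

# 3100 dups, limit 2500, chunks of 100, epoch 20
trim_plan 3100 2500 100 20
```

For these inputs the function prints the same six "Removing keys" ranges as the run above.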

7. Checking number of pg log's entries and dups
```
rzarz@ubulap:~/dev/ceph/build$ for pgid in `cat osd0_pgs.txt`; do echo $pgid; bin/ceph-objectstore-tool --data-path dev/osd0 --op log --pgid $pgid | jq '(.pg_log_t.log|length),(.pg_log_t.dups|length)'; done
2.7
10020
2500
2.6
10100
3000
2.3
10012
2800
2.1
10049
2900
2.2
10057
2700
2.0
10027
2900
2.5
10077
2700
2.4
10072
2900
1.0
97
0
```
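
Only PG 2.7 was trimmed in this test; the remaining PGs keep their original dups counts. To trim every PG of the stopped OSD, the command from step 6 could be wrapped in a loop. This is a sketch under the same settings as above; `trim_all_pgs`, `TOOL` and `DATA_PATH` are names chosen for this example, and the OSD is assumed to be stopped as in step 4.

```shell
# Sketch: run the step 6 command for every PG listed in a file.
# TOOL and DATA_PATH are placeholders for this example.
TOOL=${TOOL:-bin/ceph-objectstore-tool}
DATA_PATH=${DATA_PATH:-dev/osd0}

trim_all_pgs() {
  local pg_list=$1 pgid
  while read -r pgid; do
    # Skip empty lines in the PG list.
    if [ -z "$pgid" ]; then
      continue
    fi
    # </dev/null keeps the tool from consuming the PG list on stdin.
    CEPH_ARGS="--osd_pg_log_dups_tracked 2500 --osd_pg_log_trim_max=100" \
      "$TOOL" --data-path "$DATA_PATH" --op trim-pg-log-dups --pgid "$pgid" </dev/null
  done < "$pg_list"
}

# Usage, with the file produced in step 5:
#   trim_all_pgs osd0_pgs.txt
```

Re-running the counting loop from step 7 afterwards would show whether each PG is at or below the configured limit.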

Fixes: https://tracker.ceph.com/issues/53729
Signed-off-by: Radosław Zarzyński <rzarzyns@redhat.com>
(cherry picked from commit a2190f9)
@ljflores
Member

jenkins test api

@ljflores
Member

Rados suite results:

https://pulpito.ceph.com/yuriw-2022-06-16_16:41:04-rados-wip-yuri6-testing-2022-06-16-0651-quincy-distro-default-smithi
https://pulpito.ceph.com/yuriw-2022-06-17_13:54:27-rados-wip-yuri6-testing-2022-06-16-0651-quincy-distro-default-smithi/

Failures, unrelated:
1. https://tracker.ceph.com/issues/55808
2. https://tracker.ceph.com/issues/52321
3. https://tracker.ceph.com/issues/53575
4. https://tracker.ceph.com/issues/55741
5. https://tracker.ceph.com/issues/53294
6. https://tracker.ceph.com/issues/55854
7. https://tracker.ceph.com/issues/55986

Details:
1. task/test_nfs: KeyError: 'events' - Ceph - Orchestrator
2. qa/tasks/rook times out: 'check osd count' reached maximum tries (90) after waiting for 900 seconds - Ceph - Orchestrator
3. Valgrind reports memory "Leak_PossiblyLost" errors concerning lib64 - Ceph - RADOS
4. cephadm/test_dashboard_e2e.sh: Unable to find element cd-modal .custom-control-label when testing on orchestrator/01-hosts.e2e-spec.ts - Ceph - Mgr - Dashboard
5. rados/test.sh hangs while running LibRadosTwoPoolsPP.TierFlushDuringFlush - Ceph - RADOS
6. Datetime AssertionError in test_health_history (tasks.mgr.test_insights.TestInsights) - Ceph - Mgr
7. cephadm: Test failure: test_cluster_set_reset_user_config (tasks.cephfs.test_nfs.TestNFS) - Ceph - Orchestrator

@yuriw yuriw merged commit 5da2ce6 into ceph:quincy Jun 17, 2022
rzarzynski added a commit to rzarzynski/ceph that referenced this pull request Aug 23, 2022
… dups

This commit aggregates changes for multiple PRs:

* Offline: ceph#46630
* Online: ceph#47046

* Offline fix: ceph#46706
* Online fix: ceph#47688

* Offline fix: ceph#46631
* Online fix: ceph#47701

Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
rzarzynski added a commit to rzarzynski/ceph that referenced this pull request Aug 23, 2022
… dups

This commit aggregates changes for multiple PRs:

main
----
* Offline: ceph#46630
* Online: ceph#47046

quincy
------
* Offline fix: ceph#46706
* Online fix: ceph#47688

pacific
-------
* Offline fix: ceph#46631
* Online fix: ceph#47701

Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>