Bug #68960
openec-pool-snaps-few-objects-overwrites: failed to complete snap trimming before timeout
Description
/a/yuriw-2024-10-29_22:45:20-rados-quincy-release-distro-default-smithi/7972687
2024-10-30T09:06:20.118 INFO:tasks.ceph.ceph_manager.ceph:pg 3.1d3 in trimming, state: active+clean+snaptrim_wait
2024-10-30T09:06:20.118 INFO:tasks.ceph.ceph_manager.ceph:pg 3.1d4 in trimming, state: active+clean+snaptrim_wait
2024-10-30T09:06:20.118 INFO:tasks.ceph.ceph_manager.ceph:pg 3.1d5 in trimming, state: active+clean+snaptrim_wait
2024-10-30T09:06:20.118 INFO:tasks.ceph.ceph_manager.ceph:pg 3.1d6 in trimming, state: active+clean+snaptrim_wait
2024-10-30T09:06:20.119 INFO:tasks.ceph.ceph_manager.ceph:pg 3.1d8 in trimming, state: active+clean+snaptrim_wait
2024-10-30T09:06:20.119 INFO:tasks.ceph.ceph_manager.ceph:pg 3.1da in trimming, state: active+clean+snaptrim_wait
2024-10-30T09:06:20.119 INFO:tasks.ceph.ceph_manager.ceph:pg 3.1dd in trimming, state: active+clean+snaptrim_wait
2024-10-30T09:06:20.119 INFO:tasks.ceph.ceph_manager.ceph:pg 3.1de in trimming, state: active+clean+snaptrim_wait
2024-10-30T09:06:20.119 INFO:tasks.ceph.ceph_manager.ceph:pg 3.1e5 in trimming, state: active+clean+snaptrim_wait
2024-10-30T09:06:20.119 INFO:tasks.ceph.ceph_manager.ceph:pg 3.1e6 in trimming, state: active+clean+snaptrim_wait
2024-10-30T09:06:20.119 INFO:tasks.ceph.ceph_manager.ceph:pg 3.1e8 in trimming, state: active+clean+snaptrim_wait
2024-10-30T09:06:20.119 INFO:tasks.ceph.ceph_manager.ceph:pg 3.1e9 in trimming, state: active+clean+snaptrim_wait
2024-10-30T09:06:20.119 INFO:tasks.ceph.ceph_manager.ceph:pg 3.1ea in trimming, state: active+clean+snaptrim_wait
2024-10-30T09:06:20.119 INFO:tasks.ceph.ceph_manager.ceph:pg 3.1eb in trimming, state: active+clean+snaptrim_wait
2024-10-30T09:06:20.119 INFO:tasks.ceph.ceph_manager.ceph:pg 3.1ed in trimming, state: active+clean+snaptrim_wait
2024-10-30T09:06:20.119 INFO:tasks.ceph.ceph_manager.ceph:pg 3.1ee in trimming, state: active+clean+snaptrim_wait
2024-10-30T09:06:20.119 INFO:tasks.ceph.ceph_manager.ceph:pg 3.1f0 in trimming, state: active+clean+snaptrim_wait
2024-10-30T09:06:20.119 INFO:tasks.ceph.ceph_manager.ceph:pg 3.1f1 in trimming, state: active+clean+snaptrim_wait
2024-10-30T09:06:20.119 INFO:tasks.ceph.ceph_manager.ceph:pg 3.1f2 in trimming, state: active+clean+snaptrim_wait
2024-10-30T09:06:20.119 INFO:tasks.ceph.ceph_manager.ceph:pg 3.1f4 in trimming, state: active+clean+snaptrim_wait
2024-10-30T09:06:20.120 INFO:tasks.ceph.ceph_manager.ceph:pg 3.1f5 in trimming, state: active+clean+snaptrim_wait
2024-10-30T09:06:20.120 INFO:tasks.ceph.ceph_manager.ceph:pg 3.1f6 in trimming, state: active+clean+snaptrim_wait
2024-10-30T09:06:20.120 INFO:tasks.ceph.ceph_manager.ceph:pg 3.1f8 in trimming, state: active+clean+snaptrim_wait
2024-10-30T09:06:20.120 INFO:tasks.ceph.ceph_manager.ceph:pg 3.1fb in trimming, state: active+clean+snaptrim_wait
2024-10-30T09:06:20.120 INFO:tasks.ceph.ceph_manager.ceph:pg 3.1ff in trimming, state: active+clean+snaptrim_wait
2024-10-30T09:06:20.120 INFO:tasks.ceph.ceph_manager.ceph:pg 3.200 in trimming, state: active+clean+snaptrim_wait
2024-10-30T09:06:20.120 INFO:tasks.ceph.ceph_manager.ceph:pg 3.202 in trimming, state: active+clean+snaptrim_wait
2024-10-30T09:06:20.120 INFO:tasks.ceph.ceph_manager.ceph:pg 3.205 in trimming, state: active+clean+snaptrim_wait
2024-10-30T09:06:20.120 INFO:tasks.ceph.ceph_manager.ceph:pg 3.206 in trimming, state: active+clean+snaptrim_wait
2024-10-30T09:06:20.120 INFO:tasks.ceph.ceph_manager.ceph:pg 3.209 in trimming, state: active+clean+snaptrim_wait
2024-10-30T09:06:20.120 INFO:tasks.ceph.ceph_manager.ceph:pg 3.20b in trimming, state: active+clean+snaptrim_wait
2024-10-30T09:06:20.120 INFO:tasks.ceph.ceph_manager.ceph:pg 3.20d in trimming, state: active+clean+snaptrim_wait
2024-10-30T09:06:20.120 INFO:tasks.ceph.ceph_manager.ceph:pg 3.20e in trimming, state: active+clean+snaptrim_wait
2024-10-30T09:06:20.120 INFO:tasks.ceph.ceph_manager.ceph:pg 3.210 in trimming, state: active+clean+snaptrim_wait
2024-10-30T09:06:20.120 INFO:tasks.ceph.ceph_manager.ceph:pg 3.213 in trimming, state: active+clean+snaptrim_wait
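The failure is the teuthology snap-trim wait loop hitting its deadline while many PGs still sit in active+clean+snaptrim_wait. A minimal sketch of that kind of poll-until-clean check follows; the names and signatures here are illustrative, not the actual tasks.ceph.ceph_manager API:

```python
import time

def wait_for_snaptrim_done(get_pg_states, timeout=300, poll_interval=5,
                           now=time.monotonic, sleep=time.sleep):
    """Poll PG states until no PG is in a snaptrim state, or raise on timeout.

    `get_pg_states` returns a dict mapping pgid -> state string, e.g.
    {"3.1d3": "active+clean+snaptrim_wait"}. This is a simplified, hypothetical
    sketch of the check the teuthology task performs, with the clock and sleep
    injectable for testing.
    """
    deadline = now() + timeout
    while True:
        # Any state containing "snaptrim" (snaptrim or snaptrim_wait) counts
        # as still trimming.
        trimming = {pgid: state for pgid, state in get_pg_states().items()
                    if "snaptrim" in state}
        if not trimming:
            return
        if now() > deadline:
            raise TimeoutError(
                "failed to complete snap trimming before timeout: %r" % trimming)
        sleep(poll_interval)
```

On a live cluster, the per-PG states this loop inspects correspond to what `ceph pg dump pgs_brief` reports; a run that trips the deadline produces exactly the repeated "pg ... in trimming" lines seen in the log above.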
Updated by Radoslaw Zarzynski over 1 year ago
I can't find any tracker this one would duplicate.
Either it's new (regression on quincy?) or – more probably, I hope – it's an environment issue (starvation?).
Let's observe.
Updated by Radoslaw Zarzynski over 1 year ago
scrub note: no recurrences so far, observing.
Updated by Radoslaw Zarzynski about 1 year ago
scrub note: still no recurrence.
Updated by Laura Flores about 2 months ago
description: rados/thrash/{0-size-min-size-overrides/3-size-2-min-size 1-pg-log-overrides/short_pg_log
2-recovery-overrides/{more-async-partial-recovery} 3-scrub-overrides/{max-simultaneous-scrubs-3}
backoff/normal ceph clusters/{fixed-4} crc-failures/default d-balancer/crush-compat
mon_election/connectivity msgr-failures/osd-dispatch-delay msgr/async-v1only objectstore/{bluestore/{alloc$/{stupid}
base mem$/{low} onode-segment$/{512K} write$/{v1/{compr$/{yes$/{zlib}} v1}}}} rados
supported-random-distro$/{ubuntu_latest} thrashers/morepggrow thrashosds-health
workloads/pool-snaps-few-objects}
/a/lflores-2026-01-26_23:21:06-rados-wip-yuri12-testing-2026-01-22-2045-distro-default-trial/19086
Updated by Laura Flores about 2 months ago
Severity seems low, let's monitor for now.
Updated by Connor Fawcett about 2 months ago
/a/skanta-2026-01-27_07:02:07-rados-wip-bharath3-testing-2026-01-26-1323-distro-default-trial/19858
Updated by Laura Flores 28 days ago
Bug scrub note: let's observe for a bit longer
Updated by Sridhar Seshasayee 10 days ago
/a/skanta-2026-03-04_23:53:38-rados-wip-bharath1-testing-2026-03-04-1011-distro-default-trial/85628
Test Description:
rados/thrash/0-size-min-size-overrides-min-size 1-pg-log-overrides/short_pg_log 2-recovery-overrides/{more-async-partial-recovery} 3-scrub-overrides/{max-simultaneous-scrubs-1} backoff/normal ceph clusters/{fixed-4} crc-failures/default d-balancer/crush-compat mon_election/connectivity msgr-failures/osd-dispatch-delay msgr/async-v1only objectstore/{bluestore/{alloc$/{btree} base mem$/{normal-1} onode-segment$/{256K} write$/{v1/{compr$/{yes$/{zstd}} v1}}}} rados supported-random-distro$/{centos_latest} thrashers/morepggrow thrashosds-health workloads/pool-snaps-few-objects}
Updated by Laura Flores 7 days ago
- Assignee set to Matan Breizman
@Matan Breizman can you take a look at this? We think it could be a timing or new-machine issue with the trial nodes, since the recent occurrences began roughly when the trial nodes were introduced.
The common workload among the failures is `workloads/pool-snaps-few-objects`.