Bug #70939
closed
crimson: ceph_assert(interrupt_cond<InterruptCond>.interrupt_cond) in ReplicatedRecoveryBackend::recover_object
% Done:
0%
Source:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Tags (freeform):
Merge Commit:
Fixed In:
v20.0.0-1559-g11ce348d9f
Released In:
v20.2.0~533
Upkeep Timestamp:
2025-11-01T01:13:30+00:00
Description
ERROR 2025-04-15 21:38:20,513 [shard 1:main] none - /home/sam/git-checkouts/ceph-workspace/main/src/crimson/common/interruptible_future.h:485 : In function 'auto crimson::interruptible::interruptible_future_detail<InterruptCond, seastar::future<T> >::then_interruptible(Func&&) [with Func = ReplicatedRecoveryBackend::recover_object(const hobject_t&, eversion_t)::<lambda()>; InterruptCond = crimson::osd::IOInterruptCondition; T = void]', ceph_assert(%s) interrupt_cond<InterruptCond>.interrupt_cond
Seems to occur at the first call to then_interruptible in
  LOG_PREFIX(ReplicatedRecoveryBackend::recover_object);
  DEBUGDPP("{}, {}", pg, soid, need);
  // always add_recovering(soid) before recover_object(soid)
  assert(is_recovering(soid));
  // start tracking the recovery of soid
  return maybe_pull_missing_obj(
    soid, need
  ).then_interruptible([FNAME, this, soid, need] {
  ...
called from
PGRecovery::interruptible_future<>
PGRecovery::recover_object_with_throttle(
  const hobject_t &soid,
  eversion_t need)
{
  crimson::osd::scheduler::params_t params =
    {1, 0, crimson::osd::scheduler::scheduler_class_t::background_best_effort};
  auto &ss = pg->get_shard_services();
  logger().debug("{} {}", soid, need);
  return ss.with_throttle(
    std::move(params),
    [this, soid, need] {
      logger().debug("got throttle: {} {}", soid, need);
      auto backend = pg->get_recovery_backend();
      assert(backend);
      return backend->recover_object(soid, need);
    });
}
introduced in 791772f1c032b4ca754d6a67322df6967edfc40e using
  template <typename F>
  auto with_throttle(
    crimson::osd::scheduler::params_t params,
    F &&f) {
    if (!max_in_progress) return f();
    return acquire_throttle(params)
      .then(std::forward<F>(f))
      .finally([this] {
        release_throttle();
      });
  }
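On the acquire_throttle() path, f() is invoked from a plain .then() continuation, so it is called without interrupt_cond<InterruptCond>.interrupt_cond populated. The !max_in_progress path calls f() directly in the caller's context and is not affected.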
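The failure mode can be illustrated with a toy model in plain C++. This is only a sketch under loud assumptions: it uses neither Seastar nor crimson, and every name in it (InterruptCond, current_cond, task_queue, toy_with_throttle, toy_with_throttle_restoring, cond_seen) is hypothetical. It models the interrupt condition as a slot that is set only while an interruptible context is running, and models .then() as pushing a callback onto a queue that is drained later. The restoring variant is just one conceivable way to carry the condition across the deferral; it is not a description of the merged fix.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <utility>

// Hypothetical stand-in for the "current interrupt condition" slot.
struct InterruptCond { int id; };
thread_local const InterruptCond* current_cond = nullptr;

// Hypothetical stand-in for the reactor's queue of deferred continuations.
std::deque<std::function<void()>> task_queue;

// Drain the queue; by now the caller's interruptible context has ended.
void run_pending() {
  while (!task_queue.empty()) {
    auto task = std::move(task_queue.front());
    task_queue.pop_front();
    task();
  }
}

// The condition is installed only while f runs synchronously.
template <typename F>
void with_interruption(const InterruptCond& cond, F&& f) {
  const InterruptCond* saved = current_cond;
  current_cond = &cond;
  std::forward<F>(f)();
  current_cond = saved;
}

// Mirrors the two paths in the report: a synchronous fast path and a
// deferred path that behaves like a plain .then(f).
template <typename F>
void toy_with_throttle(unsigned max_in_progress, F f) {
  if (!max_in_progress) { f(); return; }
  task_queue.emplace_back(std::move(f));
}

// Variant that captures the condition at call time and reinstalls it
// around the deferred callback.
template <typename F>
void toy_with_throttle_restoring(unsigned max_in_progress, F f) {
  if (!max_in_progress) { f(); return; }
  const InterruptCond* captured = current_cond;
  task_queue.emplace_back([captured, f = std::move(f)]() mutable {
    const InterruptCond* saved = current_cond;
    current_cond = captured;
    f();
    current_cond = saved;
  });
}

// Reports whether the callback observed a populated condition.
bool cond_seen(unsigned max_in_progress, bool restoring) {
  InterruptCond cond{1};
  bool seen = false;
  with_interruption(cond, [&] {
    auto body = [&seen] { seen = (current_cond != nullptr); };
    if (restoring) {
      toy_with_throttle_restoring(max_in_progress, body);
    } else {
      toy_with_throttle(max_in_progress, body);
    }
  });
  run_pending();
  return seen;
}
```

In this model, cond_seen(0, false) is true because the fast path runs inside the caller's context, cond_seen(5, false) is false because the callback runs after the slot has been cleared, and cond_seen(5, true) is true because the wrapper reinstalls the captured condition.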
Reproduces with
function start_cluster {
    pkill -9 crimson-osd
    ../src/stop.sh
    MDS=0 MGR=1 OSD=3 MON=1 ../src/vstart.sh --without-dashboard -X --redirect-output --debug -n --no-restart $@
    ./bin/ceph osd pool create rbd 8 8 replicated replicated_rule 2 2 2
    ./bin/ceph osd pool create single 1 1 replicated replicated_rule 2 2 2
}
function start_cluster_test_backfill {
    start_cluster $@
    ./bin/ceph config set osd crimson_osd_scheduler_concurrency 5
    ./bin/ceph config set osd osd_min_pg_log_entries 1
    ./bin/ceph config set osd osd_max_pg_log_entries 2
    ./bin/ceph config set osd osd_pg_log_trim_min 0
    ./bin/ceph_test_rados --max-ops 10000000000 --objects 1000 --max-in-flight 32 --size 40000 --min-stride-size 4000 --max-stride-size 8000 --max-seconds 120 --op read 0 --op write 50 --op delete 50 --op snap_create 50 --pool rbd
    sleep 5
    ./bin/ceph osd out 0
}
Updated by Samuel Just 11 months ago
I actually don't think this is a bug in interruptible_future, or at least not something we've tried to disallow statically in the past. with_throttle simply can't be agnostic as to whether f() assumes that the calling context has a live interrupt_cond.
Updated by Samuel Just 11 months ago
Simpler example:
template <typename F>
auto f(F &&f) {
  return seastar::sleep(
    std::chrono::milliseconds(10)
  ).then([] {
    return seastar::sleep(std::chrono::milliseconds(10));
  }).then(
    std::forward<F>(f)
  ).finally([] {});
}
using interruptor =
  interruptible::interruptor<TestInterruptCondition>;
interruptor::future<> g() {
  return f([] {
    return interruptor::make_interruptible(
      seastar::sleep(std::chrono::milliseconds(10))
    ).then_interruptible([] {
      return interruptor::make_interruptible(
        seastar::sleep(std::chrono::milliseconds(10)));
    });
  });
}
TEST_F(seastar_test_suite_t, implicit_interruptible_conversion)
{
  run_async([] {
    interruptor::with_interruption(
      [] {
        return interruptor::make_interruptible(
          seastar::sleep(std::chrono::milliseconds(10))
        ).then_interruptible([] {
          return g().then_interruptible([] {
            return seastar::now();
          });
        });
      },
      [](auto) {}, false
    ).get();
  });
}
Updated by Matan Breizman 11 months ago
- Status changed from In Progress to Resolved
Updated by Upkeep Bot 8 months ago
- Merge Commit set to 11ce348d9f062f78c72b82112d3ad758bf835ed7
- Fixed In set to v20.0.0-1559-g11ce348d9f0
- Upkeep Timestamp set to 2025-07-09T17:42:03+00:00
Updated by Upkeep Bot 8 months ago
- Fixed In changed from v20.0.0-1559-g11ce348d9f0 to v20.0.0-1559-g11ce348d9f
- Upkeep Timestamp changed from 2025-07-09T17:42:03+00:00 to 2025-07-14T17:42:40+00:00
Updated by Upkeep Bot 5 months ago
- Released In set to v20.2.0~533
- Upkeep Timestamp changed from 2025-07-14T17:42:40+00:00 to 2025-11-01T01:13:30+00:00