osd: calc_min_last_complete_ondisk() should use actingset #21508
dzafman wants to merge 1 commit into ceph:master
Allow pg log to trim because min_last_complete_ondisk can advance

Signed-off-by: David Zafman <dzafman@redhat.com>
       i != acting_recovery_backfill.end();
  assert(!actingset.empty());
  for (set<pg_shard_t>::iterator i = actingset.begin();
       i != actingset.end();
This looks wrong to me. We still need to include async_recovery_targets here, since they are basically doing recovery based on pg logs. Excluding them when calculating min_last_complete_ondisk can over-trim the pg logs those peers depend on and hence cause unexpected errors (e.g. #21580).
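To make the concern concrete, here is a minimal sketch of taking the minimum over a set of shards. It is not the Ceph implementation: `ShardId`, `Version`, and `min_last_complete_ondisk_over` are hypothetical stand-ins for `pg_shard_t`, `eversion_t`, and the real function, and the per-shard values are assumed to be available in a plain map.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <limits>
#include <map>
#include <set>

// Hypothetical simplified types; the real code uses pg_shard_t and eversion_t.
using ShardId = int;
using Version = std::uint64_t;

// Return the smallest last_complete_ondisk among the given shards.
// Any shard left out of `shards` cannot hold the result back.
Version min_last_complete_ondisk_over(
    const std::set<ShardId>& shards,
    const std::map<ShardId, Version>& last_complete_ondisk) {
  assert(!shards.empty());
  Version min_v = std::numeric_limits<Version>::max();
  for (ShardId s : shards) {
    auto it = last_complete_ondisk.find(s);
    assert(it != last_complete_ondisk.end());  // every shard must report a value
    min_v = std::min(min_v, it->second);
  }
  return min_v;
}
```

With values of 100, 95, and 40 for shards 2, 4, and 6, passing only shards 2 and 4 gives 95, while including shard 6 (standing in for an async recovery target) gives 40. If the log is trimmed up to the larger value, the entries the lagging shard still needs are gone.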
@jdurgin A test run including this pull request failed with:
src/osd/PGLog.cc: 169: FAILED assert(trim_to <= info.last_complete)
teuthology:/a/dzafman-2018-04-28_22:38:02-rados:thrash-wip-zafman-testing-distro-basic-smithi/2453135
On osd.2, the primary, trim_to is advanced regardless of osd.6 (in async recovery)'s last_complete:

Then osd.6 hits that assert, since the last_complete at that point is

The goal of this is to put a hard cap on the number of log entries (and thus memory consumed), even during recovery. This means the last_complete can no longer be a bound for log trimming - if we re-peer we just go into backfill instead at that point. This is better than running out of memory.

So it seems we'll need to adjust the pg log trimming + last_complete logic to accommodate this.
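The trade-off described above can be sketched roughly as follows. This is an illustration under simplifying assumptions, not Ceph's trimming logic: versions are plain integers with one log entry per version, and `capped_trim_to` and `needs_backfill` are hypothetical helper names.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical simplification: one log entry per integer version.
using Version = std::uint64_t;

// Choose a trim point so that at most max_entries remain in the log,
// regardless of any peer's last_complete.
Version capped_trim_to(Version log_tail, Version log_head,
                       std::uint64_t max_entries) {
  assert(log_tail <= log_head);
  std::uint64_t entries = log_head - log_tail;
  if (entries <= max_entries) {
    return log_tail;  // already under the cap, nothing to trim
  }
  return log_head - max_entries;  // keep only the newest max_entries
}

// A peer whose last_complete is older than the trim point can no longer
// catch up from the log; in this sketch it would need backfill instead.
bool needs_backfill(Version peer_last_complete, Version trim_to) {
  return peer_last_complete < trim_to;
}
```

For a log spanning versions 0 to 5000 with a cap of 3000, the trim point is 2000; a peer with last_complete at 1500 would then fall back to backfill, while one at 2500 could still recover from the log.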
See #21598
@jdurgin Reassigned http://tracker.ceph.com/issues/23979 to you
    Option("osd_max_pg_log_entries", Option::TYPE_UINT, Option::LEVEL_ADVANCED)
    .set_default(3000)
    .set_default(10000)
I thought an earlier commit (PR 20394) had lowered this from 10000 to 3000. Isn't the spread between min and max pg log entries the amount that has to be buffered by the OSD process? But perhaps I'm getting branches confused or something like that? For background on this, see this article (available to non-Red Hat people on request).
@bengland2 please note that this PR has been closed and we haven't made any of these changes.