
osd: calc_min_last_complete_ondisk() should use actingset#21508

Closed
dzafman wants to merge 1 commit into ceph:master from dzafman:wip-trello-fix

Conversation

Contributor

dzafman commented Apr 18, 2018

Allow pg log to trim because min_last_complete_ondisk can advance


Signed-off-by: David Zafman <dzafman@redhat.com>
```diff
-       i != acting_recovery_backfill.end();
+  assert(!actingset.empty());
+  for (set<pg_shard_t>::iterator i = actingset.begin();
+       i != actingset.end();
```
Member


This looks wrong to me. We still need to include async_recovery_targets here since they are basically doing recovery based on pg-logs. Excluding them when calculate min_last_complete_ondisk() can mean over-trimming of pg-logs of those peers and hence cause unexpected errors (e.g.: #21580).

Contributor Author

dzafman commented Apr 29, 2018

@jdurgin A test run including this pull request failed with:

src/osd/PGLog.cc: 169: FAILED assert(trim_to <= info.last_complete)

teuthology:/a/dzafman-2018-04-28_22:38:02-rados:thrash-wip-zafman-testing-distro-basic-smithi/2453135

Member

jdurgin commented May 1, 2018

on osd.2, the primary, trim_to is advanced regardless of osd.6 (in async recovery)'s last_complete:

```
2018-04-29 06:30:04.056 7f395d09e700 10 osd.2 pg_epoch: 56 pg[3.1( v 56'3104 (48'2404,56'3104] local-lis/les=55/56 n=656 ec=23/23 lis/c 55/23 les/c/f 56/24/0 54/55/55) [6,2,4]/[2,4] async=[6] r=0 lpr=55 pi=[23,55)/1 luod=56'3103 rops=10 crt=56'3104 lcod 56'3102 mlcod 56'3102 active+recovering+undersized+degraded+remapped mbc={255={(2+0)=159}}] calc_trim_to 51'2504 -> 51'2504

2018-04-29 06:30:04.056 7f395d09e700  1 -- 172.21.15.145:6814/13581 --> 172.21.15.89:6817/16610 -- osd_repop(client.4438.0:11995 3.1 e56/55 3:91c1d271:::smithi14518100-2665 oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo:head v 56'3105) v2 -- ?+788129 0x561e2c6bc000 con 0x561e2d759180
```

Then osd.6 hits that assert, since the last_complete at that point is lc 51'2453.

The goal of this change is to put a hard cap on the number of log entries (and thus the memory consumed), even during recovery. This means last_complete can no longer be a bound for log trimming; if we re-peer, we just go into backfill instead at that point. That is better than running out of memory.

So it seems we'll need to adjust the pg log trimming + last_complete logic to accommodate this.

@xiexingguo
Member

See #21598

@dzafman dzafman added DNM and removed needs-review labels May 3, 2018
Contributor Author

dzafman commented May 25, 2018

@jdurgin Reassigned http://tracker.ceph.com/issues/23979 to you

@dzafman dzafman closed this May 25, 2018

```diff
 Option("osd_max_pg_log_entries", Option::TYPE_UINT, Option::LEVEL_ADVANCED)
-.set_default(3000)
+.set_default(10000)
```
Contributor


I thought an earlier commit (PR 20394) had lowered this from 10000 to 3000. Isn't the spread between the min and max pg log entries the amount that has to be buffered by the OSD process? Or perhaps I'm confusing branches. For background on this, see this article (available to non-Red Hat people on request).

Member


@bengland2 please note that this PR has been closed and we haven't made any of these changes.
