Skip to content

Enhanced WAL replay for duplicate series record#7438

Merged
codesome merged 1 commit intoprometheus:mainfrom
codesome:wal-mmap-replay
Aug 3, 2021
Merged

Enhanced WAL replay for duplicate series record#7438
codesome merged 1 commit intoprometheus:mainfrom
codesome:wal-mmap-replay

Conversation

@codesome
Copy link
Member

@codesome codesome commented Jun 23, 2020

In this PR I fix a TODO I had put before // TODO(codesome) Discard old samples and mmapped chunks and use mmap chunks for the new series ID.

Assertion:
Basically, if there is a duplicate series record in the WAL, it means that the compaction happened in between and the series including all it's samples were compacted before this duplicate record. Hence, we can discard all that we have replayed till now for that series when we come across a duplicate record.

@codesome
Copy link
Member Author

I have simplified it further now compared to what it was in #7229 on a simple assertion mentioned in the PR description. Earlier I was comparing the m-mapped chunks of new series id with the old series id, which made it complex. Now we just keep m-mapped chunks for the series id in the latest record.

@roidelapluie
Copy link
Member

This now has conflicts

@krasi-georgiev krasi-georgiev self-assigned this Aug 10, 2020
@codesome
Copy link
Member Author

I think I will be taking care of this in #7229, but will keep this open till then in case I don't do it there. So, no need to review this PR right now. (I will be getting to snapshotting chunks in a few weeks, caught up with some other stuff)

@codesome
Copy link
Member Author

I take back my above comment. Let's do it in this PR itself to keep the review of snapshot PR simpler. I have rebased this and is ready for another review.

@krasi-georgiev krasi-georgiev removed their assignment Sep 17, 2020
@roidelapluie
Copy link
Member

Hello @bwplotka Are you interested in reviewing this? Thanks!

Copy link
Member

@bwplotka bwplotka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some comments.

Copy link
Member

@bwplotka bwplotka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some pair review comments (:

@codesome
Copy link
Member Author

With some more ordering assumptions that we can make, the code is much simpler now and easy to understand. I have push the changes.

@codesome
Copy link
Member Author

But it has some races :( I will fix them

@stale stale bot added the stale label Dec 20, 2020
@stale stale bot removed the stale label Jan 22, 2021
@codesome codesome force-pushed the wal-mmap-replay branch 2 times, most recently from 2483891 to fc88ed1 Compare January 23, 2021 11:21
@codesome
Copy link
Member Author

Race fixed. Ready for review again.

Base automatically changed from master to main February 23, 2021 19:36
@stale stale bot added the stale label Apr 26, 2021
@stale stale bot removed the stale label Aug 2, 2021
@codesome
Copy link
Member Author

codesome commented Aug 2, 2021

I plan to continue the work on snapshots on shutdown and this PR is a prerequisite for that. I have rebased this PR to remove the conflicts. It is ready for another review now.

@codesome codesome requested review from bwplotka and removed request for krasi-georgiev August 2, 2021 12:53
Copy link
Member

@bwplotka bwplotka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks good, just minor nits and one question. Otherwise LGTM!

tsdb/head.go Outdated
// could cause race below. So we wait for the goroutine to empty input the buffer.
idx := mSeries.ref % uint64(n)
inputs[idx] <- []record.RefSample{} // To ensure that all the old samples are processed.
for len(inputs[idx]) != 0 {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is first time I see this pattern of checking len on channel, nice. Isn't that racy, when writer is not fast enough to give elements to input[idx] chan and buffer is read faster than writer writes?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The writer is giving elements to input[idx] in the same loop, and it is serial to this line. So when we are here, there is nothing giving elements to any channels in input slice. So there won't be any read-write race, there is only reading.

@codesome
Copy link
Member Author

codesome commented Aug 3, 2021

I will wait for #9147 to be merged before this.

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
@codesome codesome merged commit 848cb5a into prometheus:main Aug 3, 2021
austince pushed a commit to austince/prometheus that referenced this pull request Aug 4, 2021
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
@bboreham bboreham mentioned this pull request Aug 7, 2021
Comment on lines +221 to +222
inputs[idx] <- []record.RefSample{}
for len(inputs[idx]) != 0 {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Both of these lines can deadlock; see #9169 for one case.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants