wlog: Optimized and refactored watcher code. #16182

Open: bwplotka wants to merge 2 commits into main from watcher-opt

Conversation

bwplotka (Member) commented Mar 7, 2025

This cleans up the watcher structure and tests. No functionality should change.

Changes:

  • Correctly reuse Ref slices. It seems they were accidentally shadowed variables, so the slices were reallocated instead of reused.
  • Add zeropools for even more performance; ensured the WriteTo interface remains compatible (it allows pooling).
  • Simplified the Watcher public methods.
  • Removed unnecessary leaked abstractions and test-only code from the production struct, e.g. starttime, max segment, read timeout, eofNonErr.
  • Made the series tailing/reading less confusing. TIL the watcher reads only from time.Now after a retry, and ONLY from the latest segment (: Shouldn't it read from the beginning?
  • Removed the obsolete queue_manager benchmark.

I noticed the need for this in #16046 (profiling suggested that watcher decoding is a significant overhead regardless of sample type).

@bwplotka bwplotka requested a review from bboreham March 7, 2025 09:07
@bwplotka bwplotka force-pushed the watcher-opt branch 2 times, most recently from d8d616f to 2ee34a1 Compare March 7, 2025 09:14
@bwplotka bwplotka marked this pull request as ready for review March 7, 2025 09:14
@bwplotka bwplotka requested a review from jesusvazquez as a code owner March 7, 2025 09:14
@bwplotka bwplotka marked this pull request as draft March 7, 2025 09:34
bboreham (Member)

"Shouldn't it read from the beginning?"

The beginning is hours ago, possibly tens of hours. See #8809 for another idea.

bboreham (Member) left a comment

Scary to change this code, but the shadowing is a great catch.
I haven't done a full review, just a quick initial scan.

@bwplotka bwplotka marked this pull request as ready for review March 10, 2025 13:50
bwplotka (Member, Author)

Thanks for the quick review @bboreham - I was fighting with the super confusing behaviour of LiveReader (the eofNonErr testing flag in production code...). I proposed a much cleaner approach (IMO) and addressed the initial comments too. Will work on the benchmark results now, but marking as ready for review.


@bwplotka bwplotka force-pushed the watcher-opt branch 3 times, most recently from 47cf03b to efd6289 Compare March 10, 2025 14:41
@bwplotka bwplotka force-pushed the watcher-opt branch 3 times, most recently from eca1e09 to c76a5e3 Compare March 10, 2025 16:00
bboreham added a commit to bboreham/prometheus that referenced this pull request Mar 10, 2025
The `:=` causes new variables to be created, which means the outer
slice stays at nil, and new memory is allocated every time round the
loop.

Extracted from prometheus#16182
Credit to @bwplotka.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
bboreham (Member)

I'm excited about the pessimisation you found - extracted #16197 in the hope that we can merge it faster than this 900-line PR.

bwplotka pushed a commit that referenced this pull request Mar 11, 2025
The `:=` causes new variables to be created, which means the outer
slice stays at nil, and new memory is allocated every time round the
loop.

Extracted from #16182
Credit to @bwplotka.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
Signed-off-by: bwplotka <bwplotka@gmail.com>

# Conflicts:
#	tsdb/wlog/watcher_test.go

# Conflicts:
#	tsdb/wlog/watcher.go
@bwplotka bwplotka force-pushed the watcher-opt branch 3 times, most recently from b96b8f8 to e325164 Compare March 11, 2025 12:12
bwplotka (Member, Author)

Should be ready for review @bboreham, PTAL.

machine424 pushed a commit to machine424/prometheus that referenced this pull request Mar 26, 2025
The `:=` causes new variables to be created, which means the outer
slice stays at nil, and new memory is allocated every time round the
loop.

Extracted from prometheus#16182
Credit to @bwplotka.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
@github-actions github-actions bot added the stale label May 16, 2025
bernot-dev added a commit to bernot-dev/prometheus that referenced this pull request Aug 12, 2025

As mentioned in prometheus#16182, the BenchmarkStartup test for the queue manager
covers an old API and uses settings that will not occur in production

Signed-off-by: Adam Bernot <bernot@google.com>
perebaj pushed a commit to perebaj/prometheus that referenced this pull request Aug 22, 2025
As mentioned in prometheus#16182, the BenchmarkStartup test for the queue manager
covers an old API and uses settings that will not occur in production

Signed-off-by: Adam Bernot <bernot@google.com>
Signed-off-by: perebaj <perebaj@gmail.com>
@bboreham bboreham self-assigned this Oct 28, 2025
bboreham (Member)

Hello from the bug-scrub! I will take another look.

@github-actions github-actions bot removed the stale label Oct 28, 2025
bwplotka (Member, Author) commented Nov 3, 2025

This needs a rebase, will do soon.

bboreham (Member)

I'm looking at this PR; I've done the rebase locally, if you are brave enough to let me push it.

bboreham (Member) left a comment

Generally fine, though I would like to resolve the TODOs before merging.
I didn't follow all the test changes; perhaps describe them more fully in the PR?

// We do this here rather than in the constructor because of the ordering of
// creating Queue Managers's, stopping them, and then starting new ones in
// storage/remote/storage.go ApplyConfig.
func (w *Watcher) initMetrics() {
bboreham (Member)

startMetrics in comment vs initMetrics in code


// readSegment reads all known records into w.writer from a segment.
// It returns the EOF error if the segment is corrupted or partially written.
func (w *Watcher) readSegment(r *LiveReader, startT int64, segmentNum int) (err error) {
bboreham (Member)

This might be clearer if named more like readSegmentAllData, since the counterpart is readSegmentSeries.

// One table per WAL segment means it won't grow indefinitely.
dec = record.NewDecoder(labels.NewSymbolTable())

// TODO(bwplotka): Consider zeropools.
bboreham (Member)

Pool should be worse, since we are already reusing memory and there is only ever zero or one slice of each type in use.

Comment on lines +583 to +586
if h.T > startT {
if !w.replayDone {
w.replayDone = true
w.logger.Info("Done replaying WAL", "duration", time.Since(timestamp.Time(startT)))
bboreham (Member)

Could this be a function, to avoid repeating it three times?

break
}
metadata, err = dec.Metadata(rec, metadata[:0])
meta, err := dec.Metadata(rec, metadata[:0])
bboreham (Member)

This is an accident, I think, reverting part of the optimisation from #16197.

Comment on lines +37 to +38
// TODO(bwplotka): Checking every 100ms feels too frequent. It might be enough
// to check on notify AND with emergency 15s read only.
bboreham (Member)

I agree that scanning the whole WAL directory every 100ms is too much.
However, this is the only way that watchSegment returns (except for errors), and hence the only way we move to the next segment. We cannot wait 15s for that.
Possibly we could extend Notify to include the information that there is a new segment?
Maybe we can observe that we got notified and no new data was in the file, so we should look at that point for a new segment?
