Append metadata to the WAL in the scrape loop #10312
bwplotka merged 22 commits into prometheus:main
Conversation
cstyan
left a comment
I don't have much to say right now, looks relatively safe.
We should get a sanity check for the scrape code from @roidelapluie and for the TSDB code from @codesome.
tsdb/head_append.go
Outdated
    }

    hasNewMetadata := false
    s.Lock()
Should we potentially use the RLock() here and then only lock if we need to overwrite s.meta? Or potentially introduce a new metadataLock to memSeries?
I'm not really sure which is the best option, but personally I'd avoid adding extra fields to an already-large struct for each and every series.
Were you also thinking of something like 045ec58?
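For illustration, the RLock-then-upgrade idea could look roughly like this. This is a minimal sketch with stand-in types; the field and method names are illustrative and not the actual memSeries implementation:

```go
package main

import (
	"fmt"
	"sync"
)

// metadata is a stand-in for the metadata.Metadata struct.
type metadata struct {
	Type, Unit, Help string
}

// memSeries is a heavily simplified stand-in; the real struct carries many more fields.
type memSeries struct {
	mtx  sync.RWMutex
	meta metadata
}

// updateMetadata takes the read lock to compare first, and only upgrades to
// the write lock when the metadata actually changed. It reports whether a
// write happened.
func (s *memSeries) updateMetadata(m metadata) bool {
	s.mtx.RLock()
	unchanged := s.meta == m
	s.mtx.RUnlock()
	if unchanged {
		return false
	}
	s.mtx.Lock()
	defer s.mtx.Unlock()
	// Re-check: another goroutine may have written between RUnlock and Lock.
	if s.meta == m {
		return false
	}
	s.meta = m
	return true
}

func main() {
	s := &memSeries{}
	fmt.Println(s.updateMetadata(metadata{Type: "counter", Help: "A"})) // first write: true
	fmt.Println(s.updateMetadata(metadata{Type: "counter", Help: "A"})) // no change: false
}
```

The re-check after taking the write lock matters; without it, two goroutines racing past the RLock comparison could both write.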
@roidelapluie @bwplotka could you have a look at this? We Grafanistas thought it wouldn't be kosher to do this just among ourselves. @tpaschalis is currently working on getting the tests passing. Beyond that, it would be great to unblock this PR.
@roidelapluie do you have time to take a look at this?

I will take a look this week, hopefully today 🤞🏽
bwplotka
left a comment
Did a first general pass. Looks clean, but let's make sure we are not introducing any regression, even for those who do not want to append metadata to the WAL (should we add this in an opt-in manner? also, should it be a feature flag?).
scrape/scrape.go
Outdated
        break loop
    }

    sl.cache.metaMtx.Lock()
Hmmm, those locks are a bit scary here. I assume lock is here because we want the latest metadata across all sources of a single metric name, right? Which could be inconsistent during scrape. Or am I wrong?
Nevertheless, we are on the hot path of scraping which has to be efficient, so we have to ideally benchmark this path. WDYT? Ideally with cases of potential lock congestion. If there is regression, we could opt in for some less consistent framework for choosing what metadata should stay.
Also, for "error?" cases where process A is registering __name__=metric1 with help ABC and process B is registering __name__=metric1 with help ABC2, we will constantly update this metadata back and forth in the worst case, right?
Sorry @bwplotka for the super late reply.
Yes, I think we can go with a feature flag until we see its effects on larger-scale Prometheus deployments.
I assume lock is here because we want the latest metadata across all sources of a single metric name, right? Which could be inconsistent during scrape. Or am I wrong?
Yes, that's what we hold as metadata in the scrapeCache. As you said, this inconsistency will happen when two different scrape_configs share a common metric name and try to simultaneously update its metadata, right? I think that yes, in the worst case that two processes register different metadata in each scrape iteration, we will be recording all these updates.
so we have to ideally benchmark this path. WDYT? Ideally with cases of potential lock congestion
I'm building a benchmark to measure the effect and will come back to you. We could also consider using read locks as we're checking whether an update is needed.
Hey @bwplotka! About the metaMtx mutex in scrapeCache; if I'm getting this right, the scrapeCache is unique per scrapeLoop.
In that case, there will be no contention for this mutex between multiple scrapers, right? Each scraper will have its own lock to use when it encounters new metadata. The things that do run concurrently and might contend for the mutex are a) the methods used by the MetadataWatcher in the QueueManager and b) the MetadataMetricsCollector which reports metrics to the scrape manager. If those use cases weren't needed, we wouldn't even need metaMtx.
If that's true, does it relieve some of the concerns around these locks?
I've run some benchmarks (tpaschalis@177d95a) to get a feel of the effect of this on the scrape loop after rebasing with the current main (070e409).
The first benchmark is the existing BenchmarkScrapeLoopAppend. The second one changes metadata on every scrape iteration, while the third one both changes metadata on every iteration and tries to obtain the metadata mutex every millisecond (not a realistic scenario with the current use cases).
$ benchstat main pr10312
name old time/op new time/op delta
ScrapeLoopAppend-8 41.9µs ± 3% 43.4µs ± 6% +3.50% (p=0.000 n=19+20)
ScrapeLoopAppend_MetadataChangesAlways-8 47.6µs ± 2% 51.7µs ± 2% +8.71% (p=0.000 n=19+20)
ScrapeLoopAppend_MetadataChangesAlwaysContended-8 48.9µs ± 3% 52.7µs ± 1% +7.88% (p=0.000 n=19+17)
How do you feel about these benchmarks?
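For readers who want to reproduce a congestion measurement without the full scrape machinery, a generic microbenchmark along these lines can be used. This is an illustrative sketch, not the benchmarks from tpaschalis@177d95a: the hot loop stands in for the scrape path taking metaMtx, and the background goroutine stands in for a MetadataWatcher-style reader.

```go
package main

import (
	"fmt"
	"sync"
	"testing"
)

// BenchmarkMetaLock measures mutex-protected map writes while a background
// goroutine contends on the same lock with reads.
func BenchmarkMetaLock(b *testing.B) {
	var mtx sync.Mutex
	meta := map[string]string{}
	stop := make(chan struct{})
	go func() {
		for {
			select {
			case <-stop:
				return
			default:
				mtx.Lock()
				_ = meta["metric1"] // contending read
				mtx.Unlock()
			}
		}
	}()
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		mtx.Lock()
		meta["metric1"] = "help text" // hot-path write
		mtx.Unlock()
	}
	b.StopTimer()
	close(stop)
}

func main() {
	// testing.Benchmark lets the sketch run outside `go test`.
	res := testing.Benchmark(BenchmarkMetaLock)
	fmt.Println(res.N > 0)
}
```

Comparing this against a variant without the background goroutine gives a rough feel for how much of the delta is pure lock contention versus the extra map work.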
"if I'm getting this right, the scrapeCache is unique per scrapeLoop": this is how I read the code as well. The newLoop function, part of scrapePool, takes in the scrapeLoopOptions struct. If the cache field on that struct is nil, a new scrape cache is created. When a sync happens and the new scrape loops are started (some of which won't really have changed), the existing scrape loops are passed along where relevant.
As others have mentioned, it would be a good idea to put this behind a feature flag initially. Have a look at …

@cstyan @bwplotka I've put the feature behind a feature flag named …

The only thing that was clunky to put behind the feature flag was the new …

Just to be sure, I ran the benchmark again (with the flag disabled) to verify that there isn't any leak. Do you have any more comments at this stage? How would you prefer that we proceed?

I plan to review the tsdb changes sometime next week
codesome
left a comment
I have done some initial review for the tsdb. I am calling it initial because there are a bunch of things missing (only talking about tsdb/).
- We need unit tests for:
  - The ingestion of metadata with different error scenarios, checking that we are writing only changed metadata.
  - Making sure we are writing it into the WAL properly and are able to read it back from the WAL.
  - Checkpointing happening as expected (check my comment about checkpoints; maybe we need to change the logic).
- We are missing WAL replay for these records. Is it intentional? If it is, then every restart will add a whole lot of duplicate metadata records. If it is not, we will need to add it (including tests for that).
tsdb/wal/checkpoint.go
Outdated
    if err != nil {
        return nil, errors.Wrap(err, "decode metadata")
    }
    // Drop irrelevant metadata in place.
The implementation feels like we will keep all the historical metadata changes of a series as long as the series exists. Should we not keep just the last metadata of a series in this case?
Yeah, that's true. I thought this wouldn't have much impact as a) metadata isn't usually rapidly changing, and b) if the metadata does change then it might make sense to be able to correlate it with the values associated with a series (eg. the unit changing from seconds to milliseconds).
If we do want to keep the latest metadata entry only, we don't have any timestamps associated with them so I can't say how we'd go about it off the top of my head, but I'll take a look just in case we want to move to that direction.
It makes sense to keep a metadata entry around until all the samples after it have been removed from the checkpoint, up until the new metadata appears, for the reason you said: to make sense of the old data. But I am not sure how efficiently we can do that. One idea is to store the timestamp of the last sample seen when the metadata was updated.
If we do want to keep the latest metadata entry only, we don't have any timestamps associated with them so I can't say how we'd go about it off the top of my head, but I'll take a look just in case we want to move to that direction.
This one is relatively simple, but with one complexity. Similar to the keep(ref) method that we have, we could have another keepMeta(ref, meta), which returns true if the metadata matches the one currently present in the head block, i.e. the latest, and false otherwise. The problem here is that between 2 checkpoints a series might have metadata M1, then switch to M2, then switch back to M1, in which case we will end up keeping 2 copies of M1. We need to think about how to dedupe this. Maybe a timestamp of some sort will help identify the latest.
The WAL is written to in-order, right? We can assume that for a given ref ID, metadata entries are written in-order and we can keep only the latest metadata entry for each ref ID. We shouldn't need to include timestamps, IIUC.
Sorry for the late response. I think @rfratto's suggestion should work nicely.
One way to go about this would be to just keep track of the latest metadata for each series in the for r.Next() loop, and then flush them separately along the remaining records in the end. I'll try it out to see how it works in practice, but I also wanted to hear how it sounds to y'all.
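A minimal sketch of that keep-the-latest idea, relying only on the fact that WAL records are iterated in write order. The types and names below are stand-ins, not the actual checkpoint code:

```go
package main

import "fmt"

// refMetadata is a stand-in for record.RefMetadata.
type refMetadata struct {
	Ref  uint64
	Unit string
	Help string
}

// dedupeMetadata keeps only the last metadata record seen per series ref.
// Because the WAL is written in order, iterating front to back and
// overwriting the map entry leaves exactly the latest entry for each ref.
func dedupeMetadata(records []refMetadata, keep func(uint64) bool) []refMetadata {
	latest := make(map[uint64]refMetadata)
	var order []uint64 // first-seen order, for deterministic output
	for _, m := range records {
		if !keep(m.Ref) {
			continue // series no longer exists; drop its metadata
		}
		if _, ok := latest[m.Ref]; !ok {
			order = append(order, m.Ref)
		}
		latest[m.Ref] = m
	}
	out := make([]refMetadata, 0, len(latest))
	for _, ref := range order {
		out = append(out, latest[ref])
	}
	return out
}

func main() {
	recs := []refMetadata{
		{Ref: 1, Help: "old"},
		{Ref: 2, Help: "dropped"},
		{Ref: 1, Help: "new"},
	}
	keep := func(ref uint64) bool { return ref != 2 } // pretend series 2 was deleted
	fmt.Println(dedupeMetadata(recs, keep)) // only the latest entry for series 1 survives
}
```

This handles the M1 → M2 → M1 case without timestamps, since only the final occurrence is flushed.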
Edit: I've implemented the deduplication logic in 36cb2b6, along with a relevant test. Let me know how it looks to you; I'll try to see if I can bring some benchmarks.
With this, I think we might be able to avoid complicating the WAL storage format.
If we do not care about old metadata, then definitely, just keep the last one!
One of the use-cases brought up was keeping track of changing metadata. Do we not care about knowing what the series meant a month ago and only care about what it means now?
Hey @codesome thanks for all the comments!
Yes, true! I wanted some feedback on the general direction before starting, but I'll go ahead and add some tests to increase our level of confidence.
When I initially discussed this with @gotjosh, the idea was to split up the work into smaller PRs for easier review; that is
The work here is only concerned with the first part, so that's why Metadata records are just skipped over when reading. I see your point about adding duplicate records, but do you think this is an acceptable tradeoff, since this is behind a feature flag and wouldn't directly impact everyone? Or do you feel we should prioritize reading them into a …?

@tpaschalis Here the name and values would be …

Reading of the WAL on startup can be done in a separate PR as long as we are clear that it is easily readable into memory.

Regarding the checkpointing logic, we need to brainstorm more ideas to get rid of duplicates. It is possible that we need to modify the storage format for metadata depending on how checkpointing is to be done (for example, a timestamp with the metadata), so it should be done in this PR (or merge this PR and change the format before the next release, which is also acceptable, but I would not do that because the next release is planned near the end of June and I will be unable to review for the reasons stated later). I will think of ways in parallel, but note that I am on PTO from tomorrow until June 17ish (worst case, I am back on June 20).

While this is behind an experimental flag, I prefer not to make breaking changes to the formats when possible, because they are painful to handle and leave behind tech debt.
cc @bwplotka: as the co-maintainer of TSDB and until Ganesh is back from PTO, do you have any feedback on the proposed changes in the WAL format, the TSDB code, or the recent benchmarks?
cstyan
left a comment
Just some minor comments on top of what we talked about during our call today.
Hey team :) The newly proposed storage format is implemented in 208a4bb and seems to be working out. The encoding/decoding logic is simpler, and we're able to correctly skip over unknown fields without assuming anything about their size or being concerned about their ordering. Let me know what you think!
    const (
        UnknownMT      MetricType = 0
        Counter        MetricType = 1
        Gauge          MetricType = 2
        Histogram      MetricType = 3
        GaugeHistogram MetricType = 4
        Summary        MetricType = 5
        Info           MetricType = 6
        Stateset       MetricType = 7
    )
LGTM assuming these are the current metric types and how/where we want to encode them for source of truth. @codesome ?
Could you provide some more context on this please? I'm not sure what you meant 😕
I just happen to know that these are the types present in client_golang. (Edit: also the ones recognized by the scrape loop.)
FWIW, after a recent discussion with @cstyan, I've run a benchmark to verify the growth in WAL size and replay duration by quickly patching in support for reading back the metadata records in tpaschalis@abba445. I hope this is helpful for understanding the worst case when someone opts in to the feature.

The benchmark is more on the heavyweight, worst-case-ceiling side; it simulates adding 50k records like the following in the scrape loop's appender …

What I observed was …
codesome
left a comment
Nice work! I believe tests for appending metadata and verifying it in the WAL and its replay will follow this PR, so I won't block this PR on that. There are only a couple of things that I would like to see done in this PR: fixing the append semantics, and optimising memory usage for the checkpointing.
I have only reviewed the TSDB-related stuff (tsdb/*, storage/*) and haven't reviewed the scraping or any other part.
    repl := 0
    for _, m := range metadata {
        if keep(m.Ref) {
            latestMetadataMap[m.Ref] = m
I am a little unsure about doing it this way. Since there is a few % of extra memory usage in general with metadata, holding this map will increase the peak memory required a bit more.
What I would suggest is having another keep method, let's call it keepMetadata(ref, metadata). Just like keep() checks if the series exists, keepMetadata(..) will check if the series exists, and if it does, whether the metadata matches the one present in memory. That way we don't have to keep this map around.
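A sketch of the suggested keepMetadata filter, with a stand-in head type holding the latest in-memory metadata per series (the names here are illustrative, not the actual Prometheus API):

```go
package main

import "fmt"

// metadata is a stand-in for the metadata.Metadata struct.
type metadata struct{ Type, Unit, Help string }

// head is a stand-in for the in-memory head block.
type head struct {
	series map[uint64]metadata // latest metadata per live series
}

// keepMetadata mirrors the suggested checkpoint filter: keep a metadata
// record only if the series is still live AND the record matches the latest
// metadata held in memory. No separate dedup map is needed.
func (h *head) keepMetadata(ref uint64, m metadata) bool {
	cur, ok := h.series[ref]
	return ok && cur == m
}

func main() {
	h := &head{series: map[uint64]metadata{7: {Type: "counter", Help: "current"}}}
	fmt.Println(h.keepMetadata(7, metadata{Type: "counter", Help: "current"})) // true: latest
	fmt.Println(h.keepMetadata(7, metadata{Type: "counter", Help: "stale"}))   // false: outdated
	fmt.Println(h.keepMetadata(9, metadata{}))                                 // false: series gone
}
```

The tradeoff versus the map-based approach: this avoids the peak-memory cost of latestMetadataMap, at the price of one lookup against the head per record.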
tsdb/head.go
Outdated
    ref  chunks.HeadSeriesRef
    lset labels.Labels
    meta metadata.Metadata
A note for the future (no action required right now): does this have any noticeable memory impact when the feature is disabled? Should it be a pointer instead?
By my quick count this struct was previously 22 8-byte words, and adding meta would take it to 28 (+ 3 strings).
For 10 million series this adds 60MB, typically 120MB in memory usage before GC.
(Not counting the contents of the strings, which I hope are shared across all series with the same values)
which I hope are shared across all series with the same values
@tpaschalis is this the case? The only things that I see can be shared are the type and unit.
Sorry, I went from your question "when feature is disabled" to the values, which obviously only exist when the feature is enabled. However, it's valid to ask about the memory impact in that case too.
No, the UNIT and HELP strings are not currently shared between series.
Should we look into interning them? We have a PR for being able to share the interner used by remote_write with the scrape loop, but admittedly it has made little progress.
(of course, any comments to the other PR or ideas here are more than welcome 😄 )
Just to clarify: where the same metric appears many times in a single scrape (example), the help string will only be parsed once; are you sure that value isn't shared?
Interning will help further, sharing across different scrapes, but has some concurrency implications.
Oh I see what you mean. We aren't changing anything about the way the HELP or UNIT strings are parsed; they are indeed shared between all series with the same metric name, but not across scrapes yet.
    if hasNewMetadata {
        a.metadata = append(a.metadata, record.RefMetadata{
            Ref:  s.ref,
            Type: record.GetMetricType(meta.Type),
Feels weird to me that we do a text->number conversion here inside tsdb.
Why not do it at scrape time where we do all the other text parsing?
Thanks for bringing this up. This is relevant to your next comment as well.
The enum-like MetricType was added as an encoding/decoding-specific optimisation after this suggestion, and it didn't occur to me to make this conversion in the scrape loop.
One drawback I see is that textparse.MetricType is shared between many structs, so there would be a bigger change to review around the scrape loop.
For example, the current MetricType implementation of how metadata arrives at remote_write would either need an extra conversion, or it would need to propagate the enum type correctly, resulting in even more changes.
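For illustration, doing the text-to-enum conversion at scrape time could look roughly like this. The constants mirror the enum quoted earlier; the function name and the exact string values are assumptions, not the PR's actual code:

```go
package main

import "fmt"

// MetricType mirrors the enum from the record package shown above.
type MetricType uint8

const (
	UnknownMT MetricType = iota
	Counter
	Gauge
	Histogram
	GaugeHistogram
	Summary
	Info
	Stateset
)

// metricTypeFromText maps the textparse string form to the WAL enum once,
// at scrape time, so tsdb never has to do text parsing on the hot path.
var metricTypeFromText = map[string]MetricType{
	"counter":        Counter,
	"gauge":          Gauge,
	"histogram":      Histogram,
	"gaugehistogram": GaugeHistogram,
	"summary":        Summary,
	"info":           Info,
	"stateset":       Stateset,
}

// getMetricType falls back to UnknownMT for anything it doesn't recognize,
// so unknown inputs never fail the append.
func getMetricType(t string) MetricType {
	if mt, ok := metricTypeFromText[t]; ok {
		return mt
	}
	return UnknownMT
}

func main() {
	fmt.Println(getMetricType("counter"), getMetricType("bogus")) // 1 0
}
```

With this, the tsdb side would only ever see the numeric form, and the string form stays confined to the scrape and remote_write layers.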
    // Metadata stores a series' metadata information.
    type Metadata struct {
        Type textparse.MetricType
Why is this string-like rather than enum-like?
I can see it was a string previously, but now you are introducing an enum it seems unnatural to have to convert it back to a string to populate this field.
Tied to the previous comment; mainly for backwards compatibility and not having to touch more files in an already-big PR.
I can produce a commit showing how using the enum here would look and get back to you for a second round.
When we're setting metadata entries in the scrapeCache, we're using the p.Help(), p.Unit(), p.Type() helpers, which retrieve the series name and use it as the cache key. When checking for cache entries, though, we used p.Series() as the key, which included the metric name _with_ its labels. That meant that we were never actually hitting the cache. We're fixing this by utilizing the __name__ internal label for correctly getting the cache entries after they've been set by setHelp(), setType() or setUnit(). Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>
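The fix described in that commit message boils down to keying the lookup on the __name__ label rather than the full series string. A toy sketch with a heavily simplified cache type (the real scrapeCache stores full metadata entries and uses labels.Labels):

```go
package main

import "fmt"

// labels is a simplified stand-in for labels.Labels.
type labels map[string]string

// scrapeCache is heavily simplified: the key point is that metadata entries
// are keyed by metric name, which is how setHelp()/setType()/setUnit() store them.
type scrapeCache struct {
	metadata map[string]string // metric name -> help text
}

// getMetadata looks the entry up by the __name__ label rather than the full
// series string (name plus labels), so lookups hit what the setters stored.
func (c *scrapeCache) getMetadata(lset labels) (string, bool) {
	help, ok := c.metadata[lset["__name__"]]
	return help, ok
}

func main() {
	c := &scrapeCache{metadata: map[string]string{"http_requests_total": "Total requests."}}
	lset := labels{"__name__": "http_requests_total", "job": "api"}
	help, ok := c.getMetadata(lset)
	fmt.Println(help, ok)
}
```

Keying on the full series string would miss for every labeled series, which is exactly the cache-never-hit bug the commit fixes.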
Just saw this was merged, congrats and great work.
Signed-off-by: Paschalis Tsilias paschalist0@gmail.com
Changes
We've recently merged support for Metadata in the TSDB on d1122e0.
This PR uses a feature flag for enabling the scrape loop to detect changes in metadata in both new and old(cached) series, and pass them along to the MetadataUpdater interface. The metadata we're passing along will be stored in the headAppender and encoded in the WAL.
This PR updates Prometheus to store metadata in its Write-Ahead Log. To achieve this, a new MetadataAppender interface is introduced and embedded in Appender. Its method implementation is used to store a RefMetadata slice as a headAppender field, which is then encoded in the WAL according to the proposed format. The scrape loop is also modified to detect changes in metadata in both new and old (cached) series and pass them along.
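A rough sketch of the shape this takes. The types below are simplified stand-ins; the actual interface and record definitions live in storage/ and tsdb/record and may differ in detail:

```go
package main

import "fmt"

// Metadata mirrors the metadata.Metadata struct described above (simplified).
type Metadata struct {
	Type string
	Unit string
	Help string
}

// MetadataUpdater is a sketch of the interface the scrape loop calls into;
// the real method signature may differ.
type MetadataUpdater interface {
	UpdateMetadata(ref uint64, m Metadata) (uint64, error)
}

// refMetadata pairs a series ref with its metadata, like record.RefMetadata.
type refMetadata struct {
	Ref uint64
	Metadata
}

// headAppender buffers metadata updates alongside samples until Commit,
// when they would be encoded as a metadata record in the WAL.
type headAppender struct {
	metadata []refMetadata
}

func (a *headAppender) UpdateMetadata(ref uint64, m Metadata) (uint64, error) {
	a.metadata = append(a.metadata, refMetadata{Ref: ref, Metadata: m})
	return ref, nil
}

func main() {
	a := &headAppender{}
	var app MetadataUpdater = a
	app.UpdateMetadata(1, Metadata{Type: "counter", Help: "Total requests."})
	fmt.Println(len(a.metadata)) // 1
}
```

The scrape loop only calls UpdateMetadata when its cache detects a change, so unchanged metadata never reaches the WAL.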
Prior Art
This is an attempt at a revival of #7771 and other discussions around storing and utilizing Metadata throughout Prometheus.
Unlike the reference work, this only deals with appending metadata to the WAL, and aims to start a bigger discussion around how we can make Metadata more useful throughout Prometheus.
As such, reading from the WAL and sending it over to remote_write is not in the scope of this PR and will be handled in a future one. Required changes to the agent mode are also not currently present.
Older Mailing list discussion: https://groups.google.com/g/prometheus-developers/c/w-eotGenBGg/m/UZWdxzcBBwAJ
Older Design doc: https://docs.google.com/document/d/1LY8Im8UyIBn8e3LJ2jB-MoajXkfAqW2eKzY735aYxqo/edit#heading=h.bik9uwphqy3g
Notes
After I get some feedback on the overall idea, I will extend the test suite to cover more cases and increase overall confidence in how the feature works and interacts with different pieces of Prometheus.

We've merged the TSDB code parts, with tests covering the encoding/decoding functionality and checkpointing, verifying the format of the data on disk as well as in memory during writing and after a replay.
I'll be adding more tests to gain more confidence in the way that the scrape loop handles the updating of metadata.