metadata: Move caching from scrapeCache.metadata to head.series #17619

@bwplotka

Proposal

As discussed on https://cloud-native.slack.com/archives/C06B8KJUU2Y/p1763451934229489 and in #17436, we would love to run an experiment where metadata is cached in memory only once, in memSeries in the head, instead of in the scrape caches.

Related to #17191

Rationales

  • While the type-and-unit-labels feature helps with metric identity and with storing type and unit (as labels), it is still experimental, it leaks into consumption (i.e. PromQL, for better or worse), and it does not support metric help.
  • Metadata is still used by the Metadata APIs and RW1. Users have expressed the need for similar use with RW2 too. For RW2 (and a generally better experience on consumption), we need metadata to be per series, not per metric family.
  • metadata-wal-records is expensive and is planned to be deprecated. type-and-unit-labels is one replacement, but it has limitations. There are no other good replacements apart from longer-term plans with Parquet.
  • We already have logic to store metadata in memSeries, and we already pay some cost for the extra *metadata.Metadata field. As of now, this logic is used only for metadata-wal-records.
  • Metadata caching was crucial for detecting changes for metadata-wal-records, but detecting them in the head is trivial if we store things in memSeries, especially with the unified appender refactor (appenderV2), see "[FINAL PREVIEW]: move to simpler and unified storage.AppenderV2 interface" #17610.

With all of this, even if metadata becomes persistent one day with Parquet, it would make sense to improve metadata caching. @pipiland2612 already explored part of this path in #17436 (using memSeries metadata, without moving fully), but there will be extra overhead unless we move the metadata fully from the scrape cache to the head.

Solution Details

To avoid races (records for a series exist in the WAL, but its metadata is not yet cached in memory), we likely need to store metadata when we create or get the memSeries in Append, without waiting for the commit, similar to how we store labels.
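To make the idea above concrete, here is a minimal sketch of attaching metadata at get-or-create time. It is not Prometheus code: `head`, `memSeries`, `Metadata`, and `getOrCreateWithMetadata` are hypothetical, heavily simplified stand-ins, and labels are flattened to a string for brevity. The returned bool shows how cheap change detection could be once metadata lives on the series.

```go
package main

import (
	"fmt"
	"sync"
)

// Metadata is a local stand-in for type, unit and help (assumption:
// redeclared here only so the sketch is self-contained).
type Metadata struct {
	Type, Unit, Help string
}

// memSeries is a hypothetical, simplified in-memory series.
type memSeries struct {
	mu   sync.Mutex
	ref  uint64
	lset string // Labels flattened to a string for brevity.
	meta Metadata
}

// head is a toy head that indexes series by their label string.
type head struct {
	mu      sync.Mutex
	nextRef uint64
	series  map[string]*memSeries
}

func newHead() *head { return &head{series: map[string]*memSeries{}} }

// getOrCreateWithMetadata returns the series for lset, creating it if needed.
// Metadata is stored right away (not at commit time), so any reader that can
// see the series can also see its metadata. The bool reports whether the
// stored metadata was newly set or changed.
func (h *head) getOrCreateWithMetadata(lset string, m Metadata) (*memSeries, bool) {
	h.mu.Lock()
	s, ok := h.series[lset]
	if !ok {
		h.nextRef++
		s = &memSeries{ref: h.nextRef, lset: lset, meta: m}
		h.series[lset] = s
		h.mu.Unlock()
		return s, true
	}
	h.mu.Unlock()

	// Existing series: compare and update under the per-series lock.
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.meta == m {
		return s, false
	}
	s.meta = m
	return s, true
}

func main() {
	h := newHead()
	m := Metadata{Type: "counter", Help: "Total HTTP requests."}

	_, changed := h.getOrCreateWithMetadata(`http_requests_total{code="200"}`, m)
	fmt.Println("first append, changed:", changed)

	_, changed = h.getOrCreateWithMetadata(`http_requests_total{code="200"}`, m)
	fmt.Println("same metadata, changed:", changed)

	m.Help = "Total number of HTTP requests."
	_, changed = h.getOrCreateWithMetadata(`http_requests_total{code="200"}`, m)
	fmt.Println("new help text, changed:", changed)
}
```

The per-series lock taken on every append is exactly the contention risk this proposal would need to measure.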

To derisk, this could be behind a feature flag.

In this scenario, the queue manager would use the head for metadata (and series!) lookups, with no need to decode any records. This would apply to agent mode too.
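A rough sketch of what such a lookup could look like from the sender's side. Everything here is an assumption for illustration: `headView`, `mapHead`, `sample`, `outgoing`, and `enrich` are hypothetical names, not the existing queue manager types.

```go
package main

import "fmt"

// Metadata is a local stand-in for type, unit and help.
type Metadata struct {
	Type, Unit, Help string
}

// sample is a hypothetical queued sample that only carries a series ref.
type sample struct {
	Ref uint64
	T   int64
	V   float64
}

// headView is a hypothetical read-only view of the head: given a series ref
// it returns labels and metadata, so no WAL record decoding is needed.
type headView interface {
	Lookup(ref uint64) (lset string, m Metadata, ok bool)
}

type entry struct {
	lset string
	meta Metadata
}

// mapHead is a toy headView backed by a plain map.
type mapHead map[uint64]entry

func (h mapHead) Lookup(ref uint64) (string, Metadata, bool) {
	e, ok := h[ref]
	return e.lset, e.meta, ok
}

// outgoing is a sample enriched with per-series labels and metadata,
// roughly what a per-series metadata payload would need.
type outgoing struct {
	Labels string
	Meta   Metadata
	T      int64
	V      float64
}

// enrich resolves every sample against the head. Samples whose series is
// unknown (for example, already garbage collected) are counted as dropped.
func enrich(h headView, batch []sample) (out []outgoing, dropped int) {
	for _, s := range batch {
		lset, m, ok := h.Lookup(s.Ref)
		if !ok {
			dropped++
			continue
		}
		out = append(out, outgoing{Labels: lset, Meta: m, T: s.T, V: s.V})
	}
	return out, dropped
}

func main() {
	h := mapHead{
		1: {lset: `up{job="node"}`, meta: Metadata{Type: "gauge", Help: "Target is up."}},
	}
	out, dropped := enrich(h, []sample{{Ref: 1, T: 1000, V: 1}, {Ref: 42, T: 1000, V: 0}})
	fmt.Println(len(out), dropped)
}
```

How to handle refs that the head no longer knows about is one of the open questions such a design would have to answer; dropping them here is only a placeholder.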

For the current metadata uses (API and RW1), we likely need to build an extra mapping, map[metricFamilyName]HeadSeriesRef, probably in the head, and store mfName on memSeries 🙃

  • For the Metadata API, the new flow will be faster: no need to talk to scrape targets.
  • For RW1, the same applies: no need for scrape targets, potentially straight to the head.
  • For the TargetMetadata API, it will be a bit more expensive, but perhaps acceptable, since it's not often used (?). We would use map[metricFamilyName]HeadSeriesRef for GetMetadata, or scrapeCache.series/scrapeCache.seriesCur for ListMetadata.
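The family mapping described above could be sketched roughly as follows. This is a toy model under stated assumptions: `head`, `memSeries`, `addSeries`, `metadataForFamily`, and `families` are hypothetical, locking is omitted, and the first series seen for a family is arbitrarily chosen as its representative. It also shows why mfName would need to be stored: a series name such as `rpc_duration_seconds_bucket` differs from its family name.

```go
package main

import (
	"fmt"
	"sort"
)

// HeadSeriesRef is a local stand-in for the head's series reference type.
type HeadSeriesRef uint64

// Metadata is a local stand-in for type, unit and help.
type Metadata struct {
	Type, Unit, Help string
}

// memSeries is a hypothetical series that stores its metric family name.
type memSeries struct {
	mfName string
	meta   Metadata
}

// head is a toy head with the proposed map[metricFamilyName]HeadSeriesRef.
// Not safe for concurrent use; series deletion is not handled.
type head struct {
	series   map[HeadSeriesRef]*memSeries
	byFamily map[string]HeadSeriesRef
}

func newHead() *head {
	return &head{
		series:   map[HeadSeriesRef]*memSeries{},
		byFamily: map[string]HeadSeriesRef{},
	}
}

// addSeries registers a series. The first series seen for a family becomes
// the representative used for per-family metadata lookups.
func (h *head) addSeries(ref HeadSeriesRef, mfName string, m Metadata) {
	h.series[ref] = &memSeries{mfName: mfName, meta: m}
	if _, ok := h.byFamily[mfName]; !ok {
		h.byFamily[mfName] = ref
	}
}

// metadataForFamily answers a per-family lookup without any scrape target.
func (h *head) metadataForFamily(mfName string) (Metadata, bool) {
	ref, ok := h.byFamily[mfName]
	if !ok {
		return Metadata{}, false
	}
	s, ok := h.series[ref]
	if !ok {
		return Metadata{}, false
	}
	return s.meta, true
}

// families lists all known metric family names in sorted order.
func (h *head) families() []string {
	names := make([]string, 0, len(h.byFamily))
	for name := range h.byFamily {
		names = append(names, name)
	}
	sort.Strings(names)
	return names
}

func main() {
	h := newHead()
	hist := Metadata{Type: "histogram", Unit: "seconds", Help: "RPC latency."}
	// Three series (_bucket, _sum, _count) belonging to one family.
	h.addSeries(1, "rpc_duration_seconds", hist)
	h.addSeries(2, "rpc_duration_seconds", hist)
	h.addSeries(3, "rpc_duration_seconds", hist)
	h.addSeries(4, "up", Metadata{Type: "gauge"})

	m, ok := h.metadataForFamily("rpc_duration_seconds")
	fmt.Println(m.Type, ok)
	fmt.Println(h.families())
}
```

A single representative ref per family keeps the map small, but it needs repointing when that series is removed; a real implementation would have to decide how to handle that.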

Long term (4.0?), metadata per mfName could be replaced with metadata per series, which would be a bit simpler.

Lots of unknowns, especially around metadata API use and how this affects lock contention for memSeries (is it already too heavily accessed?), but perhaps worth a try?

cc @krajorama @kgeckhart @bboreham
