
kvserver: export remaining snapshot bytes  #85528

@kvoli

Summary
Snapshots vary in size, so it would be beneficial to track the total remaining snapshot bytes that are queued or in progress on a store's receiver snapshot semaphore, as well as the remaining bytes that are queued on a store's sender snapshot semaphore.

Note that we already track current reservations in bytes via capacity.reserved, which is the total size of the snapshot(s) currently being processed on a store.

Solution

The solution is to add four additional exported metrics, with the last two optional and pending how useful they are:

  1. range.snapshots.queued-rcvd: a gauge tracking the sum of all snapshot bytes that are currently queued on a store's receive queue but have not yet acquired a reservation (i.e. have not begun processing).
  2. range.snapshots.queued-send: a gauge tracking the sum of all snapshot bytes that are currently queued on a store's send queue but have not yet acquired a reservation (i.e. have not begun processing).
  3. range.snapshots.pending-rcvd: a gauge tracking the sum of all snapshot bytes that remain on a store's receiving side, for snapshots that have acquired a reservation. This could be updated more frequently, to track the "remaining bytes" i.e. reservation - processed.
  4. range.snapshots.pending-send: a gauge tracking the sum of all snapshot bytes that remain on a store's sending side, for snapshots that have acquired a reservation. Similar to above, this is tracking the remaining bytes to be sent.
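To make the lifecycle behind the four gauges concrete, here is a minimal sketch of the receive side, assuming a snapshot moves from queued, to reserved, to finished. The types `byteGauge`, `snapshotByteMetrics` and `receiveTracker` are hypothetical names invented for illustration and are not CockroachDB's actual metric or kvserver APIs; the send side would presumably mirror the same accounting.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// byteGauge is a hypothetical stand-in for an exported gauge metric.
type byteGauge struct{ v int64 }

func (g *byteGauge) Add(delta int64) { atomic.AddInt64(&g.v, delta) }
func (g *byteGauge) Value() int64    { return atomic.LoadInt64(&g.v) }

// snapshotByteMetrics holds the four proposed gauges for one store.
type snapshotByteMetrics struct {
	QueuedRcvd  byteGauge // range.snapshots.queued-rcvd
	QueuedSend  byteGauge // range.snapshots.queued-send
	PendingRcvd byteGauge // range.snapshots.pending-rcvd
	PendingSend byteGauge // range.snapshots.pending-send
}

// receiveTracker follows one incoming snapshot through its lifecycle.
// It is assumed to be used by a single goroutine; only the gauges are shared.
type receiveTracker struct {
	m         *snapshotByteMetrics
	size      int64 // total snapshot size in bytes
	remaining int64 // reservation - processed
	reserved  bool
	done      bool
}

// enqueueReceive records a snapshot waiting on the receive queue.
func (m *snapshotByteMetrics) enqueueReceive(size int64) *receiveTracker {
	m.QueuedRcvd.Add(size)
	return &receiveTracker{m: m, size: size}
}

// reserve moves the snapshot's bytes from "queued" to "pending".
func (t *receiveTracker) reserve() {
	if t.reserved || t.done {
		return
	}
	t.m.QueuedRcvd.Add(-t.size)
	t.m.PendingRcvd.Add(t.size)
	t.remaining = t.size
	t.reserved = true
}

// processed records n received bytes, shrinking the pending gauge.
func (t *receiveTracker) processed(n int64) {
	if !t.reserved || t.done {
		return
	}
	if n > t.remaining {
		n = t.remaining
	}
	t.remaining -= n
	t.m.PendingRcvd.Add(-n)
}

// finish releases whatever the snapshot still contributes to either gauge,
// covering both successful completion and cancellation while queued.
func (t *receiveTracker) finish() {
	if t.done {
		return
	}
	if t.reserved {
		t.m.PendingRcvd.Add(-t.remaining)
		t.remaining = 0
	} else {
		t.m.QueuedRcvd.Add(-t.size)
	}
	t.done = true
}

func main() {
	m := &snapshotByteMetrics{}
	a := m.enqueueReceive(512 << 20)
	b := m.enqueueReceive(64 << 20)
	a.reserve()
	a.processed(128 << 20)
	fmt.Println(m.QueuedRcvd.Value(), m.PendingRcvd.Value()) // 67108864 402653184
	a.finish()
	b.finish()
	fmt.Println(m.QueuedRcvd.Value(), m.PendingRcvd.Value()) // 0 0
}
```

In this sketch the pending gauge is decremented on every `processed` call, which corresponds to the "updated more frequently" variant of (3). A coarser variant could decrement only in `finish`.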

Context

(3) and (4) may not provide much material benefit, as snapshots should in most cases be processed in under 16 seconds (512 MB / 32 MB/s), while the default metric update interval is 10 seconds. In cases where the snapshot rate is set lower, they may provide utility; however, the existing capacity.reserved metric, which tracks the total (unprocessed + processed) in-progress snapshot bytes, may be more appropriate. This issue leaves them as optional.

Related PR, for count rather than bytes: #84947

cc @AlexTalks

Jira issue: CRDB-18293

Labels

  A-kv-distribution: Relating to rebalancing and leasing.
  C-enhancement: Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception)
  T-kv: KV Team
