admission: better observability of slot adjustment behavior #92673

@sumeerbhola

There are two things that affect slot adjustment:

  • The interval of CPULoad ticks: this is either 1ms or 250ms. We used to log every change in this interval, but stopped because it was spamming the logs. An alternative would be to maintain two cumulative duration metrics, one per regime, each recording the total time spent at that tick interval.
  • Interaction between increments and decrements: we have observed slow changes in the total slots in the past (e.g. https://cockroachlabs.slack.com/archives/C01SRKWGHG8/p1667511459730139?thread_ts=1667418886.450229&cid=C01SRKWGHG8) and were unsure whether this was because we were both increasing and decreasing, or because increases were rare. Two counters, one for the number of increments and one for the number of decrements, would tell us which of these happened.

Jira issue: CRDB-21914

Labels: A-admission-control, C-enhancement