Here's the end-user flow we'd like:
Through the console (or perhaps the CLI?) a user can view metrics for some category of information. For example: "show me the metrics for HTTP endpoint latency", "show me metrics for disk/network usage", etc.
Point to consider: "operator" usage vs "end-user" usage -- each may see different metrics. We will want a different set of ACLs, at bare minimum, even if the Nexus implementation is mechanically similar.
Open question: How many endpoints should there be? Which query parameters are exposed? What would be useful for the console?
This should trigger a request to the external Nexus API, which in turn should be able to query Clickhouse.
Presumably, Nexus will act as an ACL validator + proxy to Clickhouse. Hopefully not too much post-processing of data is necessary.
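To make the "ACL validator + proxy" idea concrete, here is a minimal sketch in Rust. Everything in it is an assumption for illustration: the `Role`, `MetricCategory`, `MetricsQuery`, and `MetricsBackend` names are hypothetical and are not existing Nexus or oximeter types, the query parameters are one possible answer to the open question above, and the rule that end users cannot see HTTP endpoint latency is only an example policy. The `InMemoryBackend` stands in for Clickhouse so the sketch does not depend on any real client API.

```rust
/// Hypothetical caller roles, following the "operator" vs "end-user" split.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Role {
    Operator,
    EndUser,
}

/// Hypothetical metric categories, taken from the examples in this issue.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum MetricCategory {
    HttpEndpointLatency,
    DiskUsage,
    NetworkUsage,
}

/// Hypothetical query parameters; the real set is still an open question.
pub struct MetricsQuery {
    pub category: MetricCategory,
    pub start_unix_secs: u64, // inclusive
    pub end_unix_secs: u64,   // exclusive
    pub limit: u32,
}

#[derive(Debug, Clone, PartialEq)]
pub struct Sample {
    pub timestamp_unix_secs: u64,
    pub value: f64,
}

#[derive(Debug, PartialEq)]
pub enum MetricsError {
    Forbidden { role: Role, category: MetricCategory },
    InvalidTimeRange,
}

/// Stand-in for whatever Nexus uses to query Clickhouse (assumed interface).
pub trait MetricsBackend {
    fn fetch(&self, query: &MetricsQuery) -> Vec<Sample>;
}

/// ACL step: which categories each role may read (example policy only).
pub fn allowed_categories(role: Role) -> &'static [MetricCategory] {
    match role {
        Role::Operator => &[
            MetricCategory::HttpEndpointLatency,
            MetricCategory::DiskUsage,
            MetricCategory::NetworkUsage,
        ],
        Role::EndUser => &[MetricCategory::DiskUsage, MetricCategory::NetworkUsage],
    }
}

/// Validate the request, then pass it through to the backend unchanged.
pub fn handle_metrics_request(
    role: Role,
    query: &MetricsQuery,
    backend: &dyn MetricsBackend,
) -> Result<Vec<Sample>, MetricsError> {
    if !allowed_categories(role).contains(&query.category) {
        return Err(MetricsError::Forbidden { role, category: query.category });
    }
    if query.start_unix_secs >= query.end_unix_secs {
        return Err(MetricsError::InvalidTimeRange);
    }
    // No post-processing: the proxy returns what the backend produced.
    Ok(backend.fetch(query))
}

/// In-memory backend used in place of Clickhouse for this sketch.
pub struct InMemoryBackend {
    pub rows: Vec<(MetricCategory, Sample)>,
}

impl MetricsBackend for InMemoryBackend {
    fn fetch(&self, query: &MetricsQuery) -> Vec<Sample> {
        self.rows
            .iter()
            .filter(|(category, sample)| {
                *category == query.category
                    && sample.timestamp_unix_secs >= query.start_unix_secs
                    && sample.timestamp_unix_secs < query.end_unix_secs
            })
            .map(|(_, sample)| sample.clone())
            .take(query.limit as usize)
            .collect()
    }
}
```

Keeping the ACL check in one function means the operator and end-user paths can share the same handler and differ only in policy, which matches the "mechanically similar" point above.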
What already exists:
There's machinery around oximeter to collect metrics from services, and store such information within Clickhouse itself. Although we should definitely add more metrics here (see: Upstairs disk stats -> Oximeter crucible#341 as an example), this half of the problem space is considered out-of-scope for this issue.
Since we already have HTTP endpoint latency wired up and dumped into Clickhouse, this may be an easy "first target". For utility, however, user-visible metrics (instance stats, disk/networking metrics, etc) will be high-value targets.