add prometheus #403

Merged
marsishandsome merged 6 commits into master from send-metrics-to-Prometheus
Jan 28, 2020

Conversation

Contributor

@marsishandsome marsishandsome commented Jan 27, 2020

what this PR does

  1. add prometheus-cpp as a third-party library
  2. start a thread that sends data from the system.metrics, system.events and system.asynchronous_metrics tables to Prometheus
  3. support both pull mode and push mode

pull mode

add the following config to enable pull mode

[status]
# Prometheus metrics port; leaving it empty disables Prometheus pull.
metrics_port = 8234

push mode

add the following config to enable push mode

[status]
# Prometheus pushgateway address; leaving it empty disables Prometheus push.
metrics_addr = "pushgateway:9091"

# Prometheus client push interval in seconds, min=5, max=120, default=15.
#metrics_interval = 15

how to add new metrics

#include <chrono>
#include <thread>

#include <prometheus/gauge.h>
#include "MetricsPrometheus.h"

// Register a gauge family in the registry returned by DB::MetricsPrometheus::getRegistry().
auto& gauge_family = prometheus::BuildGauge()
        .Name("time_running_seconds_total")
        .Help("How many seconds is this server running?")
        .Labels({{"label", "value"}})
        .Register(*DB::MetricsPrometheus::getRegistry());

// Each distinct label set added to the family is a separate time series.
auto& gauge_metric_1 = gauge_family.Add(
        {{"another_label", "value1"}, {"yet_another_label", "value1"}});

auto& gauge_metric_2 = gauge_family.Add(
        {{"another_label", "value2"}, {"yet_another_label", "value2"}});

// Inside a function (for example a background thread body), update the gauges once per second.
while (true)
{
    std::this_thread::sleep_for(std::chrono::seconds(1));
    gauge_metric_1.Increment(1);
    gauge_metric_2.Increment(1);
}

how to get tiflash metrics (pull mode)

http://127.0.0.1:8234/metrics

# HELP tiflash_system_profile_events_Query Get from system.metrics, system.events and system.asynchronous_metrics tables
# TYPE tiflash_system_profile_events_Query gauge
tiflash_system_profile_events_Query 0.000000
# HELP tiflash_system_profile_events_SelectQuery Get from system.metrics, system.events and system.asynchronous_metrics tables
# TYPE tiflash_system_profile_events_SelectQuery gauge
tiflash_system_profile_events_SelectQuery 0.000000
# HELP tiflash_system_profile_events_InsertQuery Get from system.metrics, system.events and system.asynchronous_metrics tables
# TYPE tiflash_system_profile_events_InsertQuery gauge
tiflash_system_profile_events_InsertQuery 0.000000
# HELP tiflash_system_profile_events_DeleteQuery Get from system.metrics, system.events and system.asynchronous_metrics tables
# TYPE tiflash_system_profile_events_DeleteQuery gauge
tiflash_system_profile_events_DeleteQuery 0.000000
# HELP tiflash_system_profile_events_FileOpen Get from system.metrics, system.events and system.asynchronous_metrics tables
# TYPE tiflash_system_profile_events_FileOpen gauge
tiflash_system_profile_events_FileOpen 0.000000
# HELP tiflash_system_profile_events_FileOpenFailed Get from system.metrics, system.events and system.asynchronous_metrics tables
# TYPE tiflash_system_profile_events_FileOpenFailed gauge
tiflash_system_profile_events_FileOpenFailed 0.000000
# HELP tiflash_system_profile_events_Seek Get from system.metrics, system.events and system.asynchronous_metrics tables
# TYPE tiflash_system_profile_events_Seek gauge
tiflash_system_profile_events_Seek 0.000000
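
For readers unfamiliar with the output above, here is a minimal sketch of how one gauge sample maps to the three-line text layout (HELP, TYPE, value). The function name renderGauge is hypothetical and is not part of this PR; in the PR itself the rendering is presumably done by prometheus-cpp.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper (not from this PR): renders one gauge sample in the
// Prometheus text layout shown in the /metrics output above.
std::string renderGauge(const std::string & name, const std::string & help, double value)
{
    char buf[64];
    // "%f" prints six decimals, which matches samples such as 0.000000.
    // snprintf truncates safely if the value does not fit in the buffer.
    std::snprintf(buf, sizeof(buf), "%f", value);

    return "# HELP " + name + " " + help + "\n"
         + "# TYPE " + name + " gauge\n"
         + name + " " + buf + "\n";
}
```

Calling renderGauge with the name tiflash_system_profile_events_Query, the help text from the output above and the value 0.0 produces the first three lines of that output.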

how to start prometheus

1. git clone git@github.com:marsishandsome/tiflash-docker-compose.git
2. cd tiflash-docker-compose
3. edit config/prometheus.yml, replace `marsishandsome.local` with your local hostname
4. docker-compose up -d

Grafana
http://127.0.0.1:3000/

Prometheus
http://127.0.0.1:9090/

Pushgateway
http://127.0.0.1:9091/

get tiflash metrics on Prometheus

image

get tiflash metrics on Grafana

image

@marsishandsome
Contributor Author

/build

@marsishandsome marsishandsome force-pushed the send-metrics-to-Prometheus branch from 53c1a87 to 1666b2c Compare January 27, 2020 00:41
@marsishandsome
Contributor Author

/run-integration-tests

@marsishandsome marsishandsome changed the title from "Send metrics to prometheus" to "add prometheus" Jan 27, 2020
@marsishandsome marsishandsome force-pushed the send-metrics-to-Prometheus branch 2 times, most recently from ff74c62 to 3e34604 Compare January 27, 2020 07:01
@marsishandsome marsishandsome force-pushed the send-metrics-to-Prometheus branch from 3e34604 to ad0e8d3 Compare January 27, 2020 07:04
@marsishandsome
Contributor Author

/run-integration-tests

@ilovesoup ilovesoup requested review from flowbehappy, solotzg and zanmato1984 and removed request for zanmato1984 January 27, 2020 08:45

auto pos = metricsAddr.find(':', 0);
auto host = metricsAddr.substr(0, pos);
auto port = metricsAddr.substr(pos + 1, metricsAddr.size());
Contributor

Does a malformed address crash here without any useful information?

Contributor Author

fixed
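
To illustrate the concern in this thread: when the address contains no ':', find returns std::string::npos, pos + 1 wraps around to 0, and both host and port end up holding the whole string. The sketch below shows one way to reject such input up front. The function parseHostPort is hypothetical and is not the fix that was actually pushed; it also assumes a plain host:port form and does not handle IPv6 literals.

```cpp
#include <string>

// Hypothetical helper, not the code merged in this PR.
// Splits "host:port" into its parts; returns false on any malformed input
// so the caller can log a clear error instead of failing later.
bool parseHostPort(const std::string & addr, std::string & host, int & port)
{
    const std::string::size_type pos = addr.find(':');
    // Reject a missing colon, an empty host, and an empty port.
    if (pos == std::string::npos || pos == 0 || pos + 1 == addr.size())
        return false;

    int value = 0;
    for (std::string::size_type i = pos + 1; i < addr.size(); ++i)
    {
        const char c = addr[i];
        if (c < '0' || c > '9')
            return false; // non-digit character in the port
        value = value * 10 + (c - '0');
        if (value > 65535)
            return false; // beyond the valid TCP port range
    }
    if (value == 0)
        return false; // port 0 is not a usable pushgateway port

    host = addr.substr(0, pos);
    port = value;
    return true;
}
```

With the example config value "pushgateway:9091" this yields host "pushgateway" and port 9091; inputs such as "pushgateway" or "pushgateway:" return false.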

if (metricsInterval > 0 && registry != nullptr)
    break;

if (cond.wait_until(lock, get_next_time(5), [this] { return quit; }))
Contributor

Not sure if a Poco::Timer suits you better here? At least it is simpler.
https://pocoproject.org/docs/Poco.Util.Timer.html

Contributor Author

fixed
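
For context on the pattern under review, here is a standard-library-only sketch of a periodic worker built on a condition variable with a quit flag, similar in shape to the wait_until loop quoted above. The class PeriodicTask and its members are hypothetical; this is neither the original code nor the Poco-based version the reviewer suggested.

```cpp
#include <chrono>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>

// Hypothetical class for illustration only.
// Runs `task` every `interval` on a background thread until destroyed.
class PeriodicTask
{
public:
    PeriodicTask(std::chrono::milliseconds interval, std::function<void()> task)
        : interval_(interval), task_(std::move(task)), thread_([this] { run(); })
    {
    }

    ~PeriodicTask()
    {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            quit_ = true;
        }
        cond_.notify_all(); // wake the worker so it exits without waiting out the interval
        thread_.join();
    }

private:
    void run()
    {
        std::unique_lock<std::mutex> lock(mutex_);
        // wait_until returns true if quit_ was set, false if the deadline passed.
        while (!cond_.wait_until(lock, std::chrono::steady_clock::now() + interval_,
                                 [this] { return quit_; }))
        {
            lock.unlock(); // do not hold the mutex while running the task
            task_();
            lock.lock();
        }
    }

    std::chrono::milliseconds interval_;
    std::function<void()> task_;
    std::mutex mutex_;
    std::condition_variable cond_;
    bool quit_ = false;
    std::thread thread_; // declared last so every other member exists before the thread starts
};
```

The predicate form of wait_until handles spurious wakeups, and setting the flag under the mutex before notifying lets shutdown interrupt a long sleep promptly.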


if (gateway != nullptr)
{
auto returnCode = gateway->Push();
Contributor

Does the variable name returnCode follow the naming style?

Contributor Author

fixed

@marsishandsome marsishandsome force-pushed the send-metrics-to-Prometheus branch from 8b645c2 to be03297 Compare January 28, 2020 01:04
@marsishandsome
Contributor Author

/run-integration-tests

@marsishandsome marsishandsome force-pushed the send-metrics-to-Prometheus branch from be03297 to a164c00 Compare January 28, 2020 08:01
metrics_interval = conf.getInt(status_metrics_interval, 15);
if (metrics_interval < 5 || metrics_interval > 120)
{
    metrics_interval = 15;
Contributor

Shouldn't capping usually clamp to the nearest bound?

Contributor

Also, capping should probably print a log message.

Contributor Author

fixed
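
A minimal sketch of what the two review comments in this thread ask for: clamp an out-of-range interval to the nearest bound and log when that happens. The function name, the constants and the use of a std::ostream for logging are all assumptions made for illustration; the actual fix in the PR may look different and would use the project's own logger.

```cpp
#include <algorithm>
#include <iostream>

// Bounds taken from the config comment in the PR description (min=5, max=120).
const int kMinMetricsInterval = 5;
const int kMaxMetricsInterval = 120;

// Hypothetical helper: clamps the configured interval to the nearest bound
// and writes a message to `log` whenever the value had to be adjusted.
int clampMetricsInterval(int configured, std::ostream & log = std::cerr)
{
    const int clamped = std::max(kMinMetricsInterval, std::min(configured, kMaxMetricsInterval));
    if (clamped != configured)
    {
        log << "metrics_interval " << configured << " is outside ["
            << kMinMetricsInterval << ", " << kMaxMetricsInterval
            << "], using " << clamped << '\n';
    }
    return clamped;
}
```

Under this sketch a configured value of 3 becomes 5 and a value of 500 becomes 120, whereas the hunk under review reset both to the default of 15.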

Contributor

@ilovesoup ilovesoup left a comment

LGTM

@marsishandsome marsishandsome force-pushed the send-metrics-to-Prometheus branch from 60e5adb to b2f3860 Compare January 28, 2020 08:34
@marsishandsome
Contributor Author

/build

@marsishandsome
Contributor Author

/run-integration-tests
