Log monitoring bulk failures by ycombinator · Pull Request #14356 · elastic/beats

ycombinator · 2019-10-31T17:39:20Z

Resolves #14303.

As reported in #14303, when the Elasticsearch monitoring reporter in libbeat sends a bulk API request to Elasticsearch and items in that request fail, the errors are currently swallowed. This is because the HTTP response code for the bulk API request is still 200 OK; the item-level errors are embedded in the response body.

This PR teaches the Elasticsearch monitoring reporter to parse the bulk API response and log any errors. For the parsing, the same code as the Elasticsearch output is reused.
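To illustrate the idea, here is a minimal sketch of pulling item-level failures out of a bulk response body. It is not the code from this PR (which reuses the Elasticsearch output's parser); the names `bulkResponse`, `bulkItemResult` and `bulkFailures` are made up for this example, and treating any item status of 300 or above as a failure is an assumption.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// bulkResponse models only the parts of a bulk API response body used here.
// (Hypothetical type for illustration; not taken from libbeat.)
type bulkResponse struct {
	Errors bool                        `json:"errors"`
	Items  []map[string]bulkItemResult `json:"items"`
}

// bulkItemResult is the per-item result nested under the action name
// ("index", "create", "update" or "delete").
type bulkItemResult struct {
	Status int             `json:"status"`
	Error  json.RawMessage `json:"error,omitempty"`
}

// bulkFailures returns one message per failed item in the response body.
// Assumption: an item status of 300 or above counts as a failure.
func bulkFailures(body []byte) ([]string, error) {
	var resp bulkResponse
	if err := json.Unmarshal(body, &resp); err != nil {
		return nil, fmt.Errorf("parsing bulk response: %w", err)
	}

	var failures []string
	for i, item := range resp.Items {
		// Each item is an object with a single key naming the action.
		for _, res := range item {
			if res.Status >= 300 {
				failures = append(failures, fmt.Sprintf(
					"monitoring bulk item insert failed (i=%d, status=%d): %s",
					i, res.Status, res.Error))
			}
		}
	}
	return failures, nil
}
```

A caller would pass the raw response body to `bulkFailures` and log each returned message at warning level, regardless of the HTTP status code of the request itself.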

Testing this PR

Start up Elasticsearch with security enabled. Make sure you know the password for the elastic superuser.

Create a role that grants necessary privileges for managing and writing to metricbeat-* indices.

curl -s -u elastic -H 'Content-Type: application/json' 'http://localhost:9200/_security/role/mb_writer' -d '{ "cluster": [ "monitor", "manage_ilm", "manage_index_templates" ], "indices": [ { "names": [ "metricbeat-*" ], "privileges": [ "all" ] } ] }'

Create a user with the above role.

curl -s -u elastic -H 'Content-Type: application/json' 'http://localhost:9200/_security/user/mb_writer' -d '{ "password": "mb_writer", "roles": [ "mb_writer" ] }'

Build Metricbeat with this PR.

cd $GOPATH/src/github.com/elastic/beats/metricbeat
mage build

Start Metricbeat with monitoring enabled, specifying the credentials of the above user for the Elasticsearch output.

./metricbeat -e -E output.elasticsearch.username=mb_writer -E output.elasticsearch.password=mb_writer -E monitoring.enabled=true

Verify that metricbeat-* indices are being created and populated in Elasticsearch but that no .monitoring-beats-* indices are being created, since the mb_writer role grants no privileges on them.
```
curl -s -u elastic 'http://localhost:9200/_cat/indices'
```
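If you would rather script this check than eyeball the output, something like the following could work. This is only a sketch: `hasIndexWithPrefix` is a made-up helper, and it assumes the default `_cat/indices` column layout, where the index name is the third whitespace-separated column.

```go
package main

import "strings"

// hasIndexWithPrefix reports whether any line of _cat/indices output names
// an index starting with prefix.
// Assumption: default column layout (health, status, index, ...), so the
// index name is the third field on each line.
func hasIndexWithPrefix(catOutput, prefix string) bool {
	for _, line := range strings.Split(catOutput, "\n") {
		fields := strings.Fields(line)
		if len(fields) >= 3 && strings.HasPrefix(fields[2], prefix) {
			return true
		}
	}
	return false
}
```

For this test scenario, the helper should return true for the prefix `metricbeat-` and false for `.monitoring-beats-`.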

Verify that there are warnings in the log like so:

2019-11-01T08:57:43.910-0700    WARN    elasticsearch/client.go:258     monitoring bulk item insert failed (i=0, status=403): {"type":"security_exception","reason":"action [indices:admin/create] is unauthorized for user [mb_writer]"}

houndci-bot · 2019-10-31T17:39:24Z

libbeat/outputs/elasticsearch/client.go

exported function ItemStatus should have comment or be unexported

libbeat/outputs/elasticsearch/json_read.go

houndci-bot · 2019-10-31T17:39:25Z

libbeat/monitoring/report/elasticsearch/client.go

should omit 2nd value from range; this loop is equivalent to for i := range ...
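For reference, the pattern this lint message is about looks roughly like the sketch below. `failedIndexes` is a hypothetical function, not the loop from this PR.

```go
package main

// failedIndexes returns the positions of all statuses that are 300 or above.
// (Hypothetical example; not the loop from the PR.)
func failedIndexes(statuses []int) []int {
	var failed []int
	// The flagged form would be: for i, _ := range statuses { ... }
	// Dropping the blank second value is equivalent and is what golint prefers.
	for i := range statuses {
		if statuses[i] >= 300 {
			failed = append(failed, i)
		}
	}
	return failed
}
```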

libbeat/outputs/elasticsearch/client.go

elasticmachine · 2019-11-01T16:05:26Z

Pinging @elastic/stack-monitoring (Stack monitoring)

ycombinator · 2019-11-12T03:04:39Z

jenkins, test this

ph

The code looks OK to me, but I think we should add some tests to cover that behavior, especially in case the remote system changes its behavior. I don't like how the 200 vs. the 403 response code is handled in this scenario.

Looking at the existing code, there are currently no unit tests for the ES reporter. Adding them to the existing Python system tests might be complicated, but it is still worth investigating.

Also, surely we can add a test for BulkReadToItems?
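One way such a unit test could simulate the "200 overall, 403 per item" case without a running Elasticsearch is sketched below, using the standard `net/http/httptest` package. Everything here (`cannedBulkBody`, `fakeBulkHandler`, `bulkStatuses`) is hypothetical and does not call `BulkReadToItems` or any other libbeat function, since their signatures are not shown in this thread.

```go
package main

import (
	"encoding/json"
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
)

// cannedBulkBody is a hand-written bulk response: the request as a whole
// succeeded, but its single item was rejected with a 403.
const cannedBulkBody = `{"took":1,"errors":true,"items":[{"index":{"status":403,"error":{"type":"security_exception","reason":"unauthorized"}}}]}`

// fakeBulkHandler stands in for Elasticsearch: it always answers with
// HTTP 200 and the canned body above.
func fakeBulkHandler(w http.ResponseWriter, _ *http.Request) {
	w.Header().Set("Content-Type", "application/json")
	w.WriteHeader(http.StatusOK)
	_, _ = io.WriteString(w, cannedBulkBody)
}

// bulkStatuses sends one bulk request to handler and returns the HTTP status
// code together with the per-item status codes found in the response body.
func bulkStatuses(handler http.HandlerFunc) (int, []int, error) {
	payload := "{\"index\":{}}\n{\"field\":1}\n"
	req := httptest.NewRequest(http.MethodPost, "/_bulk", strings.NewReader(payload))
	rec := httptest.NewRecorder()
	handler(rec, req)

	res := rec.Result()
	defer res.Body.Close()

	var parsed struct {
		Items []map[string]struct {
			Status int `json:"status"`
		} `json:"items"`
	}
	if err := json.NewDecoder(res.Body).Decode(&parsed); err != nil {
		return res.StatusCode, nil, err
	}

	var statuses []int
	for _, item := range parsed.Items {
		// Each item has a single key naming the action.
		for _, result := range item {
			statuses = append(statuses, result.Status)
		}
	}
	return res.StatusCode, statuses, nil
}
```

A test built on this would assert that the HTTP status is 200 while the item status is 403, which is exactly the mismatch that caused the errors to be swallowed.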

ph · 2019-11-12T18:09:28Z

libbeat/outputs/elasticsearch/bulkapi.go

+1 nice change

libbeat/outputs/elasticsearch/client.go

ph

LGTM. We need to find a better way to handle system tests; I think it's a problem and we need a proposal for it. Maybe a way to use a specific docker-compose file for a set of tests.

ycombinator · 2019-11-14T21:04:49Z

Travis CI is green. Jenkins CI failures are unrelated. Merging.

houndci-bot reviewed Oct 31, 2019

ycombinator mentioned this pull request Oct 31, 2019

Handle bulk request results in monitoring #14354

Closed

ycombinator marked this pull request as ready for review November 1, 2019 16:04

ycombinator requested review from cwurm and ph November 1, 2019 16:05

ycombinator added bug libbeat Feature:Stack Monitoring v7.5.1 v7.6.0 v8.0.0 labels Nov 1, 2019

ph suggested changes Nov 12, 2019

ph approved these changes Nov 13, 2019

ycombinator added 9 commits November 13, 2019 07:53

Log monitoring bulk failures

1006796

Renaming function

93e7e9f

Simplifying type

102eb9f

Removing extraneous second value

8de27d5

Adding godoc comments

febfb9a

Adding CHANGELOG entry

6e59a0d

Clarifying log messages

a886064

WIP: adding unit test stubs

732c075

Fleshing out unit tests

36d60bb

ycombinator merged commit a9aff6f into elastic:master Nov 14, 2019

This was referenced Nov 14, 2019

[7.x] Log monitoring bulk failures (#14356) #14526

Merged

[7.5] Log monitoring bulk failures (#14356) #14527

Merged

ycombinator deleted the lb-mon-log-bulk-failures branch December 25, 2019 11:08
