Add Elasticsearch output status reporter by fearful-symmetry · Pull Request #239 · elastic/elastic-agent-shipper

fearful-symmetry · 2023-02-09T22:49:43Z

What does this PR do?

closes #174

This adds a reporting utility to the elasticsearch output, which accepts a callback that updates the unit state if the output fails for a given number of seconds.

I've tested this, but between #240 and elastic/beats#34319 it's a tad hard to test.

Checklist

My code follows the style guidelines of this project
I have commented my code, particularly in hard-to-understand areas
I have made corresponding changes to the documentation
I have made corresponding changes to the default configuration files
I have added tests that prove my fix is effective or that my feature works
I have added an entry in CHANGELOG.md or CHANGELOG-developer.md.

mergify · 2023-02-09T22:50:17Z

This pull request does not have a backport label.
If this is a bug or security fix, could you label this PR @fearful-symmetry? 🙏.
For such, you'll need to label your PR with:

The upcoming major version of the Elastic Stack
The upcoming minor version of the Elastic Stack (if you're not pushing a breaking change)

To fixup this pull request, you need to add the backport labels for the needed
branches, such as:

backport-v8./d.0 is the label to automatically backport to the 8./d branch. /d is the digit

elasticmachine · 2023-02-09T23:07:45Z

💚 Build Succeeded

Build stats

Start Time: 2023-02-14T19:22:36.617+0000
Duration: 16 min 39 sec

❕ Flaky test report

No test was executed to be analysed.

🤖 GitHub comments

To re-run your PR in the CI, just comment with:

/test : Re-trigger the build.

output/elasticsearch/watcher.go

faec · 2023-02-13T18:37:45Z

output/elasticsearch/output.go

+					es.healthWatcher.Fail(err.Error())
 					es.logger.Errorf("couldn't add to bulk index request: %v", err)
 					// This event couldn't be attempted, so mark it as finished.
-					batch.Done(1)


Why remove the Done call here? IIRC it was needed for bookkeeping or else the batch won't be fully acknowledged.

ah, probably removed that by accident...

faec · 2023-02-13T18:41:11Z

output/elasticsearch/config_test.go

+	time.Sleep(time.Millisecond * 100)
+	watcher.Fail("simulated failure")
+	// should fail
+	time.Sleep(time.Millisecond * 600)


The sleeps in these tests are some heavy costs to impose on every unit test run. Can we instead make the time callback a parameter of the watcher? Typical use could pass in time.Now (or maybe that could be the default) while unit tests could pass in a mocked timer so it can be tested deterministically without any delay.

So, I tinkered with this for a bit, but after I started running into race issues between the tests and the main watch loop, I decided it would be easier to just shift things around to make the sleeps faster, which is hopefully good enough.

Mmm I see what you mean but it makes me sad. Probably the Right thing to do here is still to pass in a mockable helper to generate timer channels or something so we can make it deterministic, but that would require a lot of revisions for something that already mostly works. Maybe we can do a more robust generic helper when we have more components with health to report; I'll approve this one for now :-)

Yah, agreed, I'm not a fan of it either, and I experimented with adding a few helpers/callbacks, but it just felt like the added code complexity just wasn't worth making the tests fully deterministic, particularly after shortening the sleeps in the test.

fearful-symmetry · 2023-02-14T18:57:39Z

/test

faec

lgtm pending some minor tweaks

faec · 2023-02-14T19:03:20Z

output/elasticsearch/watcher.go

+		select {
+		case <-ctx.Done():
+			return
+		default:


How about replacing default with case <-time.After(hw.waitInterval) and then removing the sleep at the end of the loop? Similar effect, but then the wait still respects context cancellation.

Oh, good idea!


fearful-symmetry added 4 commits February 9, 2023 13:38

make changes to output

3b5158f

Merge remote-tracking branch 'upstream/main' into es-output-report

1bdf288

fix merge

3da2164

first pass at a watcher for ES status

7214225

fearful-symmetry added enhancement New feature or request Team:Elastic-Agent Label for the Agent team labels Feb 9, 2023

fearful-symmetry requested a review from a team as a code owner February 9, 2023 22:49

fearful-symmetry self-assigned this Feb 9, 2023

fearful-symmetry requested review from leehinman and rdner and removed request for a team February 9, 2023 22:49

fix up

05db482

fearful-symmetry requested a review from faec February 10, 2023 18:33

faec reviewed Feb 13, 2023

View reviewed changes

tinker with tests, typos

b1b0532

faec approved these changes Feb 14, 2023

View reviewed changes

change sleep, typo

d4a31b4

fearful-symmetry merged commit 58a1029 into elastic:main Feb 14, 2023

cmacknz mentioned this pull request Apr 19, 2023

The Elasticsearch output should not report itself as degraded based only on the time between events #301

Closed
