Update Metricbeat, Filebeat, libbeat with elastic-agent V2 support #32673
fearful-symmetry merged 45 commits into elastic:feature-arch-v2
Conversation
Pinging @elastic/elastic-agent-data-plane (Team:Elastic-Agent-Data-Plane)
👍
I have tested with standalone agent, and it seems to work with metricbeat and filebeat.
I have built beats with this PR and running with the latest Elastic Agent V2 architecture I am seeing the
@blakerouse are there any conditions where we might expect the
Actually, if it failed at that particular line, it's more likely that the
Added a few fixes, including one I suspect might be the issue.
@fearful-symmetry I am just using the default configuration that is shipped with Elastic Agent. Here is the configuration I am using:
// I'm assuming that a state STOPPED just tells us to shut down the entire beat,
// as such we don't really care about updating via a particular unit
- if state == client.UnitStateStopped {
+ if state == client.UnitStateStopped || state == client.UnitStateStopping {
From Expected() you will receive either HEALTHY or STOPPED; those are the only two expected states that the Elastic Agent will tell you for a Unit. Of course, the actual state of the unit can be of any type.
When you get client.UnitStateStopped, it is telling you that only that specific unit should be stopped. That does not mean the whole beat should be stopped. An example is when Elastic Agent is running 2 filestream inputs and only 1 is removed. The removed unit would be changed to client.UnitStateStopped, while the other unit would remain as client.UnitStateHealthy. Elastic Agent will still report the removed unit as existing until you report client.UnitStateStopped through unit.UpdateState. Once that is done, an update from the client to fully remove the unit will occur.
From Expected() you will receive either HEALTHY or STOPPED; those are the only two expected states that the Elastic Agent will tell you for a Unit.
Ah, thanks, I was a bit confused by the logic there.
So UnitStateStopping is only used for clients to update state to the server?
UnitStateStopping is an information-only state. It tells the Elastic Agent that work is being done to stop the unit. If the unit can be stopped immediately, you can just report UnitStateStopped instantly (UnitStateStopping is not required to be reported).
Once Expected() returns UnitStateStopped, the unit is required to report UnitStateStopped once it has completely stopped.
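To make the thread above concrete, here is a minimal, self-contained sketch of the per-unit stop flow it describes. Everything in it (`UnitState`, `Unit`, `Manager`, `HandleExpected`, the `slowStop` flag) is a hypothetical stand-in written for illustration, not the elastic-agent-client API. It only models the rules stated above: the expected state is either HEALTHY or STOPPED, a stop applies to one unit rather than the whole beat, STOPPING is optional, and STOPPED must be reported once the unit has finished stopping.

```go
package main

import (
	"fmt"
	"sort"
)

// UnitState is a local, hypothetical mirror of the state names in the thread.
type UnitState int

const (
	StateHealthy UnitState = iota
	StateStopping
	StateStopped
)

func (s UnitState) String() string {
	switch s {
	case StateHealthy:
		return "HEALTHY"
	case StateStopping:
		return "STOPPING"
	case StateStopped:
		return "STOPPED"
	}
	return "UNKNOWN"
}

// Unit is a hypothetical stand-in for a single input unit.
type Unit struct {
	ID       string
	Reported []UnitState // every state reported back to the agent, in order
}

// UpdateState records a state report for this unit.
func (u *Unit) UpdateState(s UnitState) { u.Reported = append(u.Reported, s) }

// Last returns the most recently reported state.
func (u *Unit) Last() UnitState { return u.Reported[len(u.Reported)-1] }

// Manager tracks units by ID; each unit starts out reported as HEALTHY.
type Manager struct{ units map[string]*Unit }

func NewManager(ids ...string) *Manager {
	m := &Manager{units: make(map[string]*Unit)}
	for _, id := range ids {
		m.units[id] = &Unit{ID: id, Reported: []UnitState{StateHealthy}}
	}
	return m
}

// HandleExpected applies an expected-state change to one unit only.
// slowStop marks a unit that needs teardown work and so reports STOPPING first.
func (m *Manager) HandleExpected(id string, expected UnitState, slowStop bool) error {
	u, ok := m.units[id]
	if !ok {
		return fmt.Errorf("unknown unit %q", id)
	}
	switch expected {
	case StateHealthy:
		u.UpdateState(StateHealthy)
	case StateStopped:
		if slowStop {
			u.UpdateState(StateStopping) // informational only, optional
		}
		// Teardown for this unit would happen here.
		u.UpdateState(StateStopped) // required once the unit has fully stopped
	default:
		// Only HEALTHY and STOPPED are valid expected states.
		return fmt.Errorf("invalid expected state %s", expected)
	}
	return nil
}

// Running lists, in sorted order, the units that have not reported STOPPED.
func (m *Manager) Running() []string {
	var ids []string
	for id, u := range m.units {
		if u.Last() != StateStopped {
			ids = append(ids, id)
		}
	}
	sort.Strings(ids)
	return ids
}

func main() {
	m := NewManager("filestream-1", "filestream-2")
	if err := m.HandleExpected("filestream-1", StateStopped, true); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(m.units["filestream-1"].Reported)
	fmt.Println(m.Running())
}
```

In this sketch, stopping `filestream-1` leaves `filestream-2` running, which is the two-filestream case described above.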
Also, the two simultaneous truths of "not every config will have data streams" and "the protobuf structs are full of pointers" didn't quite merge in my brain, and I've been testing for the maximal case of "all the config values". Gonna add some more pointer-safe wrappers and tests for this.
Yah, I'm definitely paranoid about all the
I rebuilt with c942273 and I am still seeing a similar error:
Yup, working on it now @blakerouse , just kinda pushing to the PR as I go
Okay, now we should be good.
Alright, rather than trying to set the
cmacknz left a comment
There is one place left where we aren't using the Get field accessors, which should likely be fixed; other than that, I think we can merge this.
The tests are passing and various manual tests have confirmed Beats will run under agent using this branch. We can continue with bug fixes in follow-up PRs.
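For readers unfamiliar with why the Get accessors matter here: Go code generated by protoc-gen-go gives each message nil-safe getters, so a chain of Get calls does not panic when an optional sub-message (such as a data stream) is absent, whereas direct field access on a nil pointer does. The sketch below imitates that pattern with hand-written structs. The type names, field names, `datasetFor` helper, and the `"generic"` fallback are all assumptions for illustration, not the actual V2 config types.

```go
package main

import "fmt"

// Hypothetical hand-written structs shaped like protobuf-generated messages.
type DataStream struct {
	Dataset string
}

type Stream struct {
	Id         string
	DataStream *DataStream // optional, may be nil
}

type UnitConfig struct {
	Id         string
	DataStream *DataStream // optional, may be nil
	Streams    []*Stream
}

// The getters below follow the generated-code convention: each one is safe
// to call on a nil receiver and returns the zero value in that case.
func (d *DataStream) GetDataset() string {
	if d != nil {
		return d.Dataset
	}
	return ""
}

func (s *Stream) GetDataStream() *DataStream {
	if s != nil {
		return s.DataStream
	}
	return nil
}

func (c *UnitConfig) GetDataStream() *DataStream {
	if c != nil {
		return c.DataStream
	}
	return nil
}

func (c *UnitConfig) GetStreams() []*Stream {
	if c != nil {
		return c.Streams
	}
	return nil
}

// datasetFor prefers the stream-level dataset, then the unit-level one,
// then a hypothetical fallback. No explicit nil checks are needed because
// every getter in the chain tolerates a nil receiver.
func datasetFor(cfg *UnitConfig, s *Stream) string {
	if ds := s.GetDataStream().GetDataset(); ds != "" {
		return ds
	}
	if ds := cfg.GetDataStream().GetDataset(); ds != "" {
		return ds
	}
	return "generic" // assumed default, for illustration only
}

func main() {
	cfg := &UnitConfig{
		Id: "unit-1",
		Streams: []*Stream{
			{Id: "s1", DataStream: &DataStream{Dataset: "system.cpu"}},
			{Id: "s2"}, // no data stream configured
		},
	}
	for _, s := range cfg.GetStreams() {
		fmt.Println(s.Id, datasetFor(cfg, s))
	}
}
```

Writing `s.DataStream.Dataset` directly for stream `s2` would panic with a nil pointer dereference, which is the failure mode the getter methods avoid.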
* Update Metricbeat, Filebeat, libbeat with elastic-agent V2 support (#32673)
* basic framework
* continued tinkering
* move away from ast code, use a struct
* get metricbeat working, starting on filebeat
* add notice update
* add basic config register
* move over processors to individual beats
* remove comments
* start to integrate V2 client changes
* finishing touches
* lint
* cleanup merge
* remove V1 controller
* stil tinkering with linter
* still fixing linter
* plz linter
* fmt x-pack files
* notice update
* fix output test
* refactor stop functions, refactor tests, some misc cleanup
* fix client version string
* add devguide
* linter
* expand filebeat test
* cleanup test
* fix docs, add tests, debuggin
* add signal handler
* fix mutex issue in register
* Fix osquerybeat configuration for V2
* clean up component registration
* spelling
* remove workaround for filebeat types
* try to fix filebeat tests
* add nil checks, fix test, fix unit stop
* continue tinkering with nil type checks
* add test for missing config datastreams, clean up nil handling
* change nil protections, use getter methods
* fix config access in output code
Co-authored-by: Aleksandr Maus <aleksandr.maus@elastic.co>
* V2 packetbeat support (#33041)
* first attempt at auditbeat support
* add license header
* initial packetbeat support
* fix bad branch
* cleanup
* typo in comment
* clean up, move around files
* add new processors to streams
* First pass at auditbeat support (#33026)
* first attempt at auditbeat support
* add license header
* cleanup
* move files around
* Add heartbeat support for V2 (#33157)
* add v2 config
* fix name
* fix doc
* fix go.mod
* fix unchecked stream_id
* fix unchecked stream_id (#33335)
* Update elastic-agent-libs for output panic fix (#33336)
* Fix errors for non-synth capable instances (#33310) Fixes #32694 by making sure we use the lightweight wrapper code always when monitors cannot be initialized. This also fixes an unrelated bug, where errors attached to non-summary events would not be indexed.
* [Automation] Update elastic stack version to 8.6.0-5a8d757d for testing (#33323) Co-authored-by: apmmachine <infra-root-apmmachine@elastic.co>
* add pid awareness to file locking (#33169)
* add pid awareness to file locking
* cleanup, logic for handling restarts with the same PID
* add zombie-state awareness
* fix file naming
* add retry for unlock
* was confused by unlock code, fix, cleanup
* update notice
* fix race with file creation, update deps
* clean up tests, spelling
* hack for cgo
* add lic headers
* notice
* try to fix windows issues
* fix typos
* small fixes
* use exclusive locks
* remove feature to start with a specially named pidfile
* clean up some error handling, fix test cleanup
* forgot changelog
* Fix sample config in log rotation docs (#33306)
* Add banner to deprecate functionbeat (#33297)
* fix unchecked stream_id
* packetbeat/protos/dns: clean up package (#33286)
* avoid magic numbers
* fix hashableDNSTuple size and offsets
* avoid use of String and Error methods in formatted print calls
* remove redundant conversions
* quieten linter
* use plugin-owned logp.Logger
* update elastic-agent-libs
* Revert "fix unchecked stream_id" This reverts commit 26ef6da.
* [Automation] Update elastic stack version to 8.6.0-40086bc7 for testing (#33339) Co-authored-by: apmmachine <infra-root-apmmachine@elastic.co>
Co-authored-by: Andrew Cholakian <andrewvc@elastic.co>
Co-authored-by: apmmachine <58790750+apmmachine@users.noreply.github.com>
Co-authored-by: apmmachine <infra-root-apmmachine@elastic.co>
Co-authored-by: Jaime Soriano Pastor <jaime.soriano@elastic.co>
Co-authored-by: DeDe Morton <dede.morton@elastic.co>
Co-authored-by: Dan Kortschak <90160302+efd6@users.noreply.github.com>
* update elastic-agent-client (#33552)
Co-authored-by: Aleksandr Maus <aleksandr.maus@elastic.co>
Co-authored-by: Andrew Cholakian <andrewvc@elastic.co>
Co-authored-by: apmmachine <58790750+apmmachine@users.noreply.github.com>
Co-authored-by: apmmachine <infra-root-apmmachine@elastic.co>
Co-authored-by: Jaime Soriano Pastor <jaime.soriano@elastic.co>
Co-authored-by: DeDe Morton <dede.morton@elastic.co>
Co-authored-by: Dan Kortschak <90160302+efd6@users.noreply.github.com>
What does this PR do?
This PR has a few features:
What are the components of this PR?
libbeat/cfgfile/cfgfile.go
This prevents the beats from reading a config file when they start up in remote management mode.
beat-specific changes
metricbeat/beater/metricbeat.go, x-pack/metricbeat/cmd/agent.go, etc.
These changes register the reloader interfaces and define beat-specific config transformations that act as a compatibility shim between the V2 config structure and the pre-existing beats config.
x-pack/libbeat/management/*
The actual V2 client code that reloads beats and accepts commands from a V2 control server.
x-pack/libbeat/management/tests/*
An integration test suite that initializes an entire beat, creates a mock elastic-agent server, lets the beat write any metrics to a logfile, then verifies that the metrics, fields, and configs are as expected.
What's missing in this PR
Due to a combination of time and knowledge constraints, there are a number of things this PR doesn't do with regard to V2 support:
InjectHeadersRule that will modify the header config of an elasticsearch output. These headers were obtained via an AgentInfo struct that currently exists in V2, but it doesn't support a Headers field.
Checklist
CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.
Author's Checklist
How to test this PR locally
Check out the tests directory in x-pack/libbeat/management. Each sub-directory contains a number of folders with integration tests for each beat. Note that until elastic/elastic-agent#850 is merged, this won't work against a real agent, and the mock server in the test suite will be required to actually test the V2 client code.