Refactor monitoring to make the shipper more beats-like, and compatible with Agent #289
Whoop, go linter doesn't like my go.mod replace line for elastic/elastic-agent-system-metrics#80
Okay, just noticed a major problem with this while testing. The shipper is built on the assumption that any of its sub-components can be arbitrarily stopped and started, but the monitoring code it uses is mostly global, and will usually panic if you try to re-register or update something. Will start poking at a fix.
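For illustration, the restart problem above is the same pattern as the standard library's expvar.Publish, which panics when a name is registered twice. A minimal sketch of an instance-scoped alternative follows. The Registry type and its methods are hypothetical names for this example, not the shipper's or beats' monitoring API, and this is not necessarily the fix the PR ends up with.

```go
package main

import (
	"fmt"
	"sync"
)

// Registry is a hypothetical, instance-scoped metrics registry.
// Each component owns one instead of writing to process-wide state.
type Registry struct {
	mu   sync.Mutex
	vars map[string]func() int64
}

// NewRegistry returns an empty registry.
func NewRegistry() *Registry {
	return &Registry{vars: make(map[string]func() int64)}
}

// Register adds a metric, replacing any previous one with the same name.
// Replacing instead of panicking is what allows a component to restart.
func (r *Registry) Register(name string, fn func() int64) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.vars[name] = fn
}

// Unregister removes a metric; removing an unknown name is a no-op.
func (r *Registry) Unregister(name string) {
	r.mu.Lock()
	defer r.mu.Unlock()
	delete(r.vars, name)
}

// Snapshot evaluates every registered metric and returns the values.
func (r *Registry) Snapshot() map[string]int64 {
	r.mu.Lock()
	defer r.mu.Unlock()
	out := make(map[string]int64, len(r.vars))
	for name, fn := range r.vars {
		out[name] = fn()
	}
	return out
}

func main() {
	reg := NewRegistry()

	// First start of a sub-component.
	reg.Register("queue.events", func() int64 { return 1 })

	// Stop, then start again: the second Register must not panic.
	reg.Unregister("queue.events")
	reg.Register("queue.events", func() int64 { return 2 })

	fmt.Println(reg.Snapshot()["queue.events"])
}
```

Snapshot calls the metric functions while holding the lock, which keeps the sketch short but means a metric function must not call back into the same registry.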
faec left a comment
Mostly looks good to me (pending the segfaults in the tests which look pretty fixable), but I will hold off approving while you investigate the global state issues. (Considering how big this already is, it might make sense to check in a somewhat-brittle prototype first and do in-place refactors to make it more modular, as long as the failure mode is not too disastrous?)
@leehinman you've run into this before on windows, right?
yeah, that looks a lot like what was happening with the grpc socket. On windows you need to use a named pipe. "github.com/elastic/elastic-agent-libs/api/npipe"
Alright, let's see what this does...
/test
Kinda baffled by the CI error: tried building locally on an ARM macbook, works fine.
Alright, can reproduce it with
My first thought is that it was a cgo issue, but if that was the case, it would have failed on
@fearful-symmetry Shot in the dark, but could the build tags here have something to do with the error? |
Not necessarily, since |
@ycombinator argh, I didn't even notice there was a
Put in elastic/elastic-agent-system-metrics#82 since that code doesn't require cgo. However, that raises another question, should |
/test |
What does this PR do?
Part of fix for #267
This accomplishes a few things:
- /shipper and /stats and /state endpoint to http monitoring
- http.expvar that re-exports expvar metrics
- .golangci.yml lint file to match that in beats, since we're running into issues with cgo again.
Note that this currently won't work with agent, as changes on the agent side are needed to make agent scrape metrics from the shipper.
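As a rough sketch of what the endpoint list above amounts to, the following wires three JSON endpoints and an expvar re-export onto one mux using only the standard library. The function names, the snapshot callbacks, the sample metric values, and the /debug/vars path are assumptions for this example; the PR's actual handlers and wiring may differ.

```go
package main

import (
	"encoding/json"
	"expvar"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// newMonitoringMux is a hypothetical helper: it serves each snapshot
// function as JSON on its path and re-exports expvar metrics.
func newMonitoringMux(snapshots map[string]func() map[string]interface{}) *http.ServeMux {
	mux := http.NewServeMux()
	for path, snap := range snapshots {
		snap := snap // capture per iteration for older Go versions
		mux.HandleFunc(path, func(w http.ResponseWriter, _ *http.Request) {
			w.Header().Set("Content-Type", "application/json")
			_ = json.NewEncoder(w).Encode(snap())
		})
	}
	// Assumed path: /debug/vars is the standard library's default for expvar.
	mux.Handle("/debug/vars", expvar.Handler())
	return mux
}

// get runs one in-memory request against h and returns status and body.
func get(h http.Handler, path string) (int, string) {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, path, nil))
	return rec.Code, rec.Body.String()
}

func main() {
	mux := newMonitoringMux(map[string]func() map[string]interface{}{
		"/shipper": func() map[string]interface{} { return map[string]interface{}{"queue": "memory"} },
		"/stats":   func() map[string]interface{} { return map[string]interface{}{"events": 3} },
		"/state":   func() map[string]interface{} { return map[string]interface{}{"running": true} },
	})
	for _, p := range []string{"/shipper", "/stats", "/state", "/debug/vars"} {
		code, body := get(mux, p)
		fmt.Println(p, code, len(body) > 0)
	}
}
```

Building the mux per instance rather than registering on http.DefaultServeMux keeps the endpoints restartable, which matters given the global-state concern raised in the conversation.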
Why is it important?
Needed to integrate the shipper with other agent monitoring
Checklist
- CHANGELOG.md or CHANGELOG-developer.md.
Author's Checklist
How to test this PR locally
- http.enabled: true
- /, /shipper, /stats and /state endpoints, make sure they're reporting valid data.
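One way to script the endpoint check above is sketched below. It treats "valid data" as "HTTP 200 with a syntactically valid JSON body", which is an assumption; checkEndpoints and fakeShipper are hypothetical names, and the fake server only stands in for a running shipper so the example is self-contained. Against a real shipper, pass whatever host and port the http monitoring is configured to listen on.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// checkEndpoints fetches base+path for each path and returns the paths
// that did not answer 200 with a valid JSON body.
func checkEndpoints(client *http.Client, base string, paths []string) []string {
	var bad []string
	for _, p := range paths {
		resp, err := client.Get(base + p)
		if err != nil {
			bad = append(bad, p)
			continue
		}
		body, readErr := io.ReadAll(resp.Body)
		resp.Body.Close()
		if readErr != nil || resp.StatusCode != http.StatusOK || !json.Valid(body) {
			bad = append(bad, p)
		}
	}
	return bad
}

// fakeShipper is a stand-in server for the demo: every path returns JSON
// except /state, which is deliberately broken to show a failure.
func fakeShipper() *httptest.Server {
	mux := http.NewServeMux()
	mux.HandleFunc("/", func(w http.ResponseWriter, _ *http.Request) {
		fmt.Fprint(w, `{"name":"shipper"}`)
	})
	mux.HandleFunc("/state", func(w http.ResponseWriter, _ *http.Request) {
		fmt.Fprint(w, `not json`)
	})
	return httptest.NewServer(mux)
}

func main() {
	srv := fakeShipper()
	defer srv.Close()

	paths := []string{"/", "/shipper", "/stats", "/state"}
	bad := checkEndpoints(srv.Client(), srv.URL, paths)
	fmt.Println("endpoints with invalid responses:", bad)
}
```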