[filebeat] Elasticsearch state storage for httpjson and cel inputs#41446
orestisfl merged 77 commits into elastic:main
This pull request does not have a backport label.
To fixup this pull request, you need to add the backport labels for the needed
@belimawr @cmacknz (or whoever wants to be involved and has the time)
@leehinman I'd appreciate a review here to make sure this can co-exist with Beats receivers in agent since that would be the long term way we plan to run agentless inputs.
cmacknz left a comment
LGTM, the new test looks good.
The biggest outstanding gap I see is that there is no way to tell how well this is working or whether it is overloading ES. We can address this as a follow up.
  type StateStore interface {
- 	Access() (*statestore.Store, error)
+ 	Access(typ string) (*statestore.Store, error)
Suggest something like
// Access returns the storage registry depending on the type.
// The value of typ is expected to have been obtained from
// [cursor.InputManager.Type] and represents the input type.
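To make the contract in the suggested doc comment concrete, here is a minimal sketch of how an implementation might route on `typ`. It is not the code from this PR: `Store` is a stand-in for `*statestore.Store`, and `routingStore` with its `esTypes`, `file`, and `es` fields is hypothetical, introduced only for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// Store is a stand-in for *statestore.Store (assumption: only the
// backend name matters for this illustration).
type Store struct{ Backend string }

// StateStore mirrors the interface under review, with the suggested comment.
type StateStore interface {
	// Access returns the storage registry depending on the type.
	// The value of typ is expected to have been obtained from
	// [cursor.InputManager.Type] and represents the input type.
	Access(typ string) (*Store, error)
}

// routingStore is a hypothetical implementation: input types listed in
// esTypes get the Elasticsearch-backed store, all others the file store.
type routingStore struct {
	esTypes map[string]bool
	file    *Store
	es      *Store
}

// Access picks the backend for the given input type. An empty typ is
// never in esTypes, so it falls through to the file store.
func (r *routingStore) Access(typ string) (*Store, error) {
	if r.esTypes[typ] {
		if r.es == nil {
			return nil, errors.New("elasticsearch store not configured")
		}
		return r.es, nil
	}
	return r.file, nil
}

func main() {
	var s StateStore = &routingStore{
		esTypes: map[string]bool{"httpjson": true},
		file:    &Store{Backend: "file"},
		es:      &Store{Backend: "elasticsearch"},
	}
	for _, typ := range []string{"httpjson", "aws-s3", ""} {
		st, err := s.Access(typ)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%q -> %s\n", typ, st.Backend)
	}
}
```

In this sketch, callers that do not care about the input type can pass an empty string and still get the file-backed store.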
zmoog left a comment
My LGTM is limited to the simple changes in x-pack/filebeat/input/awss3/.
Nit: calling a method with an empty string (stateStore.Access("")) feels slightly weird.
ishleenk17 left a comment
Approving from the obs-infraobs side, considering only the change made to the salesforce input.
tommyers-elastic left a comment
reviewing (minimal) changes in x-pack/filebeat on behalf of obs-infraobs-integrations.
looks fine to me.
run docs-build
/test
…41446)

This enables Elasticsearch as State Store Backend for Security Integrations for the Agentless solution.

The scope of this change was narrowed down to supporting only `httpjson` inputs in order to support Okta integration for the initial release. All the other integrations inputs still use the file storage as before. This is a short term solution for the state storage for k8s.

The feature currently can only be enabled with the `AGENTLESS_ELASTICSEARCH_STATE_STORE_INPUT_TYPES` env var.

The existing code relied on the inputs state storage to be fully configurable before the main beat managers runs. The change delays the configuration of `httpjson` input to the time when the actual configuration is received from the Agent.

Example of the state storage index content for Okta integration:

```
{
  "took": 6,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 1,
      "relation": "eq"
    },
    "max_score": 1,
    "hits": [
      {
        "_index": "agentless-state-httpjson-okta.system-028ecf4b-babe-44c6-939e-9e3096af6959",
        "_id": "httpjson::httpjson-okta.system-028ecf4b-babe-44c6-939e-9e3096af6959::https://dev-36006609.okta.com/api/v1/logs",
        "_seq_no": 39,
        "_primary_term": 1,
        "_score": 1,
        "_source": {
          "v": {
            "ttl": 1800000000000,
            "updated": "2024-10-24T20:21:22.032Z",
            "cursor": {
              "published": "2024-10-24T20:19:53.542Z"
            }
          }
        }
      }
    ]
  }
}
```

The naming convention for all state store is `agentless-state-<input id>`, since the expectation for agentless we would have only one agent per policy and the agents are ephemeral.

Closes https://github.com/elastic/security-team/issues/11101

Co-authored-by: Orestis Floros <orestis.floros@elastic.co>
(cherry picked from commit 8180f23)

# Conflicts:
#	filebeat/beater/filebeat.go
…41446)
Closes elastic/security-team#11101
Co-authored-by: Aleksandr Maus <aleksandr.maus@elastic.co>
Co-authored-by: Orestis Floros <orestis.floros@elastic.co>
(cherry picked from commit 8180f23)
…pjson and cel inputs (#42451)
Closes https://github.com/elastic/security-team/issues/11101
(cherry picked from commit 8180f23)
Co-authored-by: Aleksandr Maus <aleksandr.maus@elastic.co>
Co-authored-by: Orestis Floros <orestis.floros@elastic.co>
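For readers who want to inspect a state document like the one in the commit message, the following is a rough sketch of building the index name and decoding the `_source` payload. The helper names `indexName` and `decodeState` and the `stateDoc` type are hypothetical, and reading `ttl` as nanoseconds (Go's `time.Duration` unit) is an assumption based on the sample value, not something the commit message states.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// stateDoc models the "_source" object from the sample search hit.
// The cursor is kept raw because its shape is defined by each input.
type stateDoc struct {
	V struct {
		TTL     time.Duration   `json:"ttl"` // assumption: nanoseconds
		Updated time.Time       `json:"updated"`
		Cursor  json.RawMessage `json:"cursor"`
	} `json:"v"`
}

// indexName applies the documented convention agentless-state-<input id>.
func indexName(inputID string) string {
	return "agentless-state-" + inputID
}

// decodeState parses the JSON body of a state document.
func decodeState(source []byte) (stateDoc, error) {
	var d stateDoc
	err := json.Unmarshal(source, &d)
	return d, err
}

func main() {
	// Input ID and _source copied from the example in the commit message.
	const inputID = "httpjson-okta.system-028ecf4b-babe-44c6-939e-9e3096af6959"
	source := []byte(`{"v":{"ttl":1800000000000,` +
		`"updated":"2024-10-24T20:21:22.032Z",` +
		`"cursor":{"published":"2024-10-24T20:19:53.542Z"}}}`)

	d, err := decodeState(source)
	if err != nil {
		fmt.Println("decode error:", err)
		return
	}
	fmt.Println("index:  ", indexName(inputID))
	fmt.Println("ttl:    ", d.V.TTL)
	fmt.Println("updated:", d.V.Updated.Format(time.RFC3339Nano))
	fmt.Println("cursor: ", string(d.V.Cursor))
}
```

Under the nanosecond assumption, the sample `ttl` of 1800000000000 corresponds to 30 minutes.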
Proposed commit message
[filebeat] Elasticsearch state storage for httpjson input
This is a POC for Elasticsearch as State Store Backend for Security Integrations for the Agentless solution.

The scope of this change was narrowed down to supporting only `httpjson` inputs in order to support the Okta integration for the initial release. All other integration inputs still use the file storage as before.

This is a short-term solution for state storage in the k8s environment.

This is the first cut and the details can change depending on the feedback.

The feature can currently be enabled with `AGENTLESS_ELASTICSEARCH_STATE_STORE_ENABLED`; how this will be configured in k8s is still to be decided.

This change currently contains a hacky approach to the `AGENTLESS_ELASTICSEARCH_APIKEY` overwrite. It allows the user to provide an ApiKey with the elevated permissions required to create/write/read the state index per input. THIS IS FOR DEVELOPMENT/TESTING ONLY. REMOVE BEFORE THE MERGE.

The existing code relied on the input state storage being fully configurable before the main beat manager runs. The change delays the configuration of the `httpjson` input until the actual configuration is received from the Agent.

There is an assumption that the index template for the state storage indices is already in place before the storage is used.
Example of the state storage index content for Okta integration:
The naming convention for all state stores is `agentless-state-<input id>`, since the expectation for agentless is that there is only one agent per policy and that agents are ephemeral.

Currently, running the agent with Elasticsearch state storage requires a couple of environment variables:
where the ApiKey in the
DEPENDENCIES / TODOS:
Checklist
`CHANGELOG.next.asciidoc` or `CHANGELOG-developer.next.asciidoc`.
Disruptive User Impact
The change should have no user impact: with the feature disabled, Filebeat works as before, using file system storage for the state.
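The opt-in behaviour described above can be sketched as follows. This is only an illustration: it uses the `AGENTLESS_ELASTICSEARCH_STATE_STORE_INPUT_TYPES` name from the merged commit message, the comma-separated value format is an assumption (the PR text does not specify it), and `parseInputTypes` and `backendFor` are hypothetical helpers.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// Environment variable named in the merged commit message.
const envInputTypes = "AGENTLESS_ELASTICSEARCH_STATE_STORE_INPUT_TYPES"

// parseInputTypes turns the variable's value into a set of input types.
// Assumption: the value is a comma-separated list such as "httpjson,cel".
func parseInputTypes(v string) map[string]bool {
	types := make(map[string]bool)
	for _, t := range strings.Split(v, ",") {
		if t = strings.TrimSpace(t); t != "" {
			types[t] = true
		}
	}
	return types
}

// backendFor returns "elasticsearch" for opted-in input types and
// "file" otherwise, so an unset variable keeps the existing behaviour.
func backendFor(typ string, enabled map[string]bool) string {
	if enabled[typ] {
		return "elasticsearch"
	}
	return "file"
}

func main() {
	enabled := parseInputTypes(os.Getenv(envInputTypes))
	for _, typ := range []string{"httpjson", "cel", "aws-s3"} {
		fmt.Printf("%s -> %s\n", typ, backendFor(typ, enabled))
	}
}
```

With the variable unset or empty, every input type in this sketch resolves to the file backend.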