The entity-analytics input stores state in a local bbolt database (one file per input at <data_dir>/kvstore/<input_id>.db). Agentless environments don't provide persistent local storage, so the input can't run there.
This issue tracks adding a minimal-state mode that stores only checkpoint data (continuation tokens, timestamps, entity ID sets) in the Elasticsearch state store — the same mechanism CEL and httpjson inputs already use in agentless deployments. The minimal-state mode coexists with the existing implementation behind a config option.
Approach
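The current implementation stores full entity data (users, devices, groups) locally for deletion detection and, in EntraID's case, transitive group membership computation. All four providers can operate with just checkpoint state:
Per-provider checkpoint state (table entries; provider labels not recoverable): whenChanged timestamp; whenChanged timestamp + entity ID set; lastUpdated timestamp + entity ID set.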
EntraID achieves minimal state by fetching all groups with members on each sync and computing transitive membership in working storage rather than persisting the full entity graph between syncs.
For AD, Okta, and Jamf, deletion detection stores the previous sync's entity ID set in the ES state store.
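The two points above can be sketched in Go. This is only an illustration under assumptions, not the input's actual code: `deletedIDs` and `transitiveGroups` are hypothetical helper names, entity and group IDs are assumed to be plain strings, and `parents` is an assumed in-memory map built from the groups fetched during the current sync.

```go
package main

import (
	"fmt"
	"sort"
)

// deletedIDs returns the IDs present in the previous sync's set but absent
// from the current one, sorted for stable output. (Hypothetical helper.)
func deletedIDs(prev, curr map[string]struct{}) []string {
	var gone []string
	for id := range prev {
		if _, ok := curr[id]; !ok {
			gone = append(gone, id)
		}
	}
	sort.Strings(gone)
	return gone
}

// transitiveGroups returns every group reachable from the direct groups by
// following parent links. parents maps a group ID to the groups it belongs
// to. The seen set keeps membership cycles from looping forever.
func transitiveGroups(direct []string, parents map[string][]string) []string {
	seen := make(map[string]struct{})
	stack := append([]string(nil), direct...)
	for len(stack) > 0 {
		g := stack[len(stack)-1]
		stack = stack[:len(stack)-1]
		if _, ok := seen[g]; ok {
			continue
		}
		seen[g] = struct{}{}
		stack = append(stack, parents[g]...)
	}
	out := make([]string, 0, len(seen))
	for g := range seen {
		out = append(out, g)
	}
	sort.Strings(out)
	return out
}

func main() {
	prev := map[string]struct{}{"u1": {}, "u2": {}, "u3": {}}
	curr := map[string]struct{}{"u1": {}, "u3": {}, "u4": {}}
	fmt.Println(deletedIDs(prev, curr)) // [u2]

	parents := map[string][]string{"g1": {"g2"}, "g2": {"g3"}, "g3": {"g1"}}
	fmt.Println(transitiveGroups([]string{"g1"}, parents)) // [g1 g2 g3]
}
```

In this sketch only the ID set needs to survive between syncs. The group graph is rebuilt in memory each time and then discarded.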
A config option selects the implementation. Both modes produce identical output documents, so the legacy implementation remains available as a fallback during validation. The option will be hidden in agentless mode, where minimal-state is required.
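A rough sketch of how such an option might be resolved follows. The issue does not name the option or its values, so `StateMode`, `"legacy"`, `"minimal"`, and the `Agentless` flag are all hypothetical. Falling back to legacy when the option is unset outside agentless is likewise an assumption.

```go
package main

import (
	"errors"
	"fmt"
)

// stateMode is a hypothetical config value; the issue does not name the option.
type stateMode string

const (
	modeLegacy  stateMode = "legacy"  // local bbolt store (existing behaviour)
	modeMinimal stateMode = "minimal" // checkpoint-only state in the ES state store
)

// config is a hypothetical slice of the input configuration.
type config struct {
	StateMode stateMode
	Agentless bool
}

// resolveMode picks the implementation to run. Agentless deployments have no
// persistent local storage, so they can only run minimal-state mode.
func resolveMode(c config) (stateMode, error) {
	switch c.StateMode {
	case "":
		if c.Agentless {
			return modeMinimal, nil
		}
		return modeLegacy, nil // assumed default, not stated in the issue
	case modeLegacy:
		if c.Agentless {
			return "", errors.New("legacy state mode needs persistent local storage")
		}
		return modeLegacy, nil
	case modeMinimal:
		return modeMinimal, nil
	default:
		return "", fmt.Errorf("unknown state mode %q", c.StateMode)
	}
}

func main() {
	m, err := resolveMode(config{Agentless: true})
	fmt.Println(m, err) // minimal <nil>

	_, err = resolveMode(config{StateMode: modeLegacy, Agentless: true})
	fmt.Println(err != nil) // true
}
```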
Implementation runtime
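The minimal-state approach applies regardless of implementation runtime. Two paths are available:
Filebeat input: Refactor the existing input to accept statestore.States (x-pack/filebeat/input/entityanalytics: refactor input to accept statestore.States #49160).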
OTel receiver: Implement as a new OpenTelemetry receiver. The elasticsearchstorage extension provides ES-backed state storage in the OTel runtime, making this path viable.
The per-provider sync flows, state requirements, and output documents are identical in both cases.
Fleet integration packages will default to minimal-state mode and hide the option in agentless deployments.
Downstream latest transforms can provide a "current state" view of entities and handle deletion detection via TTL. These are recommended but not required by this work.
Related work
Add entity-analytics to AGENTLESS_ELASTICSEARCH_STATE_STORE_INPUT_TYPES.
Sub-issues
Refactor input to accept statestore.States and add config option for minimal-state mode