Currently, the data stream permissions are specific to the dataset that an integration defines.
For example, when adding a custom log integration, you have to specify the data_stream.dataset (say, foo), and an API key is generated with permissions to send data only to logs-foo-default.
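To make the current behavior concrete, here is a minimal Python sketch of how the dataset determines the data stream name and the index permission attached to the API key. The function names are hypothetical, and the privilege list (auto_configure, create_doc) is an assumption about what is granted, not a statement of the actual implementation.

```python
def data_stream_name(type_, dataset, namespace="default"):
    """Build a data stream name following the <type>-<dataset>-<namespace> scheme."""
    return f"{type_}-{dataset}-{namespace}"


def api_key_index_permission(type_, dataset, namespace="default"):
    """Hypothetical helper: the single index permission entry tied to one dataset."""
    return {
        "names": [data_stream_name(type_, dataset, namespace)],
        # Assumed privilege set; the real one may differ.
        "privileges": ["auto_configure", "create_doc"],
    }


permission = api_key_index_permission("logs", "foo")
print(permission["names"])
```

With dataset foo, the only index name covered is logs-foo-default, which is why events cannot land anywhere else.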
This is limiting for integrations that define a data stream that routes events to other data streams. See:
Example: a log line like this gets sent to the logs-ecs_router-default data stream:
{
"message": "{\"@timestamp\":\"2022-04-01T12:09:12.375Z\", \"log.level\": \"INFO\", \"message\":\"With event.dataset\", \"data_stream.dataset\": \"foo\"}"
}
The default ingest pipeline for logs-ecs_router-default parses the JSON within the message and uses the data_stream.dataset from the log message to route the message to the logs-foo-default data stream (by overriding the _index field).
The issue is that this will lead to a security exception as the API key used to ingest the data only has permissions to ingest data to logs-ecs_router-default.
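The failure can be reproduced with a small simulation. This is only a sketch in Python: route() stands in for the ingest pipeline's rerouting step, is_allowed() stands in for the API key's index permission check, and both names are hypothetical. Wildcard matching via fnmatch is an approximation of how index name patterns are evaluated.

```python
import json
from fnmatch import fnmatchcase


def route(event, default_index):
    """Mimic the router pipeline: parse the JSON in `message` and pick the target index."""
    parsed = json.loads(event["message"])
    dataset = parsed.get("data_stream.dataset")
    if dataset is None:
        return default_index
    return f"logs-{dataset}-default"


def is_allowed(index, allowed_patterns):
    """Approximate the API key check: the index must match one granted pattern."""
    return any(fnmatchcase(index, pattern) for pattern in allowed_patterns)


event = {
    "message": '{"@timestamp":"2022-04-01T12:09:12.375Z", "log.level": "INFO", '
               '"message":"With event.dataset", "data_stream.dataset": "foo"}'
}

target = route(event, "logs-ecs_router-default")
granted = ["logs-ecs_router-default"]  # what the API key covers today

print(target)                       # logs-foo-default
print(is_allowed(target, granted))  # False
```

The event is rerouted to logs-foo-default, which the key's single granted name does not cover, so the write is rejected.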
This is also an issue for the Azure springcloudlogs integration: all logs are always sent to the springcloudlogs data stream, even if the logs come from different applications and, thus, should ideally be routed to their own data streams. Other examples are CloudWatch, k8s logs, PCF logs, and httpjson.
This relates to the discussions about input-only packages but is an independent and decoupled task.
What I'm proposing is to add a flag to the package spec that behaves similarly to dataset_is_prefix (package-spec/versions/1/data_stream/manifest.spec.yml, lines 129 to 132 in d017330):
    dataset_is_prefix:
      description: if true, the index pattern in the ES template will contain the dataset as a prefix only
      type: boolean
      default: false
But instead of just adding .* to the index permissions, the flag will allow access to all data streams of a given type, such as logs-*-*.
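The difference between the three cases can be sketched as follows. This Python snippet is illustrative only: allow_routing is a placeholder name for the proposed flag (no name has been decided), the exact shape of the dataset_is_prefix pattern is an assumption, and fnmatch is used as a stand-in for index pattern matching.

```python
from fnmatch import fnmatchcase


def index_permission_pattern(type_, dataset, namespace="default",
                             dataset_is_prefix=False, allow_routing=False):
    """Return the index pattern an API key would be granted in each case."""
    if allow_routing:
        # Proposed behavior: every data stream of the given type.
        return f"{type_}-*-*"
    if dataset_is_prefix:
        # Assumed shape of the existing prefix behavior: dataset plus ".*".
        return f"{type_}-{dataset}.*-{namespace}"
    # Default: exactly one data stream.
    return f"{type_}-{dataset}-{namespace}"


target = "logs-foo-default"
exact = index_permission_pattern("logs", "ecs_router")
prefix = index_permission_pattern("logs", "ecs_router", dataset_is_prefix=True)
routed = index_permission_pattern("logs", "ecs_router", allow_routing=True)

for pattern in (exact, prefix, routed):
    print(pattern, fnmatchcase(target, pattern))
```

Only the logs-*-* pattern covers the rerouted target; neither the exact name nor the prefix pattern does.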
@ruflin @mtojek @joshdover