Skip to content

Align JSON logs better with ECS#67266

Merged
pugnascotia merged 12 commits intoelastic:masterfrom
pugnascotia:deperecation-logging-ecs-fixes
Jan 25, 2021
Merged

Align JSON logs better with ECS#67266
pugnascotia merged 12 commits intoelastic:masterfrom
pugnascotia:deperecation-logging-ecs-fixes

Conversation

@pugnascotia
Copy link
Copy Markdown
Contributor

The JSON logs that Elasticsearch produces are roughly in an ECS shape. This PR improves that alignment.

@elasticmachine elasticmachine added the Team:Core/Infra Meta label for core/infra team label Jan 11, 2021
@elasticmachine
Copy link
Copy Markdown
Collaborator

Pinging @elastic/es-core-infra (Team:Core/Infra)

Copy link
Copy Markdown
Member

@yaauie yaauie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From my perspective, this looks to address the commentary seen elsewhere. I'm neither an ECS expert nor do I have extensive ES internals knowledge.

Copy link
Copy Markdown
Contributor

@pgomulka pgomulka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it looks great, I left few cosmetic comments

};
return new KeyValuePair[] {
new KeyValuePair("event.dataset", type),
new KeyValuePair("elasticsearch.cluster.uuid", "%cluster_id"),
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

elasticsearch.* are our own defined in ECS fields?
or they are just a custom fields which are not defined there?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ry told me that the ECS team recommends prefixing Elasticsearch-specific things like cluster and node with elasticsearch.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These fields are not part of ECS but some of them exist in the Filebeat and Metricbeat module: https://www.elastic.co/guide/en/beats/metricbeat/master/metricbeat-module-elasticsearch.html It might be good to align there.

new KeyValuePair("cluster.name","${sys:es.logs.cluster_name}"),
};
return new KeyValuePair[] {
new KeyValuePair("event.dataset", type),
Copy link
Copy Markdown
Contributor

@pgomulka pgomulka Jan 12, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also, do you think it would worthy to change the property used in log4j2.properties file ? from type_name to dataset?
I used type_name only because type is already being used by log4j. With dataset we won't have that problem and I think it would be more clear

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've changed type and type_name to dataset throughout. Do you know whether there's anything breaking about doing that?

@ruflin
Copy link
Copy Markdown
Contributor

ruflin commented Jan 12, 2021

@ravi-elastic This change might effect the collection of logs for stack monitoring.

appender.rolling.name = rolling
appender.rolling.layout.type = ECSJsonLayout
appender.rolling.layout.type_name = server
appender.rolling.layout.dataset = server
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did not play around with the code itself. But if I think of the dataset, it is a unique name. So it should be elasticsearch.server. Does this make sense?

I'm assuming it follows the logic of the new naming scheme: https://www.elastic.co/blog/an-introduction-to-the-elastic-data-stream-naming-scheme but event.dataset should not be different.

@ruflin
Copy link
Copy Markdown
Contributor

ruflin commented Jan 12, 2021

One thing that would be nice to see for review is a log event before and after the change for comparison and also to understand how the final fields look like.

@pugnascotia
Copy link
Copy Markdown
Contributor Author

@ruflin here we go:

Before:

{
  "@timestamp": "2021-01-13T15:48:16.642Z",
  "cluster.name": "runTask",
  "cluster.uuid": "EV57vhNXQ0iZqPb2pUQfFQ",
  "data_stream.dataset": "deprecation.elasticsearch",
  "data_stream.namespace": "default",
  "data_stream.type": "logs",
  "ecs.version": "1.6",
  "key": "synced_flush",
  "log.level": "DEPRECATION",
  "log.logger": "org.elasticsearch.deprecation.rest.action.admin.indices.RestSyncedFlushAction",
  "message": "Synced flush was removed and a normal flush was performed instead. This transition will be removed in a future version.",
  "node.id": "t1EH7lbISbCZylE9BJ8Z_w",
  "node.name": "runTask-0",
  "process.thread.name": "elasticsearch[runTask-0][transport_worker][T#6]",
  "service.name": "ES_ECS",
  "type": "deprecation"
}

After:

{
  "@timestamp": "2021-01-13T15:44:20.486Z",
  "data_stream.dataset": "elasticsearch.deprecation",
  "data_stream.type": "logs",
  "ecs.version": "1.6",
  "elasticsearch.cluster.name": "runTask",
  "elasticsearch.cluster.uuid": "yjW_iFxzTmGocjNKga2p-Q",
  "elasticsearch.node.id": "wKqe9WlkRIuvUwpuTecrXA",
  "elasticsearch.node.name": "runTask-0",
  "event.code": "synced_flush",
  "event.dataset": "deprecation",
  "log.level": "DEPRECATION",
  "log.logger": "org.elasticsearch.deprecation.rest.action.admin.indices.RestSyncedFlushAction",
  "message": "Synced flush was removed and a normal flush was performed instead. This transition will be removed in a future version.",
  "process.thread.name": "elasticsearch[runTask-0][transport_worker][T#6]",
  "service.name": "ES_ECS"
}

@pgomulka
Copy link
Copy Markdown
Contributor

we tend to update https://github.com/elastic/beats/tree/master/filebeat/module/elasticsearch/deprecation/test also with our sample logs. I think I was usually using some of the tests output. Will try to find this testcase and comment here

@ruflin
Copy link
Copy Markdown
Contributor

ruflin commented Jan 14, 2021

@pugnascotia This looks great. Nit: In the second event example data_stream.namespace is missing. Is that on purpose or maybe even a feature because the user can configure it?

@pugnascotia
Copy link
Copy Markdown
Contributor Author

@ruflin I must have misunderstood a comment elsewhere and thought that data_stream.namespace wasn't necessary, but your last comment implies that it is. It's not configurable, so what value should it have?

We only want to change `ECSJsonLayout`.
@ruflin
Copy link
Copy Markdown
Contributor

ruflin commented Jan 15, 2021

If it is not configurable, it should default to default as the value. If the logs are collected by Elastic Agent, I would assume this would be filled in / completed if missing or overwritten if a different namespace should be selected. But default seems like a good choice for now.

@pugnascotia pugnascotia merged commit c841b2c into elastic:master Jan 25, 2021
@pugnascotia pugnascotia deleted the deperecation-logging-ecs-fixes branch January 25, 2021 10:43
pugnascotia added a commit to pugnascotia/elasticsearch that referenced this pull request Jan 25, 2021
The JSON logs that Elasticsearch produces are roughly in an ECS shape. This PR improves
that alignment.
pugnascotia added a commit that referenced this pull request Jan 26, 2021
Backport / reimplementation of #67266.

The JSON logs that Elasticsearch produces are roughly in an ECS shape. This PR improves
that alignment. Since `7.x` has it's own version of `EcsJsonLayout`, most of the changes
are there, though I also had to backport `ClusterIdConverter` and `NodeIdConverter`.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

:Core/Infra/Logging Log management and logging utilities >enhancement Team:Core/Infra Meta label for core/infra team v7.12.0 v8.0.0-alpha1

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants