Align JSON logs better with ECS#67266
Conversation
|
Pinging @elastic/es-core-infra (Team:Core/Infra) |
yaauie
left a comment
There was a problem hiding this comment.
From my perspective, this looks to address the commentary seen elsewhere. I'm neither an ECS expert nor do I have extensive ES internals knowledge.
pgomulka
left a comment
There was a problem hiding this comment.
it looks great, I left few cosmetic comments
qa/logging-config/src/test/java/org/elasticsearch/common/logging/JsonLoggerTests.java
Outdated
Show resolved
Hide resolved
| }; | ||
| return new KeyValuePair[] { | ||
| new KeyValuePair("event.dataset", type), | ||
| new KeyValuePair("elasticsearch.cluster.uuid", "%cluster_id"), |
There was a problem hiding this comment.
elasticsearch.* are our own defined in ECS fields?
or they are just a custom fields which are not defined there?
There was a problem hiding this comment.
Ry told me that the ECS team recommends prefixing Elasticsearch-specific things like cluster and node with elasticsearch.
There was a problem hiding this comment.
These fields are not part of ECS but some of them exist in the Filebeat and Metricbeat module: https://www.elastic.co/guide/en/beats/metricbeat/master/metricbeat-module-elasticsearch.html It might be good to align there.
server/src/main/java/org/elasticsearch/common/logging/RateLimitingFilter.java
Show resolved
Hide resolved
...ion/qa/rest/src/javaRestTest/java/org/elasticsearch/xpack/deprecation/DeprecationHttpIT.java
Show resolved
Hide resolved
| new KeyValuePair("cluster.name","${sys:es.logs.cluster_name}"), | ||
| }; | ||
| return new KeyValuePair[] { | ||
| new KeyValuePair("event.dataset", type), |
There was a problem hiding this comment.
also, do you think it would worthy to change the property used in log4j2.properties file ? from type_name to dataset?
I used type_name only because type is already being used by log4j. With dataset we won't have that problem and I think it would be more clear
There was a problem hiding this comment.
I've changed type and type_name to dataset throughout. Do you know whether there's anything breaking about doing that?
|
@ravi-elastic This change might effect the collection of logs for stack monitoring. |
| appender.rolling.name = rolling | ||
| appender.rolling.layout.type = ECSJsonLayout | ||
| appender.rolling.layout.type_name = server | ||
| appender.rolling.layout.dataset = server |
There was a problem hiding this comment.
Did not play around with the code itself. But if I think of the dataset, it is a unique name. So it should be elasticsearch.server. Does this make sense?
I'm assuming it follows the logic of the new naming scheme: https://www.elastic.co/blog/an-introduction-to-the-elastic-data-stream-naming-scheme but event.dataset should not be different.
|
One thing that would be nice to see for review is a log event before and after the change for comparison and also to understand how the final fields look like. |
|
@ruflin here we go: Before: {
"@timestamp": "2021-01-13T15:48:16.642Z",
"cluster.name": "runTask",
"cluster.uuid": "EV57vhNXQ0iZqPb2pUQfFQ",
"data_stream.dataset": "deprecation.elasticsearch",
"data_stream.namespace": "default",
"data_stream.type": "logs",
"ecs.version": "1.6",
"key": "synced_flush",
"log.level": "DEPRECATION",
"log.logger": "org.elasticsearch.deprecation.rest.action.admin.indices.RestSyncedFlushAction",
"message": "Synced flush was removed and a normal flush was performed instead. This transition will be removed in a future version.",
"node.id": "t1EH7lbISbCZylE9BJ8Z_w",
"node.name": "runTask-0",
"process.thread.name": "elasticsearch[runTask-0][transport_worker][T#6]",
"service.name": "ES_ECS",
"type": "deprecation"
}After: {
"@timestamp": "2021-01-13T15:44:20.486Z",
"data_stream.dataset": "elasticsearch.deprecation",
"data_stream.type": "logs",
"ecs.version": "1.6",
"elasticsearch.cluster.name": "runTask",
"elasticsearch.cluster.uuid": "yjW_iFxzTmGocjNKga2p-Q",
"elasticsearch.node.id": "wKqe9WlkRIuvUwpuTecrXA",
"elasticsearch.node.name": "runTask-0",
"event.code": "synced_flush",
"event.dataset": "deprecation",
"log.level": "DEPRECATION",
"log.logger": "org.elasticsearch.deprecation.rest.action.admin.indices.RestSyncedFlushAction",
"message": "Synced flush was removed and a normal flush was performed instead. This transition will be removed in a future version.",
"process.thread.name": "elasticsearch[runTask-0][transport_worker][T#6]",
"service.name": "ES_ECS"
} |
|
we tend to update https://github.com/elastic/beats/tree/master/filebeat/module/elasticsearch/deprecation/test also with our sample logs. I think I was usually using some of the tests output. Will try to find this testcase and comment here |
|
@pugnascotia This looks great. Nit: In the second event example |
|
@ruflin I must have misunderstood a comment elsewhere and thought that |
We only want to change `ECSJsonLayout`.
|
If it is not configurable, it should default to |
The JSON logs that Elasticsearch produces are roughly in an ECS shape. This PR improves that alignment.
Backport / reimplementation of #67266. The JSON logs that Elasticsearch produces are roughly in an ECS shape. This PR improves that alignment. Since `7.x` has it's own version of `EcsJsonLayout`, most of the changes are there, though I also had to backport `ClusterIdConverter` and `NodeIdConverter`.
The JSON logs that Elasticsearch produces are roughly in an ECS shape. This PR improves that alignment.