Support trace groups in standard otel proto codec#5539
Support trace groups in standard otel proto codec#5539kkondaka merged 4 commits intoopensearch-project:mainfrom
Conversation
|
@KarstenSchnitter @dlvenable please review |
KarstenSchnitter
left a comment
There was a problem hiding this comment.
Thanks for providing this change.
I managed to review the logs and metrics template. Some of the comments on logs translate to metrics as well. I will review more of this PR over the weekend.
| "type": "text", | ||
| "fields": { | ||
| "keyword": { | ||
| "type": "keyword", | ||
| "ignore_above": 128 | ||
| } | ||
| } |
There was a problem hiding this comment.
I do not think, that it is necessary to do this kind of pattern on almost all of the OTel fields. They are usually very short to save space. instrumentationScope.name usually takes the form of a url, e.g. io.opentelemetry.runtime-telemetry-java8. I recommend using only the keyword field type. 128 characters also seems to be more than enough. Is there a particular reason to go for text as well?
| "fields": { | ||
| "keyword": { | ||
| "type": "keyword", | ||
| "ignore_above": 256 |
There was a problem hiding this comment.
This is a semantic version number. 256 seams way to long IMHO.
| "path": "observedTimestamp" | ||
| }, | ||
| "traceId": { | ||
| "ignore_above": 256, |
There was a problem hiding this comment.
This will always be 32 characters of a hex encoded 16 byte array.
| "type": "keyword" | ||
| }, | ||
| "spanId": { | ||
| "ignore_above": 256, |
There was a problem hiding this comment.
This will always be 16 characters of a hex encoded 8 byte array.
| "type": "keyword" | ||
| }, | ||
| "flags": { | ||
| "type": "integer" |
There was a problem hiding this comment.
This is a bit mask. I am not sure, whether there are better field types available.
...ns/opensearch/src/main/resources/index-template/metrics-otel-v1-index-standard-template.json
Show resolved
Hide resolved
| }, | ||
| "kind": { | ||
| "type": "keyword", | ||
| "ignore_above": 128 |
There was a problem hiding this comment.
This is actually an enum with well-known values. The longest is EXPONENTIAL_HISTOGRAM.
...ns/opensearch/src/main/resources/index-template/metrics-otel-v1-index-standard-template.json
Show resolved
Hide resolved
| "type": "date" | ||
| }, | ||
| "@timestamp": { | ||
| "type": "date" |
There was a problem hiding this comment.
Why is this not date_nanos?
| "type": "double" | ||
| }, | ||
| "max": { | ||
| "type": "float" |
There was a problem hiding this comment.
Why not use double?
KarstenSchnitter
left a comment
There was a problem hiding this comment.
I reviewed everything I left out in my last review. I haved worked quite a lot with mapping templates for OTel data. I can only recommend to use component templates to ease the definition. Resource attributes and Instrumentation Scope handling can be shared amongst the signals. It needs to be the same for efficient analysis anyways. We also should give users the option to extend the templates for covering semantic conventions on the attributes, e..g., specifying http.status.code to be an integer. I added a lot of coments on the mappings, which also apply to the other signals and sometimes other fields.
| "resource_attributes_map": { | ||
| "mapping": { | ||
| "type": "keyword" | ||
| }, | ||
| "path_match": "resource.attributes.*" | ||
| } | ||
| }, | ||
| { | ||
| "span_attributes_map": { | ||
| "mapping": { | ||
| "type": "keyword" | ||
| }, | ||
| "path_match": "span.attributes.*" | ||
| } | ||
| } | ||
| ], |
There was a problem hiding this comment.
Why do you porpose dynamic templates for span and resource attributes but not instrumentation scope? For logs and metrics there was a mapping to object. I like ethe dynamic templating better. But this should be uniform for all three signals and all three attribute collections.
...s/opensearch/src/main/resources/index-template/otel-v1-apm-span-index-standard-template.json
Show resolved
Hide resolved
| } | ||
| }, | ||
| "kind": { | ||
| "ignore_above": 128, |
There was a problem hiding this comment.
This is actually an enum with known values. The longest is SPAN_KIND_UNSPECIFIED.
...s/opensearch/src/main/resources/index-template/otel-v1-apm-span-index-standard-template.json
Show resolved
Hide resolved
| "attributes": { | ||
| "type": "object" | ||
| }, |
There was a problem hiding this comment.
This is semantically the same as the span attributes from the dynamic templates.
| "type": "text", | ||
| "fields": { | ||
| "keyword": { | ||
| "type": "keyword", | ||
| "ignore_above": 256 | ||
| } | ||
| } | ||
| } |
There was a problem hiding this comment.
Just using a keyword should be sufficient. There are two schemaUrls though: One for resource attributes and one for instrumenation scope attributes. Logs and metrics are missing this entry entirely.
| "ignore_above": 1024, | ||
| "type": "keyword" | ||
| }, | ||
| "traceGroupFields": { |
There was a problem hiding this comment.
It would actually be nice to add (resource) attributes to the trace group in a future extension.
| }, | ||
| "flags": 0, | ||
| "severityNumber": 2, | ||
| "serviceName": "testServiceName", |
There was a problem hiding this comment.
This should have stayed.
| "kind": "GAUGE", | ||
| "flags": 0, | ||
| "description": "testGaugeDescription", | ||
| "serviceName": null, |
There was a problem hiding this comment.
THis should have stayed, but not using null which is somewhat invalid. All OTel resources SHOULD have a service name.
...mmon/src/main/java/org/opensearch/dataprepper/plugins/otel/codec/OTelProtoStandardCodec.java
Show resolved
Hide resolved
|
@KarstenSchnitter thank you very much for the detailed review. Since you have experience, would it be possible to provide me the complete template files? |
Yes, I am going to work on it today. |
|
@kkondaka please find my proposal in kkondaka#2. |
| { | ||
| "double_attributes": { | ||
| "mapping": { | ||
| "type": "double" | ||
| }, | ||
| "path_match": "attributes.*", | ||
| "match_mapping_type": "double" | ||
| } | ||
| } |
There was a problem hiding this comment.
Can you add at least string attributs, if the default was not working? And similar for the other attributes.
| { | |
| "double_attributes": { | |
| "mapping": { | |
| "type": "double" | |
| }, | |
| "path_match": "attributes.*", | |
| "match_mapping_type": "double" | |
| } | |
| } | |
| { | |
| "double_attributes": { | |
| "mapping": { | |
| "type": "double" | |
| }, | |
| "path_match": "attributes.*", | |
| "match_mapping_type": "double" | |
| } | |
| }, | |
| { | |
| "string_attributes": { | |
| "mapping": { | |
| "type": "keyword", | |
| "ignore_above": 256 | |
| }, | |
| "path_match": "attributes.*", | |
| "match_mapping_type": "string" | |
| } | |
| } | |
There was a problem hiding this comment.
@KarstenSchnitter I have the change and verified that it works great. Please take a look at the diffs. Let's plan to make any other suggestions you may have in another PR
There was a problem hiding this comment.
@kkondaka: Great, that it works. Can you add the string attributes for resource and instrumentation scope attributes as well? Then we should be good to go.
Signed-off-by: Krishna Kondaka <krishkdk@amazon.com>
…gs/traces/metrics Signed-off-by: Krishna Kondaka <krishkdk@amazon.com>
Signed-off-by: Krishna Kondaka <krishkdk@amazon.com>
ecdd704 to
076047f
Compare
Signed-off-by: Krishna Kondaka <krishkdk@amazon.com>
KarstenSchnitter
left a comment
There was a problem hiding this comment.
The changed (non-standard) metrics template is now missing the string attributes. Please add them. Otherwise, this looks good.
| "path_match": "resource.attributes.*", | ||
| "match_mapping_type": "double" | ||
| } | ||
| }, |
There was a problem hiding this comment.
This one is now missing the string attributes for resources, scope and plain.
There was a problem hiding this comment.
Thanks @KarstenSchnitter. Will do a separate PR for this.
There was a problem hiding this comment.
These are existing mappings. Not focusing on these in this PR, anyways.
| for (final Span span : spans) { | ||
| assertThat(span.getTraceGroup(), nullValue()); | ||
| assertThat(span.getTraceGroupFields(), nullValue()); | ||
| if (span.getParentSpanId().isEmpty()) { |
There was a problem hiding this comment.
This is a somewhat loose assertion. Do we even know if we ever get non-empty spans?
There was a problem hiding this comment.
This is the exact same code that exists in OTelProtoOpensearchCodecTest. I did not include previously because there was not tracegroups support previously.
There was a problem hiding this comment.
Yea, I see that you have expanded the scope of this test.
Could it be split into two different tests? One for each scenario?
There was a problem hiding this comment.
@dlvenable The test data is coming from resources/test-request.json and it has two spans with parentSpan and one without parentSpan. So, I think this test is fine.
| private static final String DROPPED_LINKS_COUNT_KEY = "droppedLinksCount"; | ||
| private static final String SERVICE_NAME_KEY = "serviceName"; | ||
| private static final String TRACE_GROUP_KEY = "traceGroup"; | ||
| public static final String SERVICE_NAME_KEY = "serviceName"; |
There was a problem hiding this comment.
If you are using these in the tests, make them package protected. Remove public.
There was a problem hiding this comment.
It needs to be public because it is used in data-prepper-plugins/otel-proto-common/src/main/java/org/opensearch/dataprepper/plugins/otel/codec/OTelProtoStandardCodec.java
|
|
||
| public enum IndexType { | ||
| TRACE_ANALYTICS_RAW("trace-analytics-raw"), | ||
| TRACE_ANALYTICS_RAW_STANDARD("trace-analytics-standard-raw"), |
There was a problem hiding this comment.
The term standard is unclear. Can you give a more precise adjective? Perhaps otel?
There was a problem hiding this comment.
We can resolve in a follow-on PR.
…#5539) * Support trace groups in standard otel proto codec Signed-off-by: Krishna Kondaka <krishkdk@amazon.com> * Addressed review comments. Verified the functionality for standard logs/traces/metrics Signed-off-by: Krishna Kondaka <krishkdk@amazon.com> * Addressed review comments Signed-off-by: Krishna Kondaka <krishkdk@amazon.com> * Added string mappings for resource and scope Signed-off-by: Krishna Kondaka <krishkdk@amazon.com> --------- Signed-off-by: Krishna Kondaka <krishkdk@amazon.com>
…#5539) * Support trace groups in standard otel proto codec Signed-off-by: Krishna Kondaka <krishkdk@amazon.com> * Addressed review comments. Verified the functionality for standard logs/traces/metrics Signed-off-by: Krishna Kondaka <krishkdk@amazon.com> * Addressed review comments Signed-off-by: Krishna Kondaka <krishkdk@amazon.com> * Added string mappings for resource and scope Signed-off-by: Krishna Kondaka <krishkdk@amazon.com> --------- Signed-off-by: Krishna Kondaka <krishkdk@amazon.com>
…#5539) * Support trace groups in standard otel proto codec Signed-off-by: Krishna Kondaka <krishkdk@amazon.com> * Addressed review comments. Verified the functionality for standard logs/traces/metrics Signed-off-by: Krishna Kondaka <krishkdk@amazon.com> * Addressed review comments Signed-off-by: Krishna Kondaka <krishkdk@amazon.com> * Added string mappings for resource and scope Signed-off-by: Krishna Kondaka <krishkdk@amazon.com> --------- Signed-off-by: Krishna Kondaka <krishkdk@amazon.com>
…#5539) * Support trace groups in standard otel proto codec Signed-off-by: Krishna Kondaka <krishkdk@amazon.com> * Addressed review comments. Verified the functionality for standard logs/traces/metrics Signed-off-by: Krishna Kondaka <krishkdk@amazon.com> * Addressed review comments Signed-off-by: Krishna Kondaka <krishkdk@amazon.com> * Added string mappings for resource and scope Signed-off-by: Krishna Kondaka <krishkdk@amazon.com> --------- Signed-off-by: Krishna Kondaka <krishkdk@amazon.com> Signed-off-by: mamol27 <mamol27@yandex.ru>
Description
Issues Resolved
Resolves #5259
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.