Introduce otlp/http support in OTelTraceSource#5322
Introduce otlp/http support in OTelTraceSource#5322KarstenSchnitter merged 31 commits intoopensearch-project:mainfrom
Conversation
KarstenSchnitter
left a comment
There was a problem hiding this comment.
Thanks for providing this PR. This is intended to start a discussion about this change. @dlvenable we would like your opinion as well as of all other maintainers. There are still a couple of TODOs in the comment, that need to be addressed. I think, they mark points waiting for clarification.
Tomas, can you explain a little bit more about the upgrade of the otlp-proto classes. Did you need to remove the instrumentation library support because of that? Do you know, how this change behaves if the old format is ingested?
.../integrationTest/java/org/opensearch/dataprepper/plugins/kafka/buffer/KafkaBufferOTelIT.java
Outdated
Show resolved
Hide resolved
...st/java/org/opensearch/dataprepper/plugins/processor/otelmetrics/MetricsPluginGaugeTest.java
Outdated
Show resolved
Hide resolved
...c/test/java/org/opensearch/dataprepper/plugins/source/otelmetrics/OTelMetricsSourceTest.java
Outdated
Show resolved
Hide resolved
...a/org/opensearch/dataprepper/plugins/source/otelmetrics/OTelMetricsSource_RetryInfoTest.java
Outdated
Show resolved
Hide resolved
...proto-common/src/main/java/org/opensearch/dataprepper/plugins/otel/codec/OTelProtoCodec.java
Outdated
Show resolved
Hide resolved
Yes: According to the deprecation message in 0.16.0
My Tests so far:
All the tests have been made against the new http endpoint (using json as input). I have yet to test behaviour of the gRPC endpoint (I don't expect a different behaviour, though, since both endpoints use the same decoding functionality) [UPDATE ON TESTING 2025/01/17] The spec of 0.16.0 carried details on how senders and receivers should behave during the grace period:
According to that comment, the data-prepper behaves correctly now, and did adhere to the specfication in the past (by using either IMO, if Data-Prepper clients followed the recommendations, there should be no friction upgrading the proto classes. |
|
I discovered a comment that could be important for backward compatibility in the 0.16.0 deprecation notice:
I'm not that fluent in protbuf speech yet to really grasp the implication of that statement. Does |
|
@TomasLongo I will check the wire compatibility. Can you have a look at the This is also causing the failing tests on the build system. |
| // - it becomes testable | ||
| // cons: | ||
| // - currently tied to one impl by using 'new'. | ||
| return new HttpBasicArmeriaHttpAuthenticationProvider(new HttpBasicAuthenticationConfig(pluginSettings.get("username").toString(), pluginSettings.get("password").toString())); |
There was a problem hiding this comment.
Check this line. This causes an NPE in testing, when pluginsSettings is an empty map. Run test case OTelTraceSourceTest.testOptionalHttpAuthServiceInPlaceWithUnauthenticatedDisabled.
There was a problem hiding this comment.
This is one of the two tests (the other beeing testOptionalHttpAuthServiOceInPlace) that causes NPEs. I disabled them for now! However, we should discuss if they are still needed.
The reason is, that the test assumes an invalid auth-config, which data-prepper refuses to ingest in production. The config in the test looks as follows:
otel_trace_source:
authentication:
test:
The tests are nonetheless green in main, because the critical code, the PluginFactory, is completely mocked away in the test. This branch produces NPEs, because the new HTTPService does not use the Pluginfactory to instantiate the AuthProvider.
IMO, this tests could be removed.
There was a problem hiding this comment.
@TomasLongo , We use plugins for authentication to allow for custom authentication which may not always be available in the open source project. We should continue that pattern here as well. Also, we should avoid having passwords and usernames stored in the code.
There was a problem hiding this comment.
ok. done. The auth provider us now created via the plugin factory.
I'm not sure what you mean by "passwords and usernames stored in code", though. I could not find any credentials stored in the code.
|
@TomasLongo Some tests (like ':e2e-test:trace:compileIntegrationTestJava') are failing. Please check |
|
@TomasLongo @KarstenSchnitter @dlvenable should we split this in to 2 PRs? One for just updated OTEL spec version? And another for OTLP_HTTP support? |
6decceb to
090127c
Compare
|
@kkondaka @KarstenSchnitter @dlvenable As suggested, I've created a new PR #5434, containing only the updated otel proto buf spec. I've rebased this branch in order to simply create a patch of the changes for the new spec. You'll need to refetch the branch if you have checked it out before. I'll cleanup this PR here and incorporate your suggestions. However, we still have to address the configuration of the new http service. Any thoughts on that topic? Or should I come up with with a first draft for this? |
|
I've cleaned up the PR as far as I could, without needing further input from maintainer side. The remaining todos cover the following topics:
|
|
Here's a first draft, of how the config might look like. The assumption is, that both endpoints serve the same purpose, so they should share as much config as possible. That's why things like source:
otel_trace_source:
retry_info:
compression: 'none' | 'gzip'
ssl: boolean
request_timeout: int
authentication:
health_check_service: boolean
http:
path: string
grpc:
proto_reflection_service: booleanthe other option could be complete separation. e.g. every config property is duplicated per service, so that they can be configured independently. |
919f533 to
71eb4ab
Compare
|
I like the configuration proposed by @TomasLongo. @dlvenable: What do you think about it? |
|
@TomasLongo has rebased this PR onto main. It now contains authentication. @dlvenable and @kkondaka can you please give some feedback on the trace source changes. This PR is still a draft as this implementation is to be ported to metrics and logs as well.If you prefer smaller PRs, we can turn this PR from draft to ready for review and provide metrics and logs sources in different PRs. What is your take on this? |
|
@TomasLongo please have a look at the failing end-to-end test. Please try to fix them. If the failures are not caused by this PR, please leave us a note about this as a comment. |
065b5b4 to
bd62996
Compare
|
@KarstenSchnitter I could fix the checkstyle stuff. However, the test currently runs into a timeout, which I can not reproduce locally. |
bd62996 to
84f74eb
Compare
|
@dlvenable @KarstenSchnitter I rebased the PR onto main. It's up to date now. |
|
All failed builds are docker builds. It seems like the download of the openjdk-17 is not possible. It quits with a 403. |
|
Thanks for working on that change. I would recommend to mark this as ready for review. Based on the review of this PR, there will be more for the remaining OTel sources including the new one. @dlvenable, what is your thought on that? |
…ccepts unframed requests Signed-off-by: Tomas Longo <tomas.longo@sap.com>
Signed-off-by: Tomas Longo <tomas.longo@sap.com>
Signed-off-by: Tomas Longo <tomas.longo@sap.com>
Signed-off-by: Tomas Longo <tomas.longo@sap.com>
Signed-off-by: Tomas Longo <tomas.longo@sap.com>
Signed-off-by: Tomas Longo <tomas.longo@sap.com>
Signed-off-by: Tomas Longo <tomas.longo@sap.com>
Signed-off-by: Tomas Longo <tomas.longo@sap.com>
Signed-off-by: Tomas Longo <tlongo@sternad.de>
Signed-off-by: Tomas Longo <tlongo@sternad.de>
Signed-off-by: Tomas Longo <tlongo@sternad.de>
Signed-off-by: Tomas Longo <tlongo@sternad.de>
Signed-off-by: Tomas Longo <tlongo@sternad.de>
Signed-off-by: Tomas Longo <tlongo@sternad.de>
87630c2 to
655abe4
Compare
KarstenSchnitter
left a comment
There was a problem hiding this comment.
LGTM. Changes to the other sources are going to be provided along the implementation in the OTelTraceSource.
Signed-off-by: Tomas Longo <tlongo@sternad.de>
| final ConnectionConfiguration.Builder builder = new ConnectionConfiguration.Builder( | ||
| Collections.singletonList("https://127.0.0.1:9200")); | ||
| builder.withUsername("admin"); | ||
| builder.withPassword("admin"); | ||
| builder.withInsecure(true); | ||
| final RestHighLevelClient restHighLevelClient = builder.build().createClient(null); | ||
| // Wait for data to flow through pipeline and be indexed by ES | ||
| await().atLeast(3, TimeUnit.SECONDS).atMost(20, TimeUnit.SECONDS).untilAsserted( | ||
| () -> { | ||
| refreshIndices(restHighLevelClient); | ||
| final SearchRequest searchRequest = new SearchRequest(INDEX_NAME); | ||
| searchRequest.source( | ||
| SearchSourceBuilder.searchSource() | ||
| .size(100) | ||
| .fetchField(TraceGroup.TRACE_GROUP_STATUS_CODE_FIELD) | ||
| .fetchField(TraceGroup.TRACE_GROUP_END_TIME_FIELD, "strict_date_time") | ||
| .fetchField(TraceGroup.TRACE_GROUP_DURATION_IN_NANOS_FIELD) | ||
| ); | ||
| final SearchResponse searchResponse = restHighLevelClient.search(searchRequest, RequestOptions.DEFAULT); | ||
| final List<Map<String, Object>> foundSources = getSourcesFromSearchHits(searchResponse.getHits()); | ||
| assertEquals(expectedDocuments.size(), foundSources.size()); | ||
| /** | ||
| * Our raw trace prepper add more fields than the actual sent object. These are defaults from the proto. | ||
| * So assertion is done if all the expected fields exists. | ||
| * | ||
| * TODO: Can we do better? | ||
| * | ||
| */ | ||
| expectedDocuments.forEach(expectedDoc -> { | ||
| assertTrue(foundSources.stream() | ||
| .filter(i -> i.get("spanId").equals(expectedDoc.get("spanId"))) | ||
| .findFirst().get() | ||
| .entrySet().containsAll(expectedDoc.entrySet())); | ||
| }); | ||
| } | ||
| ); |
There was a problem hiding this comment.
This block sits behind the return statement. It looks like a merge error. Furthermore it causes test failures, since expectedDocuments is undefined. @TomasLongo can you have a look at this?
KarstenSchnitter
left a comment
There was a problem hiding this comment.
There seems to have been a merge error. @TomasLongo can you have a look at this again?
Signed-off-by: Tomas Longo <tlongo@sternad.de>
Signed-off-by: Tomas Longo <tlongo@sternad.de>
KarstenSchnitter
left a comment
There was a problem hiding this comment.
Thanks for all the effort. I approve this PR. @dlvenable I am going to merge this if there are no further requests from your side.
bf397f3
into
opensearch-project:main
|
@KarstenSchnitter @TomasLongo @dlvenable Looks like this broke the OTEL standard mode support because it ignore the |
Description
Introduce http endpoint to support otlp/http for the otel trace source. So far, this draft PR contains:
/opentelemetry.proto.collector.trace.v1.TraceService/ExportThis draft PR serves as starting for further discussions
Configuration of HTTP and gRPC Service
Currently, the source config is responsible for setting up a working gRPC service. Now, an HTTP Service has to be configured as well.
As far as this PR is concerned, we can assign the current config items into two groups:
unframed_requests,proto_reflection_service)compression,authentication) and should/couldThis leads to the questions how the structure of the config should look like in the future
Create distinct sections for every endpoint
Let the endpoint share as much of the current config as possible
e.g. features like
compression,authenticationDo we want to keep unframed requests
My understanding is that unframed requests enable clients to send plain http requests to the gRPC endpoint and would be rendered deprecated by this PR. However, there might be more to this feature which I'm currently unaware of.
Usage of the plugin mechanism to handle configs
The Method
HttpService.createAuthenticationProvidercontains a comment on how the handling of the basic auth config could be made less complex.The basic idea is to not treat simple configs as plugins. The overall developer experience could benefit from using the simple parsing mechanism that is already used for other configs (like RetryInfo).
Again, there could be reasons for using the plugin mechanism which I'm not aware of.
Issues Resolved
Resolves #4983
Related to #5259
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.