Skip to content

pkg/util/log: add otlp log sink#148525

Merged
craig[bot] merged 1 commit intocockroachdb:masterfrom
TheComputerM:otlp-sink
Jul 1, 2025
Merged

pkg/util/log: add otlp log sink#148525
craig[bot] merged 1 commit intocockroachdb:masterfrom
TheComputerM:otlp-sink

Conversation

@TheComputerM
Copy link
Copy Markdown
Contributor

@TheComputerM TheComputerM commented Jun 18, 2025

OpenTelemetry is now an industry standard for o11y and CRDB
should support its logging protocol so that it works
seamlessly with OTEL compatible tooling.

This commit implements the log sink functionality, it uses
gRPC to transmit logs in the OTEL-standard protobuf format.
This sink can be used to ship logs on OTEL compatible targets
like otel-collector, datadog-agent or even directly to
platforms like datadog and Loki.

Parsing the config for the otlp log sink was done in
a previous commit.

Resolves: #143049

Release note (general change): Added otlp log sink
that can export logs to OpenTelemetry-compatible
targets over gRPC.

@cockroach-teamcity
Copy link
Copy Markdown
Member

This change is Reviewable

@TheComputerM TheComputerM force-pushed the otlp-sink branch 3 times, most recently from 0d3aa00 to 0f5c4de Compare June 19, 2025 05:28
@TheComputerM TheComputerM marked this pull request as ready for review June 19, 2025 10:51
@TheComputerM TheComputerM requested review from a team as code owners June 19, 2025 10:51
@TheComputerM TheComputerM requested review from Abhinav1299, aa-joshi, angles-n-daemons and arjunmahishi and removed request for a team June 19, 2025 10:51
Copy link
Copy Markdown
Contributor

@Abhinav1299 Abhinav1299 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can squash your commits to just 1 commit for this PR.


// converts the raw bytes into OTEL log records
func extractRecordsToOTLP(b []byte) []*lpb.LogRecord {
bodies := bytes.Split(b, []byte("\n"))
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: can use early return before splitting bytes on new lines.

if len(b) == 0 {
    return nil 
}

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might not really be needed as there will be a buffered sink attached to the otlp sink most of the time and buffered sink only writes if the buffer is almost full (i.e len(b) > 0), other sinks also don't perform this check.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might want to trim the leading/trailing spaces though. To avoid empty lines.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The check for empty lines is happening here: https://github.com/cockroachdb/cockroach/pull/148525/files#diff-5a40209ba50f75cc7a515ff1c64f35c38afdf770e9cf5d8c2736e7e04e8417e6R83-R84

body = bytes.TrimSpace(body)
if len(body) > 0 {
    // append log records to message

Copy link
Copy Markdown
Contributor Author

@TheComputerM TheComputerM left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will fixup after code is reviewed, (don't wanna git force push everytime)

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @aa-joshi, @Abhinav1299, @angles-n-daemons, and @arjunmahishi)


// converts the raw bytes into OTEL log records
func extractRecordsToOTLP(b []byte) []*lpb.LogRecord {
bodies := bytes.Split(b, []byte("\n"))
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might not really be needed as there will be a buffered sink attached to the otlp sink most of the time and buffered sink only writes if the buffer is almost full (i.e len(b) > 0), other sinks also don't perform this check.

@arjunmahishi arjunmahishi requested a review from Copilot June 20, 2025 09:00
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR introduces an OpenTelemetry (OTLP) log sink that exports logs over gRPC in the OTLP protobuf format.

  • Implements the otlpSink type and export logic in otlp_client.go
  • Integrates the new sink in ApplyConfig and adds configuration handling in flags.go
  • Adds end-to-end tests (otlp_client_test.go), updates BUILD files, and adjusts lint rules

Reviewed Changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
pkg/util/log/sinks.go Registered otlpSink as a logSink implementation
pkg/util/log/otlp_client.go Added otlpSink type, newOtlpSink, record extraction, and output logic
pkg/util/log/otlp_client_test.go Added tests for OTLP client exporting logs to a mock gRPC server
pkg/util/log/flags.go Added newOtlpSinkInfo, close logic in ApplyConfig
pkg/util/log/BUILD.bazel Included new sources and dependencies for OTLP client
pkg/testutils/lint/lint_test.go Excluded otlp_client_test.go from lint checks

Copy link
Copy Markdown
Contributor

@arjunmahishi arjunmahishi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work! 💯

I've done one pass of the code. I haven't run it yet. I'll do that and let you know if I have more comments.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @aa-joshi, @Abhinav1299, @angles-n-daemons, and @TheComputerM)


-- commits line 5 at r1:
Avoid statements like "is more efficient" unless it's clear from the message "how it's more efficient".

But in this case, I think we can avoid this statement altogether. This is not a more efficient way. This is another way. Useful for orgs that use otel-compatible tooling.


-- commits line 10 at r1:
NIT:
Suggesting a different wording. Trying to avoid adjectives and also make it explicit that this can be used along with agents too.

Suggestion:

  This sink can be used to ship logs on OTEL compatible
  targets like otel-collector, datadog-agent or even directky to platforms like datadog and Loki.

-- commits line 11 at r1:
Same here. I think we should remove "without much overhead".


pkg/util/log/otlp_client_test.go line 86 at r1 (raw file):

	// Send a log event on the OPS channel.
	Ops.Infof(context.Background(), "hello world")

Can you add more variants here?
Different levels, channels, tags, redaction etc.

You could make it a "table driven test" or "data driven test"


pkg/util/log/otlp_client_test.go line 99 at r1 (raw file):

			t.Fatalf("unable to decode json: %v", err)
		}
		require.Equal(t, "hello world", info["message"])

Also assert some of the import properties like level


pkg/util/log/otlp_client.go line 39 at r1 (raw file):

		// Default to using the host's root CA.
		dialOpts = append(dialOpts, grpc.WithTransportCredentials(credentials.NewTLS(nil)))
	}

NIT: Avoid using else when you can. It's easier to read code when there is less nesting.

Suggestion:

	// Default to using the host's root CA.
	transportCreds := grpc.WithTransportCredentials(credentials.NewTLS(nil))
	if *config.Insecure {
		transportCred = grpc.WithTransportCredentials(insecure.NewCredentials())
	}
	dialOpts = append(dialOpts, transportCred)

pkg/util/log/otlp_client.go line 104 at r1 (raw file):

					Attributes: []*cpb.KeyValue{
						{
							Key:   "service.name",

Make untyped strings like this a global constants on top of the file. That way, avoids potential. duplication in the future.


pkg/util/log/otlp_client.go line 116 at r1 (raw file):

					{
						InstrumentationLibrary: &cpb.InstrumentationLibrary{
							Name:    "cockroachdb.logger",

NIT:
This is not the logger version right? it's the CRDB version.
Also, I think there is

Suggestion:

cockroachdb.version

pkg/util/log/flags.go line 465 at r1 (raw file):

}

func newOtlpSinkInfo(c logconfig.OtlpSinkConfig) (*sinkInfo, error) {

NIT: Abbreviations should be in upper case. I think there are other instances of Otlp too.

otlp is ok (when everything is lower case). But if you are making the O upper case, make that whole thing uppercase OTLP

Suggestion:

newOTLPSinkInfo


// converts the raw bytes into OTEL log records
func extractRecordsToOTLP(b []byte) []*lpb.LogRecord {
bodies := bytes.Split(b, []byte("\n"))
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might want to trim the leading/trailing spaces though. To avoid empty lines.

@TheComputerM TheComputerM requested a review from a team as a code owner June 23, 2025 05:12
Copy link
Copy Markdown
Contributor Author

@TheComputerM TheComputerM left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @aa-joshi, @Abhinav1299, @angles-n-daemons, and @arjunmahishi)


pkg/util/log/flags.go line 465 at r1 (raw file):

Previously, arjunmahishi (Arjun Mahishi) wrote…

NIT: Abbreviations should be in upper case. I think there are other instances of Otlp too.

otlp is ok (when everything is lower case). But if you are making the O upper case, make that whole thing uppercase OTLP

Done.


pkg/util/log/otlp_client.go line 116 at r1 (raw file):

Previously, arjunmahishi (Arjun Mahishi) wrote…

NIT:
This is not the logger version right? it's the CRDB version.
Also, I think there is

The InstrumentationLibrary is used to specify which library/sdk generated the logs, if we had used the OTEL sdk, the field would look like {Name: "OTel Go SDK", Version: "1.37.0"} (where 1.37.0 would be the version of the SDK). Since we don't use any library for generating the logs we can remove this field altogether.


pkg/util/log/otlp_client_test.go line 99 at r1 (raw file):

Previously, arjunmahishi (Arjun Mahishi) wrote…

Also assert some of the import properties like level

Done.

Copy link
Copy Markdown
Contributor Author

@TheComputerM TheComputerM left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @aa-joshi, @Abhinav1299, @angles-n-daemons, and @arjunmahishi)


pkg/util/log/otlp_client.go line 104 at r1 (raw file):

Previously, arjunmahishi (Arjun Mahishi) wrote…

Make untyped strings like this a global constants on top of the file. That way, avoids potential. duplication in the future.

Done.

Copy link
Copy Markdown
Contributor Author

@TheComputerM TheComputerM left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @aa-joshi, @Abhinav1299, @angles-n-daemons, and @arjunmahishi)


pkg/util/log/otlp_client.go line 39 at r1 (raw file):

Previously, arjunmahishi (Arjun Mahishi) wrote…

NIT: Avoid using else when you can. It's easier to read code when there is less nesting.

I ideally wouldn't want to do that in this case because it executes the function for the default value which needlessly allocate that variable even if it isn't going to be used.

Copy link
Copy Markdown
Contributor Author

@TheComputerM TheComputerM left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @aa-joshi, @Abhinav1299, @angles-n-daemons, and @arjunmahishi)


-- commits line 5 at r1:

Previously, arjunmahishi (Arjun Mahishi) wrote…

Avoid statements like "is more efficient" unless it's clear from the message "how it's more efficient".

But in this case, I think we can avoid this statement altogether. This is not a more efficient way. This is another way. Useful for orgs that use otel-compatible tooling.

Done.


-- commits line 11 at r1:

Previously, arjunmahishi (Arjun Mahishi) wrote…

Same here. I think we should remove "without much overhead".

Done.

@TheComputerM TheComputerM force-pushed the otlp-sink branch 3 times, most recently from 6b5627a to ae5058c Compare June 26, 2025 10:25
@TheComputerM TheComputerM force-pushed the otlp-sink branch 3 times, most recently from 2b06d15 to 2d97fc7 Compare June 30, 2025 08:12
Copy link
Copy Markdown
Contributor

@arjunmahishi arjunmahishi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @aa-joshi, @Abhinav1299, @angles-n-daemons, and @TheComputerM)


pkg/util/log/logconfig/config.go line 594 at r15 (raw file):

type OTLPDefaults struct {
	// Disables transport security for the underlying gRPC connection.
	Insecure *bool `yaml:",omitempty"`

Do we have a concrete example for what the customer should expect when this is not set to true?

If not, let's remove this option to avoid confusion. We can implement auth as a feature at a future point when a customer asks for it. I would expect the OTLP sink to always be used along with a tool like otel-collector or datadog-agent which runs within the same network/vpc.


pkg/util/log/flags.go line 390 at r15 (raw file):

			continue
		}
		optlSinkInfo, err := newOTLPSinkInfo(*fc)

optlSinkInfo -> otlpSinkInfo

image.png

Copy link
Copy Markdown
Contributor Author

@TheComputerM TheComputerM left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @aa-joshi, @Abhinav1299, @angles-n-daemons, and @arjunmahishi)


pkg/util/log/flags.go line 390 at r15 (raw file):

Previously, arjunmahishi (Arjun Mahishi) wrote…

optlSinkInfo -> otlpSinkInfo

image.png

Done.


pkg/util/log/logconfig/config.go line 594 at r15 (raw file):

Previously, arjunmahishi (Arjun Mahishi) wrote…

Do we have a concrete example for what the customer should expect when this is not set to true?

If not, let's remove this option to avoid confusion. We can implement auth as a feature at a future point when a customer asks for it. I would expect the OTLP sink to always be used along with a tool like otel-collector or datadog-agent which runs within the same network/vpc.

Removed the option.


pkg/util/log/otlp_client_test.go line 86 at r1 (raw file):

Previously, arjunmahishi (Arjun Mahishi) wrote…

Can you add more variants here?
Different levels, channels, tags, redaction etc.

You could make it a "table driven test" or "data driven test"

Done.

OpenTelemetry is now an industry standard for o11y and CRDB
should support its logging protocol so that it works
seamlessly with OTEL compatible tooling.

This commit implements the log sink functionality, it uses
gRPC to transmit logs in the OTEL-standard protobuf format.
This sink can be used to ship logs on OTEL compatible targets
like otel-collector, datadog-agent or even directly to
platforms like datadog and Loki.

Parsing the config for the otlp log sink was done in
a previous commit. Removed the insecure option from the
otlp sink config.

Resolves: cockroachdb#143049

Release note (general change): Added otlp log sink
that can export logs to OpenTelemetry-compatible
targets over gRPC.
Copy link
Copy Markdown
Contributor

@arjunmahishi arjunmahishi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @aa-joshi, @Abhinav1299, and @angles-n-daemons)

@TheComputerM
Copy link
Copy Markdown
Contributor Author

bors r+

@craig
Copy link
Copy Markdown
Contributor

craig bot commented Jul 1, 2025

Build failed:

@TheComputerM
Copy link
Copy Markdown
Contributor Author

bors retry

@craig
Copy link
Copy Markdown
Contributor

craig bot commented Jul 1, 2025

@craig craig bot merged commit ee02ca6 into cockroachdb:master Jul 1, 2025
22 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

pkg/util/log: Add support for OTLP sink

5 participants