Skip to content

obsservice: turn the ObsService into an OTLP server #95632

Merged
craig[bot] merged 2 commits intocockroachdb:masterfrom
andreimatei:obsservice.otlp
Feb 7, 2023
Merged

obsservice: turn the ObsService into an OTLP server #95632
craig[bot] merged 2 commits intocockroachdb:masterfrom
andreimatei:obsservice.otlp

Conversation

@andreimatei
Copy link
Copy Markdown
Contributor

@andreimatei andreimatei commented Jan 20, 2023

This patch makes the Obs Service implement the OTLP Logs gRPC service.
This reverses the connection direction for exporting events out of CRDB:
now CRDB connects out to the Obs Service through a new --obsservice-addr
flag. The Obs Service gets a new --otlp-addr flag taking the
host:port on which to expose the service.

By using standard OTLP we can have the data flow through an
OpenTelemetry Collector before going to the Obs Service. This makes the
Obs Service topology more flexible (since we can rely on the collector
to be deployed widely and funnel the data), and gives the collector the
opportunity to tee the data to other destinations.

Release note: None
Epic: CRDB-23641

@andreimatei andreimatei requested review from a team as code owners January 20, 2023 22:45
@andreimatei andreimatei requested review from cucaroach, herkolategan and srosenberg and removed request for a team January 20, 2023 22:45
@cockroach-teamcity
Copy link
Copy Markdown
Member

This change is Reviewable

@andreimatei
Copy link
Copy Markdown
Contributor Author

First commit is #93733

@andreimatei andreimatei changed the title obsservice: add logs_service proto for OTLP collector obsservice: turn the ObsService into an OTLP server Jan 22, 2023
@andreimatei andreimatei force-pushed the obsservice.otlp branch 2 times, most recently from db59d4d to 6780ed0 Compare January 22, 2023 19:23
find $DEST_DIR -type f -name "*.proto" -exec sed -i "s/option go_package = \"go.opentelemetry.io\/proto\/otlp\/.*\/v1\"/option go_package = \"v1\"/" {} + ;
if [ `uname` = 'Darwin' ]; then
# for OSX
find $DEST_DIR -type f -name "*.proto" -exec sed -i '' -e "s/option go_package = \"go.opentelemetry.io\/proto\/otlp\/.*\/v1\"/option go_package = \"v1\"/" {} + ;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/ is perhaps not the best choice for sed separator here, I'd use |

Copy link
Copy Markdown
Contributor Author

@andreimatei andreimatei left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First commit is #93733

Now all commits are original.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @aadityasondhi, @abarganier, @cucaroach, @dhartunian, @herkolategan, and @srosenberg)


pkg/obsservice/obspb/update-opentelemetry-proto.sh line 85 at r3 (raw file):

Previously, cucaroach (Tommy Reilly) wrote…

/ is perhaps not the best choice for sed separator here, I'd use |

Ack, This change is in #93733, which merged. I'll probably leave it alone.

Copy link
Copy Markdown
Collaborator

@dhartunian dhartunian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm: just some minor comments/nits. excited to see where this leads!

Reviewed 6 of 27 files at r5, 9 of 15 files at r6, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @aadityasondhi, @abarganier, @andreimatei, and @cucaroach)


pkg/base/test_server_args.go line 170 at r6 (raw file):

	// ObsServiceAddr is the address to which events will be exported over OTLP.
	// If empty, exporting events is inhibited.
	ObsServiceAddr string

Minor: I'm wondering if this should also be an OTLPSinkAddr or something. could encourage folks to integrate with their own infra. Either way, maybe best not to encourage that just yet.


pkg/obs/event_exporter.go line 251 at r6 (raw file):

			select {
			case <-ctx.Done():
				//	// We'll return after flushing everything.

nit :extra //


pkg/obsservice/obslib/ingest/ingest.go line 45 at r6 (raw file):

) (*logspb.ExportLogsServiceResponse, error) {
	_ = persistEvents(ctx, request.ResourceLogs, e.db)
	return &logspb.ExportLogsServiceResponse{}, nil

why don't we care about the error here?


pkg/server/tenant.go line 1017 at r6 (raw file):

	remoteFlowRunner := flowinfra.NewRemoteFlowRunner(baseCfg.AmbientCtx, stopper, &remoteFlowRunnerAcc)

	// Create the EventServer. It will be made operational later, after the

nit: update comment.


pkg/sql/event_log.go line 699 at r6 (raw file):

		// others serialize their UUIDs, but I went with the network byte order.
		binary.BigEndian.PutUint64(traceID[:], uint64(sp.TraceID()))
		binary.BigEndian.PutUint64(spanID[:], uint64(sp.SpanID()))

does this change have impact on existing deployments? I guess these are random IDs anyway, but just want to check. I don't think we read these back anywhere.

Copy link
Copy Markdown
Contributor

@abarganier abarganier left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very cool - :lgtm: modulo a couple minor nits.

Reviewable status: :shipit: complete! 2 of 0 LGTMs obtained (waiting on @aadityasondhi, @andreimatei, and @cucaroach)


pkg/obs/event_exporter.go line 276 at r6 (raw file):

				s.buf.mu.sizeBytes -= sizeBytes
				msgSize += sizeBytes
				req.ResourceLogs[0].ScopeLogs = append(req.ResourceLogs[0].ScopeLogs,

nit: would it make sense to pre-allocate ScopeLogs to len(s.buf.mu.events)?

Code quote:

.ScopeLogs

pkg/obsservice/obslib/ingest/ingest.go line 35 at r6 (raw file):

// SetDB sets the database connection that the ingester will use to persist the
// events it receives. It needs to be called before Export().

nit: should we check the e.db is defined in Export() before calling persistEvents()? Seems like we might hit an NPE panic otherwise

Code quote:

It needs to be called before Export().

Copy link
Copy Markdown
Contributor Author

@andreimatei andreimatei left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 2 of 0 LGTMs obtained (waiting on @aadityasondhi, @cucaroach, and @dhartunian)


pkg/base/test_server_args.go line 170 at r6 (raw file):

Previously, dhartunian (David Hartunian) wrote…

Minor: I'm wondering if this should also be an OTLPSinkAddr or something. could encourage folks to integrate with their own infra. Either way, maybe best not to encourage that just yet.

yeah... I figured that putting ObsService in these names is more suggestive about what this data is about.
If anything, I think the name of the flag that's externally visible is more debatable... But I dunno, I'd keep "obs service" for now at least.


pkg/obs/event_exporter.go line 251 at r6 (raw file):

Previously, dhartunian (David Hartunian) wrote…

nit :extra //

done


pkg/obs/event_exporter.go line 276 at r6 (raw file):

Previously, abarganier (Alex Barganier) wrote…

nit: would it make sense to pre-allocate ScopeLogs to len(s.buf.mu.events)?

done


pkg/obsservice/obslib/ingest/ingest.go line 35 at r6 (raw file):

Previously, abarganier (Alex Barganier) wrote…

nit: should we check the e.db is defined in Export() before calling persistEvents()? Seems like we might hit an NPE panic otherwise

done


pkg/obsservice/obslib/ingest/ingest.go line 45 at r6 (raw file):

Previously, dhartunian (David Hartunian) wrote…

why don't we care about the error here?

added logging
As far as returning it, you're not really supposed to return errors from this API according to OTLP.


pkg/server/tenant.go line 1017 at r6 (raw file):

Previously, dhartunian (David Hartunian) wrote…

nit: update comment.

done


pkg/sql/event_log.go line 699 at r6 (raw file):

Previously, dhartunian (David Hartunian) wrote…

does this change have impact on existing deployments? I guess these are random IDs anyway, but just want to check. I don't think we read these back anywhere.

Nobody cares about these so far.

This patch switches the main obsservice cobra command to return errors
instead of panicking internally. This is nicer since Cobra prints the
error.

Release note: None
Epic: None
This patch makes the Obs Service implement the OTLP Logs gRPC service.
This reverses the connection direction for exporting events out of CRDB:
now CRDB connects out to the Obs Service through a new --obsservice-addr
flag. The Obs Service gets a new `--otlp-addr` flag taking the
host:port on which to expose the service.

By using standard OTLP we can have the data flow through an
OpenTelemetry Collector before going to the Obs Service. This makes the
Obs Service topology more flexible (since we can rely on the collector
to be deployed widely and funnel the data), and gives the collector the
opportunity to tee the data to other destinations.

Release note: None
Epic: CRDB-16791
Fixes: CRDB-23641
@andreimatei
Copy link
Copy Markdown
Contributor Author

bors r+

@craig
Copy link
Copy Markdown
Contributor

craig bot commented Feb 7, 2023

Build succeeded:

@craig craig bot merged commit 247a543 into cockroachdb:master Feb 7, 2023
@andreimatei andreimatei deleted the obsservice.otlp branch February 22, 2023 20:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants