Allow additional JSON log fields via SPI by tvernum · Pull Request #106980 · elastic/elasticsearch

tvernum · 2024-04-02T04:22:09Z

This adds a new SPI-based `LoggingDataProvider` service that can be implemented in order to add new fields to the main JSON log.
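
As a rough illustration of the idea, the sketch below shows a provider contributing a field to the per-event data. The interface `FieldProvider`, its `collectData` method, the `TraceIdProvider` class and the `trace.id` field name are all hypothetical stand-ins; the PR's actual `LoggingDataProvider` signature is not shown on this page.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public final class LoggingSpiSketch {

    /** Hypothetical stand-in for the SPI: each provider adds its own fields to the map. */
    interface FieldProvider {
        void collectData(Map<String, String> fields);
    }

    /** Example provider that contributes a trace id when one is set for the current thread. */
    static final class TraceIdProvider implements FieldProvider {
        static final ThreadLocal<String> TRACE_ID = new ThreadLocal<>();

        @Override
        public void collectData(Map<String, String> fields) {
            String traceId = TRACE_ID.get();
            if (traceId != null) {
                fields.put("trace.id", traceId);
            }
        }
    }

    /** Builds the extra fields for one log event by asking every provider in turn. */
    static Map<String, String> collect(List<FieldProvider> providers) {
        Map<String, String> fields = new HashMap<>();
        for (FieldProvider provider : providers) {
            provider.collectData(fields);
        }
        return fields;
    }

    public static void main(String[] args) {
        List<FieldProvider> providers = List.of(new TraceIdProvider());
        TraceIdProvider.TRACE_ID.set("abc123");
        System.out.println(collect(providers));
    }
}
```

In this sketch the provider list is built by hand; in the PR the providers are obtained through `pluginsService.loadServiceProviders(LoggingDataProvider.class)`.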

elasticsearchmachine · 2024-04-02T04:22:33Z

Pinging @elastic/es-core-infra (Team:Core/Infra)

pgomulka

LGTM, but I left one comment to reflect on with @elastic/es-core-infra

pgomulka · 2024-04-02T13:18:39Z

server/src/main/java/org/elasticsearch/common/logging/LoggingDataProvider.java

+ *
+ * @see DynamicContextDataProvider
+ */
+public interface LoggingDataProvider {


I wonder if we want to expose this via libs:logging instead (which is kind of a stable plugin logging API)?
This would allow plugins to implement the 'prefix logger' style of feature, so it would be one less thing that we have to support at some point in the future.
WDYT @rjernst?

We may want to do that, but I suggest discussing it as a followup.

rjernst

One hesitation I have with this is exposing it to plugin developers. Could we limit it to ES modules at first (using qualified exports)? That would mean it needs to be in a different package, say o.e.logging.internal, or it could go into the existing plugin internal package that is already restricted.
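
A qualified export of the kind suggested here might look like the fragment below. The package name simply expands the `o.e.logging.internal` shorthand from the comment, and the consuming module name `org.example.consumer` is purely hypothetical; a real `module-info.java` for the server would carry many more directives.

```java
// module-info.java (fragment only, not a complete module declaration)
module org.elasticsearch.server {
    // Qualified export: only the named module can access this package.
    exports org.elasticsearch.logging.internal to org.example.consumer;
}
```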

rjernst · 2024-04-02T14:13:14Z

server/src/main/java/org/elasticsearch/common/logging/DynamicContextDataProvider.java

+    }
+
+    @Override
+    public Map<String, String> supplyContextData() {


As I understand it this is called on every logging call. Is there any possibility to cache the generated map?

Assuming we want this system to be able to log things that change (and the logging system doesn't know when they change) such as active username / trace-id, etc (which is why I want it), then caching is tricky.

Some options that might work:

1. Always reuse the last map, and make each provider responsible for clearing any keys that ought not to be set, as well as populating keys that should be. I'm not sure about the thread safety of that, but we can experiment if needed. It has the risk that we might log something inaccurate if one of the providers fails to clear a field.
2. As above (1), but set all the values to null before calling the providers. That increases safety at the expense of CPU time.
3. Create the new map with the same keys as the last map. Like (2) but easier to make thread safe, but still has high object churn (with fewer reallocations).
4. Create the new map with the same size as the last map. That would reduce internal reallocations in the map, like (3), but not really cache anything.
5. Use a custom Map implementation (or a custom StringMap, which is what log4j really uses here), with one of the following behaviours:
   - Have each provider return their own map, and our custom map is just a merging view over them. That means providers that have constant data can return a constant map.
   - Extend the LoggingDataProvider so that it returns the set of keys as constants (a new method) and we can define a map implementation that takes advantage of the known set of keys to reduce memory usage/churn.
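
The "merging view" behaviour could be sketched roughly as follows. This is only a lookup helper under stated assumptions: the class `MergedView` is hypothetical, it does not implement `Map` or log4j's `StringMap`, and letting later sources win on a key clash is a choice made for the sketch.

```java
import java.util.List;
import java.util.Map;

public final class MergedLookupSketch {

    /** Read-only view over per-provider maps; later maps win on key clashes (an assumption). */
    static final class MergedView {
        private final List<Map<String, String>> sources;

        MergedView(List<Map<String, String>> sources) {
            // Copies the list of references only; the entries themselves are never copied.
            this.sources = List.copyOf(sources);
        }

        /** Looks a key up in each source, last to first. */
        String get(String key) {
            for (int i = sources.size() - 1; i >= 0; i--) {
                String value = sources.get(i).get(key);
                if (value != null) {
                    return value;
                }
            }
            return null;
        }
    }

    public static void main(String[] args) {
        Map<String, String> constant = Map.of("cluster.name", "demo"); // allocated once, reused for every event
        Map<String, String> dynamic = Map.of("user.name", "alice");    // would differ per request
        MergedView view = new MergedView(List.of(constant, dynamic));
        System.out.println(view.get("cluster.name") + " " + view.get("user.name"));
    }
}
```

A provider with constant data can hand back the same map instance on every call, so the only per-event cost in this sketch is the small wrapper object.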

I suspect having a predicted size and allocating the map to be that size is the only option that is:

- simple enough to implement
- thread safe
- sufficiently performant to justify

I implemented a version that uses a fixed-keys map here: 8499a53
I haven't actually tested the performance gains that might come from that, but it seems like a viable option if we want to pursue it.

Thinking out loud...
Instead of having DynamicContextDataProvider, can we implement a ContextDataProvider directly in every plugin?
Do we have to aggregate LoggingDataProvider in DynamicContextDataProvider? It feels like log4j is already doing this, as it expects multiple ContextDataProviders.

I don't think there's any way to do that.
Log4j won't load ContextDataProvider services from our plugins because they aren't on the classpath when we initialise log4j, and as far as I know, there is no way to reinitialise the services after we've loaded plugins & modules.
We need some sort of bridge from what log4j can see at init-time into our module structure.
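
One way to picture such a bridge is an object that exists before plugins are loaded and is handed its delegates later. The sketch below is an assumption-laden illustration: the `Bridge` class and its method names are hypothetical, delegates are modelled as plain `Consumer`s, and rejecting a second registration is a choice made for the sketch. A real bridge would likely hold this state statically, since the logging framework instantiates its providers itself.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Consumer;

public final class LateBoundBridgeSketch {

    /** Visible to the logging framework from the start; delegates arrive later. */
    static final class Bridge {
        // null means "plugins not loaded yet".
        private final AtomicReference<List<Consumer<Map<String, String>>>> delegates = new AtomicReference<>();

        /** Called once after plugins are loaded; a second call is rejected. */
        void setDelegates(List<Consumer<Map<String, String>>> providers) {
            if (delegates.compareAndSet(null, List.copyOf(providers)) == false) {
                throw new IllegalStateException("delegates already set");
            }
        }

        /** What the logging framework would invoke for each log event. */
        Map<String, String> supplyContextData() {
            List<Consumer<Map<String, String>>> current = delegates.get();
            if (current == null) {
                return Map.of(); // nothing registered yet, so no extra fields
            }
            Map<String, String> data = new HashMap<>();
            for (Consumer<Map<String, String>> provider : current) {
                provider.accept(data);
            }
            return data;
        }
    }

    public static void main(String[] args) {
        Bridge bridge = new Bridge();
        System.out.println(bridge.supplyContextData()); // before plugin loading
        Consumer<Map<String, String>> userProvider = m -> m.put("user.name", "alice");
        bridge.setDelegates(List.of(userProvider));
        System.out.println(bridge.supplyContextData()); // after plugin loading
    }
}
```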

@rjernst Did you want to explore any of the other possible caching options?

I've implemented a cached "largest previous map size" so that we can allocate the new map to be the same size, but we could explore options to do more than that if you would like.
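
The "largest previous map size" approach could be sketched along these lines. This is not the code from the PR: the `Collector` class is hypothetical, providers are modelled as plain `Consumer`s, and the capacity arithmetic assumes `HashMap`'s default load factor of 0.75.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Consumer;

public final class PresizedMapSketch {

    static final class Collector {
        // Largest number of fields seen in any previous event.
        private final AtomicInteger largestSeen = new AtomicInteger(0);

        Map<String, String> collect(List<Consumer<Map<String, String>>> providers) {
            int expected = largestSeen.get();
            // HashMap resizes once size exceeds capacity * 0.75, so allocate enough
            // capacity up front to hold the expected number of entries without resizing.
            Map<String, String> data = new HashMap<>((int) (expected / 0.75f) + 1);
            for (Consumer<Map<String, String>> provider : providers) {
                provider.accept(data);
            }
            // Remember the high-water mark; safe under concurrent logging calls.
            largestSeen.accumulateAndGet(data.size(), Math::max);
            return data;
        }

        int largestSeen() {
            return largestSeen.get();
        }
    }

    public static void main(String[] args) {
        Collector collector = new Collector();
        Consumer<Map<String, String>> provider = m -> {
            m.put("user.name", "alice");
            m.put("trace.id", "abc123");
        };
        collector.collect(List.of(provider));
        System.out.println(collector.largestSeen());
    }
}
```

Each event still gets a fresh map, so nothing stale can leak between log lines; only the sizing hint is shared.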

rjernst · 2024-04-02T14:14:18Z

server/src/main/java/org/elasticsearch/common/logging/LoggingDataProvider.java

+ *
+ * @see DynamicContextDataProvider
+ */
+public interface LoggingDataProvider {


We may want to do that, but I suggest discussing it as a followup.

rjernst · 2024-04-02T14:17:39Z

server/src/main/java/org/elasticsearch/node/NodeConstruction.java


        FeatureService featureService = new FeatureService(pluginsService.loadServiceProviders(FeatureSpecification.class));

+        DynamicContextDataProvider.setDataProviders(pluginsService.loadServiceProviders(LoggingDataProvider.class));


This seems like an arbitrary place to set the providers. Could it be done right after initializing plugins, line ~253 after createEnvironment?

It was definitely arbitrary. I'll move it.

tvernum · 2024-04-03T07:23:55Z

qa/logging-spi/build.gradle

+
+tasks.named("javadoc").configure {
+  // There seems to be some problem generating javadoc on a QA project that has a module definition
+  enabled = false


@mark-vieira I had to disable javadoc on this QA project because it couldn't find the org.elasticsearch.server module while trying to parse module-info.java.
It looks like this is the first QA project with a module-info, so I took the easy option to resolve it.

pgomulka

LGTM

rjernst

LGTM. We can start here and always change it since this is internal.

tvernum · 2024-04-11T01:08:16Z

@elasticmachine update branch

tvernum added 4 commits April 2, 2024 12:43

Allow additional JSON log fields via SPI

c4669b2

This adds a new SPI based `LoggingDataProvider` service that can be implemented in order to add new fields to the main JSON log

Add IT for dynamic logging data

f1161c0

Spotless

8ce43a4

Add javadoc

d810d9f

tvernum added the >non-issue, :Core/Infra/Logging, and v8.14.0 labels Apr 2, 2024

tvernum requested a review from pgomulka April 2, 2024 04:22

tvernum requested review from a team as code owners April 2, 2024 04:22

elasticsearchmachine added the Team:Core/Infra label Apr 2, 2024

tvernum added 2 commits April 2, 2024 15:23

Remove default-distribution dependency

b9c3893

Switch from SetOne to AtomicRef

07089c6

pgomulka reviewed Apr 2, 2024

rjernst reviewed Apr 2, 2024

mark-vieira approved these changes Apr 2, 2024

tvernum added 3 commits April 3, 2024 15:14

Address feedback

6d7be88

Merge branch 'main' into spi-log-fields

612c528

Fix QA test

3a20f26

tvernum commented Apr 3, 2024

tvernum requested a review from rjernst April 4, 2024 05:12

pgomulka approved these changes Apr 4, 2024

rjernst approved these changes Apr 10, 2024

Merge branch 'main' into spi-log-fields

97fdf09

tvernum added the auto-merge-without-approval label Apr 11, 2024

elasticsearchmachine merged commit 36d5282 into elastic:main Apr 11, 2024

tvernum deleted the spi-log-fields branch April 11, 2024 02:14

