Allow additional JSON log fields via SPI #106980

elasticsearchmachine merged 10 commits into elastic:main
Conversation
This adds a new SPI-based `LoggingDataProvider` service that can be implemented in order to add new fields to the main JSON log.
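To illustrate the idea, a provider implementation might look roughly like the following. This is a hedged sketch: the real `LoggingDataProvider` interface is internal to Elasticsearch, and the method name used here (`data()`) and the demo classes are assumptions, not the actual API.

```java
import java.util.Map;

// Hypothetical stand-in for the SPI interface; the real Elasticsearch
// interface's method name/signature may differ.
interface LoggingDataProvider {
    Map<String, String> data();
}

// Example provider that adds a constant field to every JSON log line.
class ClusterNameProvider implements LoggingDataProvider {
    private static final Map<String, String> FIELDS = Map.of("cluster.name", "my-cluster");

    @Override
    public Map<String, String> data() {
        return FIELDS; // constant data can safely be a shared immutable map
    }
}

public class ProviderDemo {
    public static void main(String[] args) {
        LoggingDataProvider provider = new ClusterNameProvider();
        System.out.println(provider.data()); // {cluster.name=my-cluster}
    }
}
```

As with any SPI, the implementation would be registered via a `META-INF/services` entry (or a `provides` clause in `module-info.java`) so that a `ServiceLoader` can discover it.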
Pinging @elastic/es-core-infra (Team:Core/Infra)
pgomulka
left a comment
LGTM, but I left one comment to reflect on with @elastic/es-core-infra
```java
 *
 * @see DynamicContextDataProvider
 */
public interface LoggingDataProvider {
```
I wonder if we want to expose this via libs:logging instead (which is kind of a stable plugin logging API)?
This would allow plugins to implement the 'prefix logger' style of feature themselves, so it would be one less thing that we would have to support at some point in the future.
wdyt @rjernst?
We may want to do that, but I suggest discussing it as a followup.
rjernst
left a comment
One hesitation I have with this is exposing it to plugin developers. Could we limit it to ES modules at first (using qualified exports)? That would mean it needs to be in a different package, say o.e.logging.internal, or it could go into the existing plugin internal package that is already restricted.
```java
}

@Override
public Map<String, String> supplyContextData() {
```
As I understand it this is called on every logging call. Is there any possibility to cache the generated map?
Assuming we want this system to be able to log things that change (and the logging system doesn't know when they change), such as active username / trace-id, etc. (which is why I want it), then caching is tricky.
Some options that might work:
1. Always reuse the last map, and make each provider responsible for clearing any keys that ought not to be set, as well as populating keys that should be. I'm not sure about the thread safety of that, but we can experiment if needed. It has the risk that we might log something inaccurate if one of the providers fails to clear a field.
2. As above (1), but set all the values to null before calling the providers. That increases safety at the expense of CPU time.
3. Create the new map with the same keys as the last map. Like (2) but easier to make thread safe, but still has high object churn (with fewer reallocations).
4. Create the new map with the same size as the last map. That would reduce internal reallocations in the map, like (3), but not really cache anything.
5. Use a custom `Map` implementation (or a custom `StringMap`, which is what log4j really uses here), with one of the following behaviours:
   - Have each provider return their own map, and our custom map is just a merging view over them. That means providers that have constant data can return a constant map.
   - Extend the `LoggingDataProvider` so that it returns the set of keys as constants (a new method), and we can define a map implementation that takes advantage of the known set of keys to reduce memory usage/churn.
I suspect having a predicted size and allocating the map to be that size is the only option that is:
- simple enough to implement
- thread safe
- sufficiently performant to justify the effort
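That "presize from the largest map seen so far" idea can be sketched as follows. The class and method names here are illustrative, not the PR's actual code, and only the size hint is cached; no values are reused between calls.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch of the "allocate the new map with a predicted size" option.
// Names are illustrative; the real DynamicContextDataProvider differs.
class PresizedAggregator {
    private volatile int largestSeen = 16; // cached size hint from previous calls

    Map<String, String> merge(List<Map<String, String>> perProviderData) {
        // Presizing avoids most internal rehashes; nothing else is cached.
        Map<String, String> merged = new HashMap<>(largestSeen);
        for (Map<String, String> data : perProviderData) {
            merged.putAll(data);
        }
        if (merged.size() > largestSeen) {
            largestSeen = merged.size(); // benign race: worst case one undersized allocation
        }
        return merged;
    }
}

public class AggregatorDemo {
    public static void main(String[] args) {
        PresizedAggregator agg = new PresizedAggregator();
        Map<String, String> out = agg.merge(List.of(
            Map.of("trace.id", "abc123"),
            Map.of("user.name", "alice")
        ));
        System.out.println(out.size()); // 2
    }
}
```

Note that `new HashMap<>(n)` sets the initial capacity, not a resize-free size; to fully avoid rehashing the hint would need to be divided by the load factor (0.75 by default).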
I implemented a version that uses a fixed-keys map here: 8499a53
I haven't actually tested the performance gains that might come from that, but it seems like a viable option if we want to pursue it.
thinking out loud..
could we, instead of having DynamicContextDataProvider, implement a ContextDataProvider directly in every plugin?
do we have to aggregate LoggingDataProvider in DynamicContextDataProvider? it feels like log4j is already doing this, as it expects multiple ContextDataProviders
I don't think there's any way to do that.
Log4j won't load ContextDataProvider services from our plugins because they aren't on the classpath when we initialise log4j, and as far as I know, there is no way to reinitialise the services after we've loaded plugins & modules.
We need some sort of bridge from what log4j can see at init-time into our module structure.
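The bridge shape being described could look roughly like this: one class that log4j's own service discovery can see at init time, whose backing providers are injected later, once plugins and modules are loaded. The names mirror the PR, but the bodies are guesses, and the log4j `ContextDataProvider` superinterface is elided to keep the sketch self-contained.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the SPI interface; the real one is in Elasticsearch.
interface LoggingDataProvider {
    Map<String, String> data();
}

// Bridge sketch: log4j discovers this class at init time (before any plugins
// exist); the plugin-provided data sources are set statically later. In the PR
// this class implements log4j's ContextDataProvider; that is elided here.
class DynamicContextDataProvider {
    private static volatile List<LoggingDataProvider> providers = List.of();

    // Called once from node startup, after plugins and modules are loaded.
    static void setDataProviders(List<LoggingDataProvider> loaded) {
        providers = List.copyOf(loaded);
    }

    // Called by log4j on every logging call.
    public Map<String, String> supplyContextData() {
        Map<String, String> merged = new HashMap<>();
        for (LoggingDataProvider p : providers) {
            merged.putAll(p.data());
        }
        return merged;
    }
}

public class BridgeDemo {
    public static void main(String[] args) {
        DynamicContextDataProvider.setDataProviders(List.of(() -> Map.of("node.name", "node-0")));
        System.out.println(new DynamicContextDataProvider().supplyContextData()); // {node.name=node-0}
    }
}
```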
@rjernst Did you want to explore any of the other possible caching options?
I've implemented a cached "largest previous map size" so that we can allocate the new map to be the same size, but we could explore options to do more than that if you would like.
```java
FeatureService featureService = new FeatureService(pluginsService.loadServiceProviders(FeatureSpecification.class));

DynamicContextDataProvider.setDataProviders(pluginsService.loadServiceProviders(LoggingDataProvider.class));
```
This seems like an arbitrary place to set the providers. Could it be done right after initializing plugins, line ~253 after createEnvironment?
It was definitely arbitrary. I'll move it.
```groovy
tasks.named("javadoc").configure {
    // There seems to be some problem generating javadoc on a QA project that has a module definition
    enabled = false
```
@mark-vieira I had to disable javadoc on this QA project because it couldn't find the org.elasticsearch.server module while trying to parse module-info.java.
It looks like this is the first QA project with a module-info, so I took the easy option to resolve it.
rjernst
left a comment
LGTM. We can start here and always change it since this is internal.
@elasticmachine update branch