Skip to content
This repository was archived by the owner on Jun 10, 2020. It is now read-only.

Reimplement multi-logger using log4j2#68

Merged
dwijnand merged 1 commit intosbt:1.0from
eed3si9n:topic/logging
Jan 13, 2017
Merged

Reimplement multi-logger using log4j2#68
dwijnand merged 1 commit intosbt:1.0from
eed3si9n:topic/logging

Conversation

@eed3si9n
Copy link
Member

This introduces ManagedLogger, which is a wrapper around Log4j2's async logging.
Log4j2 separates the notion of logger (the code that collects events) and appender (the code that acts on events). The old code is kept around intentionally to minimize breakage during transition.

This introduces ManagedLogger, which is a wrapper around Log4j2's async logging.
Log4j2 separates the notion of logger (the code that collects events) and appender (the code that acts on events). The old code is kept around intentionally to minimize breakage during transition.
@eed3si9n eed3si9n requested a review from dwijnand January 13, 2017 07:50
@dwijnand dwijnand merged commit 5b8de3a into sbt:1.0 Jan 13, 2017
@eed3si9n eed3si9n deleted the topic/logging branch January 13, 2017 21:52
@dwijnand dwijnand removed their request for review April 17, 2017 16:38
@retronym
Copy link
Member

While profiling SBT during a warmed up, clean compile of scala/scala (under scala/scala#6256), I see 28% of samples found in a spin wait in the internals of disruptor:

image

Sort of similar to a problem described in LMAX-Exchange/disruptor#155, perhaps, although the commit that closed that issue should be part of 3.3.6.

Profiling:

$ sbt -J-XX:+UnlockCommercialFeatures -J-XX:+FlightRecorder -J-XX:FlightRecorderOptions=stackdepth=1024 -J-XX:+UnlockDiagnosticVMOptions -J-XX:+DebugNonSafepoints
sbt:root> ;clean;compile // repeat for minute or so
% jcmd <PID> JFR.start settings=profile
sbt:root> ;clean;compile // repeat for minute or so
% jcmd <PID> JFR.dump recording=1 filename=/tmp/sbt-one.jfr

@retronym
Copy link
Member

LMAX-Exchange/disruptor#157

This is a place where we have had to make an awkward trade off. One features of the Disruptor is that is can work in a manner that avoids Lock.{park,unpark}, but to do so it has to fall back to some sort of busy-wait operation. For some of the simple cases it would be possible to add a optional support for having the consumer unpack a waiting publisher thread, but for some of the more complex cases, this is not so easy (if even possible without destroying performance for the cases we really care about). I'll look at an enhancement where the user can do optionally include support for this.

However, the Disruptor was designed to work in low-latency systems, where the number busy threads tends to be smaller than the number of available cores. In these system using LockSupport.parkNanos() tends to be quite effective. If your system does not fall into that category you may want to consider a different appender in your Log4j configuration.

For the specific case of log4j. We have our own Disruptor based appender, in which we use tryNext() rather than next() and drop events (with some additional controlled logging to stderr and MBean monitoring) as we believe that stalling application flow for the sake of logging is wrong option to take. I did bring up this option with them, but they rejected our approach.

Just a couple of corrections around LockSupport.parkNanos(1). While the code suggests that it would park for a nanosecond, the reality on most operating system is that the pause is much longer than that. On Linux it tends to be around 60µs on Windows more like 200µs (depending on version).

As to the "LMAX disruptor is doing is exactly what the documentation (...) says NOT to do". Strong recommendations from SUN/Oracle is not always a reason not to do something, we use the Unsafe after all.

@eed3si9n
Copy link
Member Author

Thank you for this report!
I think it's good to go back and review my usage of log4j. I am suspecting part of the problem might be introduced by the usage of my own appender combined with sbt logging a lot of thing into the file for last usage.

@retronym
Copy link
Member

retronym commented Feb 17, 2018

On a second look, I think I misread the profile. The calls to Unsafe.park should release the current thread from being scheduled.

Log4J2 uses disuptor's TimeoutBlockingWaitStrategy by default (contrast with BusySpinWaitStrategy)

The results shown by FlightRecorder/JMC are a bit confusing.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants