`import sklearn` creates a `StreamHandler` and attaches it to the `sklearn` logger (`sklearn/__init__.py`, line 24 in 0eebade):

    logger.addHandler(logging.StreamHandler())

I'm not sure what the motivation for this is, but it's a deviation from the normal "best practices" for logging, namely that libraries should restrict themselves to issuing log messages, but let the application do all logging configuration (setting up handlers, changing logger levels, and the like). There's lots written about this elsewhere, but here's one relevant blog post: http://pieces.openpolitics.com/2012/04/python-logging-best-practices/
In practice, this caused a hard-to-diagnose bug in our IPython- and sklearn-using application (actually, in more than one such application):
- At application start time, we start an IPython kernel. That kernel swaps out `sys.stdout` and `sys.stderr` for its own custom streams, which rely on a lot of fairly complicated machinery (extra threads, ZMQ streams, the asyncio event loop, etc.).
- `sklearn` was imported while that IPython kernel was running.
- The log handler created at import time then picked up IPython's custom `sys.stderr` stream instead of the usual one.
- At application stop time, the IPython kernel and associated machinery were stopped.
- At process exit time, the stream associated with the handler was flushed (by the `logging` module's `shutdown` function, which is registered as an `atexit` handler). Because the IPython machinery was no longer active, we got a hard-to-understand traceback.
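
For illustration, the sequence above can be reproduced in miniature without IPython. This is only a sketch: `FakeKernelStream` is a hypothetical stand-in for the kernel's replacement stream (it is not IPython's real class), and the `RuntimeError` it raises is an assumption chosen to mimic "machinery no longer active". The point it shows is that `logging.StreamHandler()` binds to whatever `sys.stderr` is at construction time and keeps that reference afterwards.

```python
import logging
import sys


class FakeKernelStream:
    """Hypothetical stand-in for a kernel's custom stderr stream."""

    def __init__(self):
        self.running = True
        self.written = []

    def write(self, text):
        self.written.append(text)

    def flush(self):
        # Assumption: flushing fails once the backing machinery is stopped.
        if not self.running:
            raise RuntimeError("stream machinery has been stopped")


def demo_stale_stream():
    """Return (handler_captured_fake_stream, flush_failed_after_stop)."""
    original_stderr = sys.stderr
    fake_stream = FakeKernelStream()

    # Steps 1-3: the "kernel" swaps sys.stderr, then a handler is created,
    # as happens when a library builds a StreamHandler at import time.
    sys.stderr = fake_stream
    try:
        handler = logging.StreamHandler()  # defaults to the current sys.stderr
    finally:
        sys.stderr = original_stderr  # the "kernel" restores the real stream

    captured = handler.stream is fake_stream

    # Step 4: the "kernel" machinery is stopped.
    fake_stream.running = False

    # Step 5: flushing the handler (as logging.shutdown does at exit) fails.
    try:
        handler.flush()
        flush_failed = False
    except RuntimeError:
        flush_failed = True

    return captured, flush_failed
```

The handler is kept local to the function so that it is discarded before interpreter exit and does not trigger the same failure from `logging.shutdown` in the demo itself.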
If the intent of the handler is to suppress the "No handlers could be found for logger ..." messages from the standard library, perhaps a `logging.NullHandler` could be used for that purpose instead? I'm happy to create a PR for this if the proposed change sounds acceptable.
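
A minimal sketch of what that split could look like, assuming a hypothetical library package named `examplelib` (this is not sklearn's actual code, and `library_function` and `configure_logging` are made-up names): the library attaches only a `NullHandler`, which never touches `sys.stderr`, and the application decides where output goes.

```python
import io
import logging

# Library side (hypothetical package "examplelib"): install only a no-op
# handler, so no stream is captured at import time.
lib_logger = logging.getLogger("examplelib")
lib_logger.addHandler(logging.NullHandler())


def library_function():
    """Hypothetical library call that emits a log message."""
    lib_logger.info("fitting model")
    return "done"


# Application side: the application chooses the stream, format, and level.
def configure_logging(stream):
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("%(name)s %(levelname)s %(message)s"))
    lib_logger.addHandler(handler)
    lib_logger.setLevel(logging.INFO)
    return handler


app_stream = io.StringIO()
app_handler = configure_logging(app_stream)
library_function()
lib_logger.removeHandler(app_handler)  # the application also owns teardown
```

With this arrangement the only handler that holds a stream is one the application created, so the application can remove or close it before tearing down whatever backs that stream.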