kernel: use structured logging and simplify logging interface#34374
kernel: use structured logging and simplify logging interface#34374stickies-v wants to merge 15 commits intobitcoin:masterfrom
Conversation
|
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers. Code Coverage & BenchmarksFor details see: https://corecheck.dev/bitcoin/bitcoin/pulls/34374. ReviewsSee the guideline for information on the review process.
If your review is incorrectly listed, please copy-paste ConflictsReviewers, this pull request conflicts with the following ones:
If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first. |
e1a58bc to
446f523
Compare
|
🚧 At least one of the CI tasks failed. HintsTry to run the tests locally, according to the documentation. However, a CI failure may still
Leave a comment here, if you need help tracking down a confusing failure. |
446f523 to
c20ff4c
Compare
c20ff4c to
0d075e3
Compare
src/util/log.h
Outdated
| //! Type for opaque handles returned by RegisterCallback(), used to unregister. | ||
| using CallbackHandle = std::list<Callback>::iterator; | ||
| //! Type for predicates called before logging; returns true if entry should be dispatched. | ||
| using FilterFunc = std::function<bool(Level level, uint64_t category)>; |
There was a problem hiding this comment.
Jumping through a std::function when we could just be checking an atomic bitfield doesn't seem like a good approach.
There was a problem hiding this comment.
when we could just be checking an atomic bitfield
I think that's only possible if we either 1) have util::log::Dispatcher expose all of BCLog::Logger's category/level setting logic, or 2) make Dispatcher::WillLog virtual and let each logging sink create its own Dispatcher subtype. Were you thinking of another approach?
I don't think either approach is warranted to justify the minimal overhead that std::function incurs. 1) would just make Dispatcher's scope way too big in my view, and 2) would require inheritance and still have some (although probably less) overhead from the virtual table lookup.
There was a problem hiding this comment.
I don't know that Dispatcher needs to make the category/levels modifiable -- that seems like something the g_dispatcher implementation could handle via marking them as protected in Dispatcher. They should be directly testable via Dispatcher, so that disabled debug/trace categories are trivial to test, without needing indirection.
I think your latest implementation (a4e8f3c) is dangerously wrong, btw -- LogInfo() etc currently always evaluate their arguments, whereas you're making that conditional on LogAcceptCategory which calls WillLog which is conditional on Enabled().
There was a problem hiding this comment.
I think your latest implementation (a4e8f3c) is dangerously wrong, btw --
LogInfo()etc currently always evaluate their arguments, whereas you're making that conditional onLogAcceptCategorywhich callsWillLogwhich is conditional onEnabled().
I didn't realize developer-notes.md says to only avoid relying on side-effects for LogDebug and LogTrace, I thought this was true for logging in general. I think it's very brittle that program execution potentially relies on side-effects in logging statements, and I think we should move away from that completely. A cursory search didn't show me any {Info,Warning,Error} statements that have such side-effects, but I agree that it's possible and that makes reviewing this PR much more difficult, so I'll revert that behaviour to minimize scope and leave room for a follow-up. Thanks for pointing this out.
They should be directly testable via Dispatcher, so that disabled debug/trace categories are trivial to test, without needing indirection.
I'm not sure I understand your concern here. They are testable via Dispatcher::WillLog(). Is it purely the indirection performance overhead you're worried about? When running the benches on main and on a4e8f3c (with small change to run this->Enabled() after this->WillLogCategoryLevel(static_cast<BCLog::LogFlags>(cat), level) which I'll include in next force-push, it doens't look like we have anything to worry about?
On master (34a5eca):
| ns/op | op/s | err% | total | benchmark
|--------------------:|--------------------:|--------:|----------:|:----------
| 1,837.04 | 544,353.35 | 2.6% | 0.01 | `LogWithDebug`
| 1,926.63 | 519,041.49 | 0.2% | 0.01 | `LogWithThreadNames`
| 2.81 | 355,468,404.18 | 3.1% | 0.01 | `LogWithoutDebug`
| 1,828.42 | 546,919.86 | 1.1% | 0.01 | `LogWithoutThreadNames`
| 18.31 | 54,600,726.18 | 1.1% | 0.01 | `LogWithoutWriteToFile`
On a4e8f3c (with small unpushed modification):
| ns/op | op/s | err% | total | benchmark
|--------------------:|--------------------:|--------:|----------:|:----------
| 1,826.40 | 547,524.98 | 0.7% | 0.01 | `LogWithDebug`
| 1,863.31 | 536,680.71 | 1.0% | 0.01 | `LogWithThreadNames`
| 4.23 | 236,235,773.60 | 0.4% | 0.01 | `LogWithoutDebug`
| 1,800.52 | 555,395.22 | 1.3% | 0.01 | `LogWithoutThreadNames`
| 19.92 | 50,188,600.76 | 1.0% | 0.01 | `LogWithoutWriteToFile`
There was a problem hiding this comment.
Latest force-push (dca56e0) reverts the unconditional argument evaluation for Info and higher levels, and adds a unit test to ensure behaviour before and after this PR remains the same. Thanks for raising this. Would be nice to remove logging side-effects entirely in a follow-up (probably just a documentation change and maybe some fix-ups).
There was a problem hiding this comment.
I didn't realize
developer-notes.mdsays to only avoid relying on side-effects forLogDebugandLogTrace, I thought this was true for logging in general. I think it's very brittle that program execution potentially relies on side-effects in logging statements,
Would be nice to remove logging side-effects entirely in a follow-up
The easiest way to ensure any potential/accidental side-effects don't make the behaviour brittle is to always evaluate the logging statements, which is what our code currently does -- it's similar logic as always evaluating assertions. Changing that is not a nice follow-up.
0d075e3 to
68681ee
Compare
Ensure all levels and categories that may be logged by kernel code are well defined. Necessary preparation for future work to move from string-based to struct-based logging.
Preparatory work for a future commit where the btck_LogCallback exposes btck_LogEntry instead of a string. Contains a lot of move-only changes, consider reviewing with --color-moved=dimmed-zebra
Replace the string-based logging callback with btck_LogEntry containing full metadata (message, source location, timestamp, level, category). This lets consumers format and filter logs as needed. Use the global log dispatcher instead of BCLog::Logger's buffered callback system. Messages are discarded when no callbacks are registered.
As logging is struct-based, there is no more need for the user to configure how logs are printed, so simplify the interface and remove dead code.
Logging categories are currently shared between node and kernel. This separation allows future commits to completely remove kernel's dependency on logging.h. Also applies clang-format suggestions to the moved code. Review with --color-moved=dimmed-zebra --color-moved-ws=ignore-all-space
For now, only logging.cpp implements LogAcceptCategory. In future commits, bitcoinkernel.cpp will implement its own levels-based functionality. Also add some test coverage on LogAcceptCategory.
Move logging macros to util/log.h so the entire codebase can use the same macros. Since (un)conditional logging is a node concept, it is no longer mentioned in the util macros. BCLog::Logger still maintains its logic to ensure the log level never drops below Info.
Preparation for a future commit where kernel's dependency on logging.cpp is removed completely. Replace usage of logging\.h with util/log\.h where it suffices, and fix wrong includes according to iwyu.
Because logging is now structured, bitcoinkernel users can implement their own filtering logic using the btck_LogEntry fields. Simplify the interface by replacing the category/levels setters with a single global btck_logging_set_min_level function. Implements a kernel-specific g_dispatcher with the levels-based filtering, so kernel's dependency on logging.cpp is now removed entirely.
a7a8728 to
ebe4409
Compare
|
re: #34374 (comment)
I implemented this in #34465 (after chatting with stickies in IRC). The changes in the first 4 commits were taken directly from this PR. Only the last commit adds a bit of new code. |
|
🐙 This pull request conflicts with the target branch and needs rebase. |
This is a minimal change allowing log generating functions in the new util/log.h header to attach timestamps to log messages alongside other log metadata like source location and thread name. Bigger upcoming changes in bitcoin#34374 update LogPrint functions to use a util::log::Entry struct instead of a list of parameters. This commit could go farther and make those changes now but is trying to stay mimimal and obvious.
37cc2a2 logging: use util/log.h where possible (stickies-v) bb8e9e7 logging: Move message formatting to util/log.h (stickies-v) 001f0a4 move-only: Move logging macros to util/log.h (stickies-v) 94c0adf move-onlyish: Move logging levels to util/log.h (stickies-v) 56d113c move-only: move logging categories to logging/categories.h (stickies-v) f5233f7 move-only: Move SourceLocation to util/log.h (stickies-v) Pull request description: This is a mostly move-only change. It's a small refactoring that allows logging macros to be used by including a simple `util/log.h` header instead of the full `logging.h` logging implementation. Most of the changes here were cherry-picked from #34374. Original motivation for this change was to reduce the size and complexity of #34374 (kernel structured logging PR) and reduce the number of conflicts it causes with other PRs. But this should also make sense as a standalone change to have a clearer separation of concerns between log generation and log handling, and avoid needing to depend on the whole logging framework in call sites that only emit log messages. Recommended to review with `--color-moved=dimmed-zebra --color-moved-ws=ignore-all-space` ACKs for top commit: l0rinc: diff ACK 37cc2a2 stickies-v: re-ACK 37cc2a2 optout21: crACK 37cc2a2 sedited: ACK 37cc2a2 Tree-SHA512: c7a761323ae63f07ad290d4e3766ba1348a132c8cc68a9895fa9ae5c89206599c00646c42ef77223ac757b9d8bfe6c181bead15e7058e4d8966b3bac88a8f950
test: verify log argument evaluation semantics Test that LogInfo/LogWarning/LogError always evaluate their arguments even when logging is disabled. ajtowns pointed out this behavior was important and could affect non-logging code if changed in bitcoin#34374 (comment)
Test that LogInfo/LogWarning/LogError always evaluate their arguments even when logging is disabled. ajtowns pointed out this behavior was important and could affect non-logging code if changed in bitcoin#34374 (comment)
Test that LogInfo/LogWarning/LogError always evaluate their arguments even when logging is disabled. ajtowns pointed out this behavior was important and could affect non-logging code if changed in bitcoin#34374 (comment)
Test that LogInfo/LogWarning/LogError always evaluate their arguments even when logging is disabled. ajtowns pointed out this behavior was important and could affect non-logging code if changed in bitcoin#34374 (comment)
Test that LogInfo/LogWarning/LogError always evaluate their arguments even when logging is disabled. ajtowns pointed out this behavior was important and could affect non-logging code if changed in bitcoin#34374 (comment) Co-authored-by: l0rinc <pap.lorinc@gmail.com>
|
Hi stickies, it'd be really nice to have this rebased & start review. Here is a rebase branch I made: https://github.com/ryanofsky/bitcoin/commits/review.34374.2-edit with minimal changes. There are more simplifications that can be made, but it's already much smaller and simpler after #34465 |
Test that LogInfo/LogWarning/LogError always evaluate their arguments even when logging is disabled. ajtowns pointed out this behavior was important and could affect non-logging code if changed in bitcoin#34374 (comment) Co-authored-by: l0rinc <pap.lorinc@gmail.com>
Rewrite log macros to fix a number of issues: lack of control over rate limiting, unnecessary strprintf calls during fuzzing, confusing error messages when macros are called with the wrong arguments. More specifically, benefits are: - Control over rate limiting: `LogPrintLevel_` hack used to bypass rate limiting is dropped. A `LogPrint` macro is added that can control rate limiting (as well as other options if they are added in the future). - Unnecessary `strprintf` calls are now skipped in bitcoind when `-noprinttoconsole -nodebuglogfile` options are used, and skipped in tests and kernel applications when logging is disabled. This change should not affect bitcoind noticeably, but it could speed up fuzz tests calling functions that log. - Clearer error messages: Differences between macros accepting and not accepting category arguments has been a source of confusion. Now if you pass a category to a macro which does not accept it, or forget to pass a category to a macro which requires it, you will see a clear error message telling you to add or remove the category instead of more confusing macro errors. - Previously you could call `LogPrintLevel_` to bypass restrictions on passing category arguments. Now `LogPrint` enforces all requirements. - Previously "always evaluate arguments" behavior at Info and higher levels looked accidental and was undocumented (bitcoin#34374 (comment)). Now the behavior is documented and explicit. - There is now a single macro, `LogPrint` implementing all logging functionality and enforcing all restrictions, instead of different sets of macros with branching code paths and unexplained differences. Co-Authored-By: Hodlinator <172445034+hodlinator@users.noreply.github.com> Co-Authored-By: stickies-v <stickies-v@protonmail.com>
tl;dr: kernel logging is cumbersome. This PR enables structured logging and a much simplified kernel logging interface. Closes #34062.
Motivation
The
bitcoinkernellibrary (#27587) exposes functionality to interface with the kernel logging. This includes registering callbacks for log statements, level/category filtering, string formatting options, and more.Kernel logging has a few problems:
btck_logging_set_level_categoryandbtck_logging_enable_categorywhen they want to producedebugortracelogs. The level/category system makes sense for node, because it directly controls what gets written to disk and stdout, and there are quite a lot more categories producing logs. Kernel doesn't really need this - users control what happens to the logs, and can do any filtering/manipulation in the callback they provide.Approach
Most of the logic in
logging.his not necessary forbitcoinkernel, so we first separate it intoutil/log.handlogging.h(common), then upgrade thebitcoinkernellogging interface, and finally remove thelogging.cppdependency from kernel entirely.util::log::Dispatcherthat is used to generate logs, and forwards structured log entries to registered callbacks. No formatting, rate-limiting, or output handling - that's the consumer's responsibility. Dispatcher can be configured with afilterpredicate (e.g.WillLogCategoryLevel()fornode, or a simpler levels-based filter forbitcoinkernel) to avoid unnecessary string formatting and other overhead.BCLog::Loggeron Dispatcher: Register a callback on the global Dispatcher to receive entries and handle node-specific concerns (formatting, rate-limiting, file output) without changing most of its logic.util/log.h: Allows the entire codebase (including kernel code) to use the same macros without depending onlogging.h.btck_LogEntrycontaining full metadata. Removebtck_LoggingOptionsand category enable/disable functions. Addbtck_logging_set_min_level()for early filtering.logging.cpptocommon: remove it as a kernel dependency entirely.Appendix
bitcoinkernelC logging interface