Conversation
This generalises the functionality introduced for PCP/NAT-PMP logging in bitcoin#34549.
This mostly reduces log noise by using BCLog::NET debug logging where possible (with `LogWarnThenDebug()` used to ensure the first log message still appears for things that may be due to system misconfiguration). Also drops some unnecessary trailing newlines.
|
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers. ReviewsSee the guideline for information on the review process.
If your review is incorrectly listed, please copy-paste ConflictsReviewers, this pull request conflicts with the following ones:
If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first. |
|
See #30348 for further background; I think the changes here should be able to stand on their own merits. I haven't addressed cases where there's ambiguity between LogWarning and LogError here, but if there are other current cases where things are at Warning/Error and should be Info, or at Warning/Error/Info and should be Debug, happy to add them to this PR. |
kevkevinpal
left a comment
There was a problem hiding this comment.
ACK c665935
This is a good improvement on the amount of functional code dedicated to logging. Also, the LogWarnThenDebug log will be useful in future cases to help avoid log noise
Added two comments about potentially adding BCLog::VALIDATION to the two validation logs touched
ryanofsky
left a comment
There was a problem hiding this comment.
Code review ACK c665935, Concept -0. Changes here seem logical overall, and I left comments the few places where I had concerns. The LogWarnThenDebug macro also seems cool and potentially useful more places.
The thing I don’t like about this change is that it makes it harder to see if messages are errors or warnings or normal notifications. Being able to see [error] and [warning] tags is a nice feature that is unfortunate to lose. 1
In an ideal world I would use LogWarning and LogError macros more places rather than fewer. I would also make LogWarning and LogError messages configurable and not always shown, and add a LogCritical level for truly critical messages that should always be shown even when debug logging is off.
That said, this PR seems fine overall: some positives, some negatives, and the code changes look ok.
Footnotes
-
Note that a message’s error/warning/info status is orthogonal to its priority. A message could describe a normal condition (info), an unexpected condition (warning), or a failure condition (error), and any of these messages could have high or low priority depending how important the information is, whether the condition is temporary or permanent, what side effects it could lead to, and whether it might require manual intervention. ↩
| LogDebug(BCLog::NET, "SOCKS5 connecting %s\n", strDest); | ||
| if (strDest.size() > 255) { | ||
| LogError("Hostname too long\n"); | ||
| LogDebug(BCLog::PROXY, "Hostname too long for SOCKS5 proxying: %s", strDest); |
There was a problem hiding this comment.
In commit "netbase: Tidy up logging levels" (436bc94)
Maybe prefix this with Error: to indicate a problem has occurred and distinguish this from informational debug messages
Would suggest the same for other some messages below that aren't clearly labeled as error or failures.
| uint8_t pchRet1[2]; | ||
| if (InterruptibleRecv(pchRet1, 2, g_socks5_recv_timeout, sock) != IntrRecvError::OK) { | ||
| LogInfo("Socks5() connect to %s:%d failed: InterruptibleRecv() timeout or other failure\n", strDest, port); | ||
| LogDebug(BCLog::PROXY, "Socks5() connect to %s:%d failed: InterruptibleRecv() timeout or other failure", strDest, port); |
There was a problem hiding this comment.
In commit "netbase: Reduce levels of socks5 error logging" (7567471)
If this is an error communicating with the proxy server (as opposed to an error with the proxy server making a remote connection), it would seem potentially important for the user to know about so they can fix the problem with the server. Hiding this could make problems harder to diagnose, and it might be better to keep the previous level
| // Set the non-blocking option on the socket. | ||
| if (!sock->SetNonBlocking()) { | ||
| LogInfo("Error setting socket to non-blocking: %s\n", NetworkErrorString(WSAGetLastError())); | ||
| LogWarnThenDebug(BCLog::NET, "Error setting socket to non-blocking: %s", NetworkErrorString(WSAGetLastError())); |
There was a problem hiding this comment.
In commit "netbase: Tidy up logging levels" (436bc94)
This change seems ok, but if the goal in the commit message is to use higher levels for errors "that may be due to system misconfiguration" I don't know why this socket error would be more likely to be the result of misconfiguration than SO_NOSIGPIPE above, or if something else account for the difference like this error being considered more severe?
There was a problem hiding this comment.
The SO_NOSIGPIPE error is marked as "continuing anyway", rather than immediately failing the connection, which seems fairly straightforwardly less severe.
| CScript redeemScript(rsData.begin(), rsData.end()); | ||
| tempKeystore.AddCScript(redeemScript); | ||
| if (!tempKeystore.AddCScript(redeemScript)) { | ||
| throw std::runtime_error(strprintf("Error adding redeemscript=%s", HexStr(redeemScript))); |
There was a problem hiding this comment.
In commit "script: Lower level of oversized redeemScript log, push errors to callers" (c02d589)
It seems good to throw an error here though text of this message doesn't seem ideal. Redeemscript blob is potentially large and earler logged message seems more helpful (though I'm not sure if logs are shown in bitcoin-tx so it might also be useless). Might want to add another TODO comment here to improve
| auto sock = CreateSock(AF_UNIX, SOCK_STREAM, 0); | ||
| if (!sock) { | ||
| LogError("Cannot create a socket for connecting to %s\n", m_unix_socket_path); | ||
| LogWarnThenDebug(BCLog::PROXY, "Cannot create a socket for connecting to %s", m_unix_socket_path); |
There was a problem hiding this comment.
In commit "netbase: Reduce levels of socks5 error logging" (7567471)
I’m not sure LogWarnThenDebug() is the best fit here. Compared to the earlier uses in this commit and the previous one, this seems less like a persistent configuration problem and more like something that could be intermittent, e.g. due to hitting resource limits.
In that case, warning exactly once and then may hide recurring failures. It would seem better to have a rate limit and show the error once per hour or once per day while it is still occurring, not to just show it once and never again for the lifetime of the process. Absent that, keeping this at info/warning/error level might be safer.
| auto sock = CreateSock(dest.GetSAFamily(), SOCK_STREAM, IPPROTO_TCP); | ||
| if (!sock) { | ||
| LogError("Cannot create a socket for connecting to %s\n", dest.ToStringAddrPort()); | ||
| LogDebug(BCLog::NET, "Cannot create a socket for connecting to %s", dest.ToStringAddrPort()); |
There was a problem hiding this comment.
In commit 436bc94
I'm a bit confused when you are choosing Debug and when WarnThenDebug in this commit. What is the difference between these cases?
There was a problem hiding this comment.
Proxy::Connect failing means you can't connect to the proxy you've configured, which seems likely to be a fairly serious problem for your node, and that error is likely the thing you need to see; ConnectDirectly failing might also be a severe problem, but it might also just be that you're trying to connect over ipv6 because your node says "why not?" but your OS says "no way", which is not really problematic at all. Note that CreateSock is CreateSockOS which mostly has LogWarnThenDebug before it returns nullptr as well.
Not particularly wedded to any particular choices here, these were just my best guesses.
There was a problem hiding this comment.
Thanks for elaborating. Makes sense to me.
This set of changes reduces some of the exaggerated logging that was introduced in #29236 when "error()" logging was upgraded from
LogPrintf(akaLogInfo) toLogError. It also introducesLogWarnThenDebug()which duplicates the behaviour of #34549 which issues a warning the first time the log line is hit, but downgrades that to categorized debug messages afterwards. This function is used for failures that are likely caused by system configuration problems (so likely degrade the node's functionality and warrant admin attention) but are triggered by remote activity and thus could occur frequently.