Skip to content

Conversation

@hebasto
Copy link
Member

@hebasto hebasto commented May 5, 2020

Tracking our instrumented mutexes (Mutex and RecursiveMutex types) requires that all involved objects should not be destroyed until after their last use. On master (ec79b5f) we have two problems related to the object destroying order:

  • the function-local static lockdata object that is destroyed at program exit
  • the thread_local g_lockstack that is destroyed at thread exit

Both cases could cause UB at program exit in so far as mutexes are used in other static object destructors.

Fix #18824

@hebasto
Copy link
Member Author

hebasto commented May 5, 2020

cc @sipa

@DrahtBot DrahtBot added the Docs label May 5, 2020
Copy link
Member

@maflcko maflcko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK

@hebasto
Copy link
Member Author

hebasto commented May 5, 2020

Updated 5a406f7 -> 7e3f40d (pr18881.01 -> pr18881.02, diff):

not sure why auto needs to hide both the type and the pointer-ness. The additional 5 chars don't seem overly verbose.

@hebasto
Copy link
Member Author

hebasto commented May 5, 2020

Updated 7e3f40d -> a223407 (pr18881.02 -> pr18881.03, diff):

approach guarantees

Copy link
Member

@jonatack jonatack left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Concept ACK/ACK

@hebasto
Copy link
Member Author

hebasto commented May 5, 2020

Updated a223407 -> d3c2686 (pr18881.03 -> pr18881.04, diff):

@DrahtBot DrahtBot added the Docs label May 5, 2020
Copy link
Member

@jonatack jonatack left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Iterative re-review (apologies)

@hebasto
Copy link
Member Author

hebasto commented May 5, 2020

Updated d3c2686 -> 86ce03c (pr18881.04 -> pr18881.05, diff):

Copy link
Member

@jonatack jonatack left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK


Both cases lead to undefined behavior.

To prevent the static initialization order problem, be sure to Construct On First Use:
Copy link
Member

@laanwj laanwj May 7, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think you should make this suggestion in general. This solution, while valuable in rare cases such as logging and tracking locks, is a brute-force one that does not belong in actual application code. It's messy and assumes we don't actually know what's going on (in regard to initialization dependencies) or what are all the edge cases.

You're supposed to expliclty allocate things in a deterministic order during initialization and destroy it in a (usually reversed) deterministic order on shutdown in the appropriate functions in init.cpp, not rely on global initialization order nor first use.

@laanwj
Copy link
Member

laanwj commented May 7, 2020

ACK on using this solution in this specific place but NACK on suggesting its use in general.

@jonatack
Copy link
Member

jonatack commented May 7, 2020

looked more deeply into this to try to understand it better

I didn't find a direct reference to a "COFU" or "Construct on First Use" idiom in any of these books. The best references online seemed to be:

and I agree with not suggesting it in general.

@hebasto
Copy link
Member Author

hebasto commented May 7, 2020

Updated 86ce03c -> 6bc6868 (pr18881.05 -> pr18881.06, diff):

  • only the fix of UB in the DeleteLock() function remains

@maflcko
Copy link
Member

maflcko commented May 7, 2020

ACK 6bc6868

Copy link
Member

@jonatack jonatack left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Light code review ACK 6bc6868, this commit appears to be an implementation of the solution in https://isocpp.org/wiki/faq/ctors#static-init-order-on-first-use-members

@hebasto
Copy link
Member Author

hebasto commented May 7, 2020

Light code review ACK 6bc6868, this commit appears to be an implementation of the solution in https://isocpp.org/wiki/faq/ctors#static-init-order-on-first-use-members

To be exact: https://isocpp.org/wiki/faq/ctors#static-init-order-on-first-use

@jonatack
Copy link
Member

jonatack commented May 7, 2020

@maflcko maflcko added Refactoring and removed Docs labels May 12, 2020
@maflcko
Copy link
Member

maflcko commented May 19, 2020

As we are not going to destroy LockData instance, this requirement is not relevant

Why not? Assuming ~LockData had side effects, then not calling the destructor is going to miss those side effects.

@DrahtBot
Copy link
Contributor

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Conflicts

Reviewers, this pull request conflicts with the following ones:

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

@maflcko
Copy link
Member

maflcko commented May 19, 2020

And if this was somehow "incorrect" then LogInstance should also be adjusted to be consistent

@hebasto
Copy link
Member Author

hebasto commented May 20, 2020

I'll try to elaborate my opinion :)

// This is only safe to do because the *lock_data object and its subobjects happen to have trivial destructors.

@MarcoFalke

What do you mean with "technically incorrect"? https://en.cppreference.com/w/cpp/language/destructor#Trivial_destructor

  1. I mean that among LockData type and its subtypes only using LockPair = std::pair<void*, void*>; and std::mutex have trivial desctructors. This could be easily verified with the std::is_trivially_destructible type trait.

An implicitly-defined destructor, i.e., generated by the compiler, is not the same as a trivial one.

  1. Agree that if we are going do not destruct a heap-allocated object until the program termination, it is pointless to provide a user-defined destructor with any side effects. If this statement is required to be added to the code as a comment, that comment should refer to "not having an explicit desctructor" or "having a implicit desctructor", but not to "a trivial desctructor".

And if this was somehow "incorrect" then LogInstance should also be adjusted to be consistent

The only difference is that the function-local static g_logger is a raw pointer that, in theory, should be deleted.

In this PR the function-local static lock_data is a reference, and this approach explicitly shows our intention do not delete the heap-allocated object at all.

If the above arguments are convincing for our community, I agree that making g_logger a reference is good for consistency.

@maflcko
Copy link
Member

maflcko commented May 22, 2020

If this statement is required to be added to the code as a comment, that comment should refer to "not having an explicit desctructor" or "having a implicit desctructor", but not to "a trivial desctructor".

The statement has been added to the logger, so I don't see why it shouldn't be added here.

reference, and this approach explicitly shows our intention do not delete the heap-allocated object at all.

I wasn't aware that this style is used to indicate that delete will not be applied, but fine with me. Functionally * and & are identical.

#include <iostream>

struct A {
  A() { std::cout << __func__ << std::endl; }
  ~A() { std::cout << __func__ << std::endl; }
};

int main() {
  std::cout << "start " << __func__ << std::endl;
  static A a;
  static A *b = new A();
  static A &c = *new A();
  std::cout << "end   " << __func__ << std::endl;
}
start main
A
A
A
end   main
~A

@maflcko
Copy link
Member

maflcko commented May 22, 2020

re-ACK 26c093a only change since last review is removing undefined behaviour 🚗

Show signature and timestamp

Signature:

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512

re-ACK 26c093a9957756f3743c2347fe0abd90f81159c4 only change since last review is removing undefined behaviour 🚗
-----BEGIN PGP SIGNATURE-----

iQGzBAEBCgAdFiEE+rVPoUahrI9sLGYTzit1aX5ppUgFAlwqrYAACgkQzit1aX5p
pUifmwwAs+llsO+ADQ05xZMwCoY0YQ5Zi4w65QVdoRIq9rrajfvDOSyJT09/p2d/
SiJN1SbB1PWFpxW0xPpT9CfAs1AhSnMGtEtm7xsllv5m/X+lsDPHqgsbN0DcfUeE
Q3CnDFklxI7JUOJkcHusmRB9ETxWRJPQ0Hx79OGImUbubdB32rLdsSqfw4oraFzn
bUY1TV76e+IB37zbYiCwiTr1EUOBu1DUV7FBr6AycYRrKMWXO2lPAGqlrOD+Orwo
fyWA2RAcVUbItktWDfy2D9zB8m2DT/HUYr/3DY7P7J6SotirPYQAbaHQO51K4NAo
PAAss/uwkX1zpXhP3hxgLBtKvixOTvZANHf6ch1dSvRrz3XyY0xCT0Z4XuL7dCJk
aGE6AcLzer5h46dfM2Bu3exFSRSaSvqLYks0Jv+b50m9U25+2W7XvJ4IhPovPlpY
/DiOQwMX1DteOd4h9PHSFl6BTLr8eDzPnNDpTue4EzQxXV7BtG++d3nyI1Uhp8OX
VVDT2gg9
=UHOn
-----END PGP SIGNATURE-----

Timestamp of file with hash 27744a6864db90d25a36be70b128da2a91330c24db6afd49a7dd4c0d2c53111c -

@hebasto
Copy link
Member Author

hebasto commented May 22, 2020

@MarcoFalke

If this statement is required to be added to the code as a comment, that comment should refer to "not having an explicit desctructor" or "having a implicit desctructor", but not to "a trivial desctructor".

The statement has been added to the logger, so I don't see why it shouldn't be added here.

bitcoin/src/logging.cpp

Lines 25 to 26 in b5c423c

* Since the destructor is never called, the logger and all its members must
* have a trivial destructor.

That is not the case in so far as the Logger class has no trivial destructor. One could check this with a patch:

--- a/src/logging.cpp
+++ b/src/logging.cpp
@@ -8,6 +8,8 @@
 #include <util/time.h>
 
 #include <mutex>
+#include <type_traits>
+static_assert(std::is_trivially_destructible<BCLog::Logger>::value, "Logger has no trivial ctor.");
 
 const char * const DEFAULT_DEBUGLOGFILE = "debug.log";

and compile with an error:

logging.cpp:12:1: error: static assertion failed: Logger has no trivial ctor.

static LockData lockdata;
return lockdata;
// This approach guarantees that the object is not destroyed until after its last use.
// The operating system automatically reclaims all the memory in a program's heap when that program exits.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// The operating system automatically reclaims all the memory in a program's heap when that program exits.
// The operating system automatically reclaims all the memory in a program's heap when that program exits.
// Since the destructor is never called, the object and all its members must have an implicit destructor.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hebasto
Copy link
Member Author

hebasto commented May 22, 2020

Updated 26c093a -> 90eb027 (pr18881.14 -> pr18881.15, diff):

@maflcko
Copy link
Member

maflcko commented May 22, 2020

re-ACK 90eb027, only change is new doc commit 👠

Show signature and timestamp

Signature:

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512

re-ACK 90eb027204, only change is new doc commit 👠
-----BEGIN PGP SIGNATURE-----

iQGzBAEBCgAdFiEE+rVPoUahrI9sLGYTzit1aX5ppUgFAlwqrYAACgkQzit1aX5p
pUjBJQv8D4l/8DutA799ti+iOrxhVDJh48fWd71nn2WKqcyVFJ9Y6G18v2ASlNOz
SrVnXhIUv/8jWgT5kOlqNw+UBMSLxNvv5NvYkchjQNYvHe5UfMrS8kWSqC6BYubc
UuPt+g6wV/Y+I9umkuH46Pk26iwqmVTJh86WbA1syjclNTCm1PEfs6tEbxB5bn7+
fBJQBQaV4J886hy1tRGK1FDNGQ93pNg88fgHdyoVMFsA4kP6utK/E93WshfV238i
Q9d3jiZSVzX5rZSId3Z3cYJMhGcXqC76ZT8tl+v3IX7ErS4Rf+AFyp813L/eWpDD
mHkImUprvIUtNkkZkjBPfAFeKWWG0gsLcfGU4/e+TAZ7R4vIm4v6fVmcunY5WEbE
64Qm5nq7LYOeOXi8XvvHug2l3zS67ZoTeEtGJhEiF0+rcqTgsv/bACr8FksBheu9
SZxUADkbahNnUv+22sEd+Nj2k50dGJOev/H50od1/gvt0E8vydUSz/P+g9NfXJyo
uiKec1Pw
=oyNZ
-----END PGP SIGNATURE-----

Timestamp of file with hash 5124182a0b09070ce2424a477f0c8ebc1bc2c756ae24de2a66e6948ffd673c33 -

@maflcko
Copy link
Member

maflcko commented May 22, 2020

Would be nice to get this merged soon, so that I don't have to pull in this branch every time for fuzzing.

Copy link
Contributor

@ryanofsky ryanofsky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Concept ACK +0.5. The DeleteLock fix makes sense and I really like the code cleanups, but I don't understand the thread_local change.

Code review ACK 90eb027 because all the changes look correct and safe. But I don't know the purpose of commit 26c093a "Replace thread_local g_lockstack with a mutex-protected map (5/6)." It seems like it could have a bad impact on debug performance, and the commit message and PR description don't give a reason for the change.

re: PR Description #18881 (comment)

This PR:

  • fixes #18824
  • adds Construct On First Use idiom documentation

Related to #15233.

Would suggest updating the PR description to:

  • Summarize what the UB behavior is so it isn't necessary to read a different github issue
  • Give a rationale for replacing the thread_local in the last commit
  • Drop the construct idiom text if it's no longer relevant
  • Link to #18824 and #15233 at the bottom, and clarify if only the first commit is needed to fix bug #18824, or if the thread_local commit is needed too

@maflcko
Copy link
Member

maflcko commented May 22, 2020

Several cases of UB are fixed here.

@ryanofsky
Copy link
Contributor

Several cases of UB are fixed here.

Thanks, I definitely think the PR description needs to be improved here and summarize what it is changing and give a clue about how the thread_local change fixes the second problem. If we don't know why the thread local change fixes it, the PR description should say that and not try to be mysterious

@hebasto
Copy link
Member Author

hebasto commented May 22, 2020

@ryanofsky

Would suggest updating the PR description to:

Thank you for review. The PR description has been updated.
Sometimes it seems the C++ is much easier to express ideas than English :)

@ryanofsky
Copy link
Contributor

re-ACK. Thanks for updating the PR description, this looks good to me. I'm pretty sure there should be a more efficient way to stop referencing the thread_local after it is destroyed (perhaps just resetting the reference in the thread_local destructor) without replacing the thread_local with a map with frequently added & removed entries. But this only affects debugging and probably isn't a big deal.

@maflcko
Copy link
Member

maflcko commented May 22, 2020

I am planning to merge this next week unless there are objections or outstanding action items

@ryanofsky
Copy link
Contributor

I am planning to merge this next week unless there are objections or outstanding action items

👍 and seems fine to merge earlier, too. None of this code is even compiled unless debugging is turned on.

@maflcko maflcko merged commit fe1357a into bitcoin:master May 26, 2020
@hebasto hebasto deleted the 200505-cofu branch May 26, 2020 12:52
sidhujag pushed a commit to syscoin/syscoin that referenced this pull request May 27, 2020
90eb027 doc: Add and fix comments about never destroyed objects (Hennadii Stepanov)
26c093a Replace thread_local g_lockstack with a mutex-protected map (Hennadii Stepanov)
58e6881 refactor: Refactor duplicated code into LockHeld() (Hennadii Stepanov)
f511f61 refactor: Add LockPair type alias (Hennadii Stepanov)
8d8921a refactor: Add LockStackItem type alias (Hennadii Stepanov)
458992b Prevent UB in DeleteLock() function (Hennadii Stepanov)

Pull request description:

  Tracking our instrumented mutexes (`Mutex` and `RecursiveMutex` types) requires that all involved objects should not be destroyed until after their last use. On master (ec79b5f) we have two problems related to the object destroying order:
  - the function-local `static` `lockdata` object that is destroyed at [program exit](https://en.cppreference.com/w/cpp/utility/program/exit)
  - the `thread_local` `g_lockstack` that is destroyed at [thread exit](https://en.cppreference.com/w/cpp/language/destructor)

  Both cases could cause UB at program exit in so far as mutexes are used in other static object destructors.

  Fix bitcoin#18824

ACKs for top commit:
  MarcoFalke:
    re-ACK 90eb027, only change is new doc commit 👠
  ryanofsky:
    Code review ACK 90eb027 because all the changes look correct and safe. But I don't know the purpose of commit  26c093a "Replace thread_local g_lockstack with a mutex-protected map (5/6)." It seems like it could have a bad impact on debug performance, and the commit message and PR description don't give a reason for the change.

Tree-SHA512: 99f29157fd1278994e3f6eebccedfd9dae540450f5f8b980518345a89d56b635f943a85b20864cef087027fd0fcdb4880b659ef59bfe5626d110452ae22031c6
jasonbcox pushed a commit to Bitcoin-ABC/bitcoin-abc that referenced this pull request Sep 22, 2020
Summary:
 * refactor: Add LockStackItem type alias

 * refactor: Add LockPair type alias

 * refactor: Refactor duplicated code into LockHeld()

 * Replace thread_local g_lockstack with a mutex-protected map

This change prevents UB in case of early g_lockstack destroying.

Co-authored-by: Wladimir J. van der Laan <laanwj@protonmail.com>

 * doc: Add and fix comments about never destroyed objects

This is a backport of Core [[bitcoin/bitcoin#18881 | PR18881]]

Test Plan:
  ninja all check-all

Also in debug mode.

Reviewers: #bitcoin_abc, majcosta

Reviewed By: #bitcoin_abc, majcosta

Subscribers: majcosta

Differential Revision: https://reviews.bitcoinabc.org/D7525
PastaPastaPasta pushed a commit to PastaPastaPasta/dash that referenced this pull request Oct 12, 2021
pravblockc pushed a commit to pravblockc/dash that referenced this pull request Nov 18, 2021
@bitcoin bitcoin locked as resolved and limited conversation to collaborators Feb 15, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Possible UB in DeleteLock() function

8 participants