Skip to content

[leap5] clear the DB dirty bit if pinnable_mapped_file fails to fully start#38

Merged
spoonincode merged 3 commits intoleap-5.0from
reset_dirty_leap5
Jan 24, 2024
Merged

[leap5] clear the DB dirty bit if pinnable_mapped_file fails to fully start#38
spoonincode merged 3 commits intoleap-5.0from
reset_dirty_leap5

Conversation

@spoonincode
Copy link
Contributor

Clear the DB dirty bit if pinnable_mapped_file throws during ctor after it has set the bit to dirty. Resolves AntelopeIO/leap#2086


template<typename F>
struct scope_fail {
scope_fail(F&& f) : _f{static_cast<F&&>(f)}, _exception_count{std::uncaught_exceptions()} {}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I realize you just copied this here, but why does it use static_cast instead of std::move ? Also seems like we should delete the copy constructor and assignement operator.

@greg7mdp
Copy link
Contributor

I feel it would be much simpler to not set the dirty bit at all in the constructor except when we are in "mapped" mode.

@greg7mdp
Copy link
Contributor

I don't think this implementation would work for mapped_private, as it would clear the dirty bit in the private mapping, so it would not change the file on disk.
We would need to do _file_mapped_region = bip::mapped_region(); before calling set_mapped_file_db_dirty(false);

@spoonincode
Copy link
Contributor Author

I feel it would be much simpler to not set the dirty bit at all in the constructor except when we are in "mapped" mode.

I think that alone introduces a new defect for the other modes because currently dirty isn't set when writing out data. So you'd need to do it then. Regardless, I am not sure the upside of this since it doesn't do anything more helpful for the user anyways, for example,

cp shared_memory.bin shared_memory.bin.orig
nodeos for a bit then SIGKILL it
cp shared_memory.bin.orig shared_memory.bin
restart nodeos: No existing fork database despite existing chain state. Replay required.

So the least complex approach is to just be consistent and blanket do it in a simple way: dirty in ctor, clean in dtor. Current complexity comes about more so due to the code structure and approach imo.

I don't think this implementation would work for mapped_private, as it would clear the dirty bit in the private mapping, so it would not change the file on disk. We would need to do _file_mapped_region = bip::mapped_region(); before calling set_mapped_file_db_dirty(false);

There are some unit tests added that show this working for the defect being resolved.

It's not pretty but what seems to happen is that when in writable mapped_private mode,

_file_mapped_region = bip::mapped_region(); // delete old r/w mapping before creating new one
setup_copy_on_write_mapping();

and then,
_file_mapped_region = bip::mapped_region(_file_mapping, bip::copy_on_write);

But it's on this line that the exception will be thrown so _file_mapped_region is left 'empty'. So when,
void pinnable_mapped_file::set_mapped_file_db_dirty(bool dirty) {
assert(_writable);
if (_file_mapped_region.get_address() == nullptr)
_file_mapped_region = bip::mapped_region(_file_mapping, bip::read_write, 0, _db_size_multiple_requirement);

The mapping to the real disk file is recreated before setting the dirty bit.

Of course, some other exception after line 236 and the end of the ctor would cause the problem you're describing, but I guess currently that isn't possible since setup_copy_on_write_mapping() is effectively the end of the ctor for mapped_private. Still, my impression is that it's probably safe to add the _file_mapped_region = bip::mapped_region(); to account for future.

@greg7mdp
Copy link
Contributor

greg7mdp commented Jan 17, 2024

Yes, I agree the issue I raised is not valid and the existing code is fine (discounting the complexity)
Also thanks for explaining the fork_db issue, I had missed that.

I wonder why we remove the fork_db disk file after loading it (I would suggest to leave it except when chainbase is in mapped mode)? it seems inconsistent with what we are doing with chainbase, and if we didn't, my suggestion would work and improve crash recovery I think.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

When mapped_private doesn't work, database is left dirty

3 participants