Conversation
cheme
left a comment
There was a problem hiding this comment.
Started looking. Generally it feels that the recorder and cache are less independent (I mean recorder act as a cache a bit).
I am not totally sure why the deterministic change is needed ( I imagine the usecase is to follow the actual proof size without calling extract_proof, seems more straight forward this way anyway).
Co-authored-by: cheme <emericchevalier.pro@gmail.com>
Co-authored-by: cheme <emericchevalier.pro@gmail.com>
Co-authored-by: David <dvdplm@gmail.com>
Co-authored-by: David <dvdplm@gmail.com>
Co-authored-by: David <dvdplm@gmail.com>
Co-authored-by: David <dvdplm@gmail.com>
Co-authored-by: Andronik <write@reusable.software>
Co-authored-by: Andronik <write@reusable.software>
|
@cheme I'm open to work on some improvements to this pr in some follow ups, to make some stuff more easier etc. However, I would really finally get to the point where this is merged. There has been a lot of discussion around this and also a lot of downstream projects would finally like to use a working cache in Substrate. There are also things like iteration that still don't use the cache, but that was done on purpose to come to some end. So, please give this again some final review to get this merged. |
|
@bkchr , what do you think of https://github.com/paritytech/trie/compare/bkchr-funny-branch...cheme:bkchr-funny-branch3?expand=1 like if we could avoid the new struct, it would be great. |
As already talked in person, I would like to move forward with this. There are several refactorings that could be done on top of this. Stuff like the proposed merging of That said, @cheme could you please give the final approval :) |
|
Will pass again on the full pr tomorrow first thing in the morning. |
Ty :) |
cheme
left a comment
There was a problem hiding this comment.
So the PR basically allow caching instantiating Node and add a key value cache.
Speed up against simple encoded trie node cache should come from not decoding multiple time the node.
Instantiated node being a internal trie structure, it looks indeed better to have this in trie crate (still a rather big update, so the speed improvment must be worth it I don't remember well the number). Note that this trie cache access is the same as node recording and I feel like both abstraction should be united somehow.
Having key value cache into trie crate is more discutable to me: my opinion is that it should be managed at substrate level. The thing it allows here is to weak ref over the trie node cache, but should still be doable by adding a new param in get_or_insert_node: an optional key when the node is containing a value (also optional hash is needed when hash is known). This way the trie cache could put the weak reference to the substrate key value cache.
There will be an issue for inline node: but in this case we can copy the value (eg just use this weak reference for value bigger than 32 byte) and only cache explicitely accessed content. Recorded key could still check by aaccessing the substrate key value cache.
But seems like we are at a point where it will be good to move forward with this PR and ignore concerns at first(assuming that the speed improvment over simple encoded trie node cache is worth it).
Certainly the current technical debt of the trie crate is quite high, the PR will make the situation worse. So I plan on creating the issue discussed (see draft end of this comment) in this review.
Wether we should push forward to implement them is a different question as this is rather a bit of work, and future direction could be:
- keep up with this trie crate
- split from ethereum
- remove rocksdb specific when it get no longer supported by substrate
- improve code (linked issues and possibly rewrite part of triedbmut and other suited refact).
- rewrite. Would not make sense until two first item from previous directions are not reached. So does not sound very useful (except if we want more radix support, but I remember it being fairly easilly implementable on current crate).
- look in other directions (I am thinking of radix tree index in parity db with merkle hash attachment).
I did spend a bit in the third direction previously (got good read speed improvement on some poc branch that only run a single state), can be a bit ambitious, so would probably be better in parallel to first direction (just maybe not investing too much on the code improvement in this case).
This PR is rather impacting so would be good to have a second review, I can only think of @arkpar.
Draft of issues to create related to these changes:
- Use as single in memory node representation
NodeOwned from triedbmut and OwnedNode are two representation of node in memory, making code very redundant and leading to unneeded type convertion.
A single representation should be use.
Branch https://github.com/cheme/trie/commits/bkchr-funny-branch3 starts implementing it but contains other unrelated changes and lack tests and polish, but can be a good starting point.
- Add cache to triedb iterator
Cache update as for triedbmut should be optional.
Additionally all hash-db accesses should be decorated with cache accesses and trie node recording in a systematic way.
- issue merge cache and recorder
Cache and recorder are a bit redundant, both are on top of hash-db access in implementation there is need to check recorder before accessing value cache.
Both should be under a same struct (branch starts it), or using a same trait.
Note that it may make sense to have a cache/recorder that do not write cache (eg with for triedbmut and substrate, these are basiaclly all accessed from previous read and not really needed to refresh cache, but still must have recorded).
Branch https://github.com/cheme/trie/commits/bkchr-funny-branch go into the trait direction, but using a single internal struct may be more suitable (no api changes).
- remove key value cache
key value cache could be manage outside of trie cache with a slightly less fine inline value caching.
- issue redundancy lookup.rs
after [issue merge cache and recorde] get solved, the code in lookup.rs could certainly be factored to avoid redundancy.
Simply always passing around the cache/recorder.
Also removing value cache as in [issue remove key value cache] will in itself remove part of the redundancy here.
Co-authored-by: cheme <emericchevalier.pro@gmail.com>
trie-db/src/node.rs
Outdated
| } | ||
| } | ||
|
|
||
| /// Returns the size in bytes of this node. |
| /// A record of a visited node. | ||
| #[cfg_attr(feature = "std", derive(Debug))] | ||
| #[derive(PartialEq, Eq, Clone)] | ||
| pub struct Record<HO> { |
There was a problem hiding this comment.
nit: I'd keep the struct. Code using it looks more readable
This pr introduces a trie level cache and recorder. The cache is mainly useful to speed up the access to the trie. The recorder inside the trie is required to be able to record proofs while having a cache. For a full reasoning see the following Substrate pr: paritytech/substrate#11407
The first thing this pr is doing is the introduction of the builder pattern for constructing a
TrieDBorTrieDBMutinstance. This makes it relative easy to construct one of these instances.The second big change in this pr was the introduction of the
NodeOwnedtype and all other "owned" types. These are required for the cache to keep the data in memory. Before theNodewas only operating on the slices of bytes, which resulted in constant decoding of data for every access to the trie. TheNodeOwnedtype decodes a node once and then keeps it in the memory. Alongside this type we also introduced a simple wrapper around bytes, theBytestype. It is basically aArc<[u8]>. With thisByteswe also added theBytesWeaktype. This is especially important for theCachedValuetype to hold the value without keeping a strong reference on it. ThisCachedValuetype is used to store values inside the cache.The third big change was the introduction of
get_hash. This function enables an user to get the hash of a value. Together with the cache it means that we can return this hash without needing to calculate it.The
TrieCacheandTrieRecorderare both traits that needs to be implemented by down stream users. For the trie recorder there exists one simple implementation inside thetrie-dbcrate. This implementation works and can be used, but for Substrate there exists a much more sophisticated implementation to provide better performance etc. As the recorder is now on the trie level, it can be that we missed some situations to record an access. However, for Substrate we support all required code paths (hopefully).The
TrieCacheis expected to storeNodeOwnedandCachedValue. To ensure that we don't waste memory and that theNodeOwnedis staying the owner of the values,CachedValueonly stores aBytesWeak. This enables a downstream implementation (like Substrate) to have a bounded cache implementation (bounded by maximum memory usage). The idea is that theNodeOwnedare in a LRU cache and when they are being evicted, the correspondingCachedValuecan still be in a different LRU cache, but not holding the entire value in memory.