ARROW-12986: [C++][Gandiva] Implement new cache eviction policy algorithm in Gandiva#10465
Closed
jpedroantunes wants to merge 39 commits intoapache:masterfrom
Closed
ARROW-12986: [C++][Gandiva] Implement new cache eviction policy algorithm in Gandiva#10465jpedroantunes wants to merge 39 commits intoapache:masterfrom
jpedroantunes wants to merge 39 commits intoapache:masterfrom
Conversation
e1bb1fc to
d94618d
Compare
fa13d02 to
92f5984
Compare
Contributor
Author
|
@augustoasilva can you review, please? |
augustoasilva
suggested changes
Jun 23, 2021
Contributor
augustoasilva
left a comment
There was a problem hiding this comment.
For me it is almost all good, just review this points that I've marked and evaluate them.
92f5984 to
e1911e7
Compare
projjal
reviewed
Jun 25, 2021
35a3812 to
1083f91
Compare
1083f91 to
92708bc
Compare
projjal
reviewed
Jul 12, 2021
…eys for compatibility with the new cache
…edy-dual-size algorithm
92708bc to
213b74a
Compare
projjal
reviewed
Jul 12, 2021
projjal
reviewed
Jul 13, 2021
projjal
reviewed
Jul 13, 2021
projjal
approved these changes
Jul 13, 2021
jvictorhuguenin
pushed a commit
to s1mbi0se/arrow
that referenced
this pull request
Sep 21, 2021
…ithm in Gandiva This PR replaces the LRU based cache for gandiva with a new cache which takes into account the LLVM build time along with the LRU factor. Here is a description of the suggested algorithm: ``` // A particular cache based on the GreedyDual-Size cache which is a generalization of LRU // which defines costs for each cache values. // The algorithm associates a cost, C, with each cache value. Initially, when the value // is brought into cache, C is set to be the cost related to the value (the cost is // always non-negative). When a replacement needs to be made, the value with the lowest C // cost is replaced, and then all values reduce their C costs by the minimum value of C // over all the values already in the cache. // If a value is accessed, its C value is restored to its initial cost. Thus, the C costs // of recently accessed values retain a larger portion of the original cost than those of // values that have not been accessed for a long time. The C costs are reduced as time // goes and are restored when accessed. ``` More info [here](https://www.usenix.org/legacy/publications/library/proceedings/usits97/full_papers/cao/cao_html/node8.html) Closes apache#10465 from jpedroantunes/feature/change-cache-policy and squashes the following commits: 02a7998 <João Pedro> Add todo for overflow handling for correctness a62f2d4 <João Pedro> Add overflow handler for cache algorithm 540ac76 <João Pedro> Remove unused constructor for ValueCacheObject 58c64a9 <João Pedro> Apply linter corrections 213b74a <João Pedro> Apply corrections on the greedydualsize cache definition 4ab1bf1 <João Pedro> Remove base cache header file not used anymore e66bff9 <João Pedro> Remove lru cache and change to use the greedy dual by default 6136f7c <João Pedro> Add string variations on cache tests 0d25678 <João Pedro> Correct linter errors e008f8c <João Pedro> Add identation to PriorityItem class e2c38a9 <João Pedro> Correct linter errors 08f1bd6 <João Pedro> Change cache implementation to handle special structure objects as values d9ef056 <João Pedro> Change cache main abstraction to consider the usage of a ValueCacheObject 8658b1e <João Pedro> Remove unused getCacheType function a45adb9 <João Pedro> Change BaseCache insert method to receive only key and value 9e46ce3 <João Pedro> Remove unused operator< implementation from cache keys dabea56 <João Pedro> Remove unused operator< implementation from cache keys 88af686 <João Pedro> Add base logic for gd-size algorithm implementation on the new cache 3b3746e <João Pedro> Rename the created cache files to consider the new approach using greedy-dual-size algorithm abed895 <João Pedro> Apply corrections and optimization on new cache classes a8b0bd8 <João Pedro> Change lvu cache for not using unnecessarily a pair as value 3c013c8 <João Pedro> Change cache logic to use unique_ptr 8871018 <João Pedro> Fix wrong use of u_long to use uint64 instead ddd90fc <João Pedro> Correct missing identation on gandiva cache files a04726a <João Pedro> Change for not using u_long and use u_int64 on cache 10a7016 <João Pedro> Fix lint problems found on new cache documents on CI builds f4874f6 <João Pedro> Fix lint problems on all added files bcec5fc <João Pedro> Add logic for calculating the llvm build time to be considered on cache logic 253c826 <João Pedro> Adapt lru cache test for using the new insertion method definition cd49c23 <João Pedro> Remove unused method definition from base cache class 0bee1a5 <João Pedro> Add inheritance definition as public on cache child classes 6763be6 <João Pedro> Add cache method for defining order parameter 24ec647 <João Pedro> Change cache class to handle with BaseClass pointer e31d922 <João Pedro> Change base cache methods to be virtual b8ff8b8 <João Pedro> Add base cache file as attemp to generalize caches c32f90e <João Pedro> Add operator< implementation necessary for filter and project cache keys for compatibility with the new cache 1ece64d <João Pedro> Add new cache unit test file to the project CMakeLists.txt 8ebf667 <João Pedro> Add unit test for the new cache implementation a08832a <João Pedro> Add implementation for cache based on a lower value policy Authored-by: João Pedro <joaop@simbioseventures.com> Signed-off-by: Praveen <praveen@dremio.com> (cherry picked from commit a628ee0)
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
This PR replaces the LRU based cache for gandiva with a new cache which takes into account the LLVM build time along with the LRU factor.
Here is a description of the suggested algorithm:
More info here