ARROW-12986: [C++][Gandiva] Implement new cache eviction policy algorithm in Gandiva by jpedroantunes · Pull Request #10465 · apache/arrow

jpedroantunes · 2021-06-07T12:52:04Z

This PR replaces Gandiva's LRU-based cache with a new cache that takes the LLVM build time into account along with recency of use.
Here is a description of the suggested algorithm:

// A cache based on the GreedyDual-Size algorithm, a generalization of LRU that
// assigns a cost to each cached value.
// The algorithm associates a cost, C, with each cached value. When a value is first
// brought into the cache, C is set to the cost of that value (the cost is always
// non-negative). When a replacement is needed, the value with the lowest C is
// evicted, and the C of every remaining value is reduced by that minimum C.
// When a value is accessed, its C is restored to its initial cost. Thus, recently
// accessed values retain a larger portion of their original cost than values that
// have not been accessed for a long time: C decreases over time and is restored
// on access.

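More info [here](https://www.usenix.org/legacy/publications/library/proceedings/usits97/full_papers/cao/cao_html/node8.html)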
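To make the eviction rule above concrete, here is a minimal sketch of a GreedyDual-Size style cache. It is not the code from this PR: the class name `GreedyDualSizeSketch`, its methods, and the `inflation_` member are hypothetical names chosen for illustration. Instead of subtracting the minimum C from every entry on each eviction, the sketch uses a common equivalent formulation that raises a shared baseline, so new and re-accessed entries get `baseline + cost`. It assumes keys are both hashable and ordered, and it does not handle overflow of the 64-bit priority.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <set>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical illustration of the GreedyDual-Size policy; not Gandiva's class.
template <typename Key, typename Value>
class GreedyDualSizeSketch {
 public:
  explicit GreedyDualSizeSketch(std::size_t capacity) : capacity_(capacity) {
    assert(capacity > 0);
  }

  // Inserts (or replaces) a value. `cost` could be, for example, a build time.
  void Insert(const Key& key, Value value, std::uint64_t cost) {
    auto it = entries_.find(key);
    if (it != entries_.end()) {
      queue_.erase(QueueItem(it->second.priority, key));
      entries_.erase(it);
    } else if (entries_.size() >= capacity_) {
      EvictLowest();
    }
    // Overflow of inflation_ + cost is ignored in this sketch.
    const std::uint64_t priority = inflation_ + cost;
    queue_.insert(QueueItem(priority, key));
    entries_.emplace(key, Entry{std::move(value), cost, priority});
  }

  // Returns a pointer to the cached value, or nullptr on a miss. A hit
  // restores the entry's full cost on top of the current baseline. The
  // pointer stays valid until that entry is evicted or replaced.
  const Value* Get(const Key& key) {
    auto it = entries_.find(key);
    if (it == entries_.end()) return nullptr;
    queue_.erase(QueueItem(it->second.priority, key));
    it->second.priority = inflation_ + it->second.cost;
    queue_.insert(QueueItem(it->second.priority, key));
    return &it->second.value;
  }

  bool Contains(const Key& key) const { return entries_.count(key) > 0; }
  std::size_t size() const { return entries_.size(); }

 private:
  struct Entry {
    Value value;
    std::uint64_t cost;      // original cost, restored on access
    std::uint64_t priority;  // baseline at last touch + cost
  };
  using QueueItem = std::pair<std::uint64_t, Key>;

  // Evicts the entry with the lowest priority and raises the baseline to
  // that priority, which ages every remaining entry by the same amount.
  void EvictLowest() {
    auto lowest = queue_.begin();
    inflation_ = lowest->first;
    entries_.erase(lowest->second);
    queue_.erase(lowest);
  }

  std::size_t capacity_;
  std::uint64_t inflation_ = 0;
  std::set<QueueItem> queue_;  // ordered by (priority, key)
  std::unordered_map<Key, Entry> entries_;
};
```

With a capacity of 2, inserting entries with costs 10, 100, and 50 in that order evicts the cost-10 entry when the third one arrives, while the expensive entry survives. An expensive entry that is never accessed again is still evicted eventually, because each eviction raises the baseline that newer entries start from.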

jpedroantunes · 2021-06-17T14:20:22Z

@augustoasilva can you review, please?

augustoasilva

This looks almost all good to me; please review the points I've marked and evaluate them.

cpp/src/gandiva/greedy_dual_size_cache_test.cc

cpp/src/gandiva/lru_cache_test.cc

cpp/src/gandiva/base_cache.h

cpp/src/gandiva/cache.h

cpp/src/gandiva/greedy_dual_size_cache.h

cpp/src/gandiva/projector.cc

cpp/src/gandiva/greedy_dual_size_cache.h

github-actions · 2021-07-12T12:08:23Z

https://issues.apache.org/jira/browse/ARROW-12986

cpp/src/gandiva/projector.cc

cpp/src/gandiva/greedy_dual_size_cache.h

…ithm in Gandiva

This PR replaces the LRU based cache for gandiva with a new cache which takes into account the LLVM build time along with the LRU factor.
Here is a description of the suggested algorithm:

```
// A particular cache based on the GreedyDual-Size cache which is a generalization of LRU
// which defines costs for each cache values.
// The algorithm associates a cost, C, with each cache value. Initially, when the value
// is brought into cache, C is set to be the cost related to the value (the cost is
// always non-negative). When a replacement needs to be made, the value with the lowest C
// cost is replaced, and then all values reduce their C costs by the minimum value of C
// over all the values already in the cache.
// If a value is accessed, its C value is restored to its initial cost. Thus, the C costs
// of recently accessed values retain a larger portion of the original cost than those of
// values that have not been accessed for a long time. The C costs are reduced as time
// goes and are restored when accessed.
```

More info [here](https://www.usenix.org/legacy/publications/library/proceedings/usits97/full_papers/cao/cao_html/node8.html)

Closes apache#10465 from jpedroantunes/feature/change-cache-policy and squashes the following commits:

02a7998 <João Pedro> Add todo for overflow handling for correctness
a62f2d4 <João Pedro> Add overflow handler for cache algorithm
540ac76 <João Pedro> Remove unused constructor for ValueCacheObject
58c64a9 <João Pedro> Apply linter corrections
213b74a <João Pedro> Apply corrections on the greedydualsize cache definition
4ab1bf1 <João Pedro> Remove base cache header file not used anymore
e66bff9 <João Pedro> Remove lru cache and change to use the greedy dual by default
6136f7c <João Pedro> Add string variations on cache tests
0d25678 <João Pedro> Correct linter errors
e008f8c <João Pedro> Add identation to PriorityItem class
e2c38a9 <João Pedro> Correct linter errors
08f1bd6 <João Pedro> Change cache implementation to handle special structure objects as values
d9ef056 <João Pedro> Change cache main abstraction to consider the usage of a ValueCacheObject
8658b1e <João Pedro> Remove unused getCacheType function
a45adb9 <João Pedro> Change BaseCache insert method to receive only key and value
9e46ce3 <João Pedro> Remove unused operator< implementation from cache keys
dabea56 <João Pedro> Remove unused operator< implementation from cache keys
88af686 <João Pedro> Add base logic for gd-size algorithm implementation on the new cache
3b3746e <João Pedro> Rename the created cache files to consider the new approach using greedy-dual-size algorithm
abed895 <João Pedro> Apply corrections and optimization on new cache classes
a8b0bd8 <João Pedro> Change lvu cache for not using unnecessarily a pair as value
3c013c8 <João Pedro> Change cache logic to use unique_ptr
8871018 <João Pedro> Fix wrong use of u_long to use uint64 instead
ddd90fc <João Pedro> Correct missing identation on gandiva cache files
a04726a <João Pedro> Change for not using u_long and use u_int64 on cache
10a7016 <João Pedro> Fix lint problems found on new cache documents on CI builds
f4874f6 <João Pedro> Fix lint problems on all added files
bcec5fc <João Pedro> Add logic for calculating the llvm build time to be considered on cache logic
253c826 <João Pedro> Adapt lru cache test for using the new insertion method definition
cd49c23 <João Pedro> Remove unused method definition from base cache class
0bee1a5 <João Pedro> Add inheritance definition as public on cache child classes
6763be6 <João Pedro> Add cache method for defining order parameter
24ec647 <João Pedro> Change cache class to handle with BaseClass pointer
e31d922 <João Pedro> Change base cache methods to be virtual
b8ff8b8 <João Pedro> Add base cache file as attemp to generalize caches
c32f90e <João Pedro> Add operator< implementation necessary for filter and project cache keys for compatibility with the new cache
1ece64d <João Pedro> Add new cache unit test file to the project CMakeLists.txt
8ebf667 <João Pedro> Add unit test for the new cache implementation
a08832a <João Pedro> Add implementation for cache based on a lower value policy

Authored-by: João Pedro <joaop@simbioseventures.com>
Signed-off-by: Praveen <praveen@dremio.com>

(cherry picked from commit a628ee0)

jpedroantunes changed the title ~~ARROW-12986: [C++][Gandiva] Implement new cache eviction policy in Gandiva~~ ARROW-12986: [C++][Gandiva] Implement new cache eviction policy algorithm in Gandiva Jun 7, 2021

jpedroantunes force-pushed the feature/change-cache-policy branch from e1bb1fc to d94618d Compare June 8, 2021 12:23

github-actions bot added Component: Gandiva Component: C++ labels Jun 8, 2021

jpedroantunes force-pushed the feature/change-cache-policy branch 2 times, most recently from fa13d02 to 92f5984 Compare June 15, 2021 11:32

augustoasilva suggested changes Jun 23, 2021

cpp/src/gandiva/greedy_dual_size_cache_test.cc (outdated, resolved)

cpp/src/gandiva/lru_cache_test.cc (outdated, resolved)

jpedroantunes force-pushed the feature/change-cache-policy branch from 92f5984 to e1911e7 Compare June 23, 2021 11:07

projjal reviewed Jun 25, 2021

cpp/src/gandiva/base_cache.h (outdated, resolved)

jpedroantunes force-pushed the feature/change-cache-policy branch 2 times, most recently from 35a3812 to 1083f91 Compare June 30, 2021 00:49

jpedroantunes force-pushed the feature/change-cache-policy branch from 1083f91 to 92708bc Compare July 6, 2021 22:58

projjal reviewed Jul 12, 2021

jpedroantunes added 15 commits July 12, 2021 09:08

Add implementation for cache based on a lower value policy

a08832a

Add unit test for the new cache implementation

8ebf667

Add new cache unit test file to the project CMakeLists.txt

1ece64d

Add operator< implementation necessary for filter and project cache keys for compatibility with the new cache

c32f90e

Add base cache file as attemp to generalize caches

b8ff8b8

Change base cache methods to be virtual

e31d922

Change cache class to handle with BaseClass pointer

24ec647

Add cache method for defining order parameter

6763be6

Add inheritance definition as public on cache child classes

0bee1a5

Remove unused method definition from base cache class

cd49c23

Adapt lru cache test for using the new insertion method definition

253c826

Add logic for calculating the llvm build time to be considered on cache logic

bcec5fc

Fix lint problems on all added files

f4874f6

Fix lint problems found on new cache documents on CI builds

10a7016

Change for not using u_long and use u_int64 on cache

a04726a

jpedroantunes added 16 commits July 12, 2021 09:08

Apply corrections and optimization on new cache classes

abed895

Rename the created cache files to consider the new approach using greedy-dual-size algorithm

3b3746e

Add base logic for gd-size algorithm implementation on the new cache

88af686

Remove unused operator< implementation from cache keys

dabea56

Remove unused operator< implementation from cache keys

9e46ce3

Change BaseCache insert method to receive only key and value

a45adb9

Remove unused getCacheType function

8658b1e

Change cache main abstraction to consider the usage of a ValueCacheObject

d9ef056

Change cache implementation to handle special structure objects as values

08f1bd6

Correct linter errors

e2c38a9

Add identation to PriorityItem class

e008f8c

Correct linter errors

0d25678

Add string variations on cache tests

6136f7c

Remove lru cache and change to use the greedy dual by default

e66bff9

Remove base cache header file not used anymore

4ab1bf1

Apply corrections on the greedydualsize cache definition

213b74a

jpedroantunes force-pushed the feature/change-cache-policy branch from 92708bc to 213b74a Compare July 12, 2021 12:09

Apply linter corrections

58c64a9

projjal reviewed Jul 12, 2021

cpp/src/gandiva/projector.cc (outdated, resolved)

cpp/src/gandiva/greedy_dual_size_cache.h (outdated, resolved)

cpp/src/gandiva/greedy_dual_size_cache.h (resolved)

jpedroantunes added 2 commits July 12, 2021 21:56

Remove unused constructor for ValueCacheObject

540ac76

Add overflow handler for cache algorithm

a62f2d4

projjal reviewed Jul 13, 2021

cpp/src/gandiva/greedy_dual_size_cache.h (outdated, resolved)

projjal reviewed Jul 13, 2021

cpp/src/gandiva/greedy_dual_size_cache.h (outdated, resolved)

Add todo for overflow handling for correctness

02a7998

projjal approved these changes Jul 13, 2021

praveenbingo closed this in a628ee0 Jul 15, 2021

anthonylouisbsb deleted the feature/change-cache-policy branch January 27, 2022 22:45

asfimport mentioned this pull request Jul 21, 2021

[C++][Gandiva] Implement new cache eviction policy in Gandiva #28704

Closed

niyue mentioned this pull request Feb 13, 2024

GH-40040: [C++][Gandiva] Make Gandiva's default cache size to be 5000 for object code cache #40041

Merged

jpedroantunes wants to merge 39 commits into apache:master from s1mbi0se:feature/change-cache-policy
