feat: add --cache_mb and --cache_percentage flag#6271
Conversation
jarifibrahim
left a comment
There was a problem hiding this comment.
Got some comments.
Reviewable status: 0 of 11 files reviewed, 12 unresolved discussions (waiting on @manishrjain, @martinmr, @NamanJain8, and @vvbalaji-dgraph)
go.mod, line 9 at r1 (raw file):
contrib.go.opencensus.io/exporter/prometheus v0.1.0 github.com/99designs/gqlgen v0.12.2 github.com/AndreasBriese/bbloom v0.0.0-20190825152654-46b345b51c96 // indirect
Run go mod tidy and it should remove this.
dgraph/cmd/alpha/run.go, line 209 at r1 (raw file):
// Add cache flags flag.Int64("cache_mb", 0, "Total size of cache (in MB) to be used in dgraph")
to be used in alpha
dgraph/cmd/alpha/run.go, line 210 at r1 (raw file):
// Add cache flags flag.Int64("cache_mb", 0, "Total size of cache (in MB) to be used in dgraph") flag.String("cache-percentage", "30:30:20:20", "Cache percentages for various caches (pblocksize:pindexsize:windexsize:postingListsize)")
100 characters.
cache-percentage => cache_percentage.
Add a detailed explanation about the ratios. pblocksize would make no sense to the user.
dgraph/cmd/alpha/run.go, line 592 at r1 (raw file):
x.AssertTruef(totalCache >= 0, "ERROR: Cache size must be non-negative") if len(cp) != 4 { log.Fatalf("ERROR: cache percentage format is pblocksize:pindexsize:windexsize:postingListsize")
Here as well. Instead of calling it pblockSize let's call it PstoreBlockCache and PstoreIndexCache , etc. We want user to be able to understand what these flags actually mean.
dgraph/cmd/alpha/run.go, line 610 at r1 (raw file):
if percentSum != 100 { log.Fatalf("ERROR: cache percentage does not sum up to 100")
Add the percentages here. w+x+y+z is not equal to 100.
dgraph/cmd/alpha/run.go, line 615 at r1 (raw file):
pIndexSize := (cachePercent[1] * (totalCache << 20)) / 100 wIndexSize := (cachePercent[2] * (totalCache << 20)) / 100 postingListSize := (cachePercent[3] * (totalCache << 20)) / 100
All these are cache sizes. Let's call them cache sizes.
dgraph/cmd/debug/run.go, line 787 at r1 (raw file):
} x.Check(err) posting.Init(db, 0)
Add a comment // Not using posting list cache
dgraph/cmd/zero/run.go, line 112 at r1 (raw file):
// Add cache flags flag.Int64("cache_mb", 0, "Total size of cache (in MB) to be used in dgraph")
to be used in zero
dgraph/cmd/zero/run.go, line 113 at r1 (raw file):
// Add cache flags flag.Int64("cache_mb", 0, "Total size of cache (in MB) to be used in dgraph") flag.String("cache-percentage", "75:25", "Cache percentages for various caches (blockcache:indexCache)")
cache_percentage
Zero doesn't have an index cache right now. We should set this to 100:0right now. We don't need caches always.
dgraph/cmd/zero/run.go, line 230 at r1 (raw file):
} cp := strings.Split(opts.cachePercentage, ":")
You could move the code below to x.go file (or if we have a util file in x). The code is duplicated.
dgraph/cmd/zero/run.go, line 242 at r1 (raw file):
x, err := strconv.Atoi(percent) if err != nil { log.Fatalf("ERROR: ERROR: unable to parse cache percentages")
Print the percent here.
dgraph/cmd/zero/run.go, line 245 at r1 (raw file):
} if x < 0 { log.Fatalf("ERROR: cache percentage cannot be negative")
Print the percent here as well.
manishrjain
left a comment
There was a problem hiding this comment.
Got a few comments. Get @jarifibrahim for final look.
Reviewed 2 of 11 files at r1, 10 of 10 files at r2.
Reviewable status: all files reviewed, 17 unresolved discussions (waiting on @jarifibrahim, @martinmr, @NamanJain8, and @vvbalaji-dgraph)
dgraph/cmd/alpha/run.go, line 210 at r2 (raw file):
flag.Int64("cache_mb", 0, "Total size of cache (in MB) to be used in alpha") flag.String("cache_percentage", "30:30:20:20", `Cache percentages for various caches (PstoreBlockCache:PstoreIndexCache:PstoreIndexCache:PostingListCache)`)
Order it in the biggest to smallest expected caches.
posting list cache : p store block cache : p store index cache : w store block cache : w store index cache
Use commas.
0,65,25,0,10
dgraph/cmd/zero/run.go, line 113 at r1 (raw file):
Previously, jarifibrahim (Ibrahim Jarif) wrote…
cache_percentage
Zero doesn't have an index cache right now. We should set this to
100:0right now. We don't need caches always.
100 chars. Use commas.
x/x.go, line 1106 at r2 (raw file):
// GetCachePercentages returns the slice of cache percentages given the ":" (colon) separated // cache percentages(integers) string and format of cpString. func GetCachePercentages(cpString string, format string) []int64 {
numExpected int
don't need format.
x/x.go, line 1107 at r2 (raw file):
// cache percentages(integers) string and format of cpString. func GetCachePercentages(cpString string, format string) []int64 { cp := strings.Split(cpString, ":")
comma.
x/x.go, line 1109 at r2 (raw file):
cp := strings.Split(cpString, ":") // Sanity checks if len(cp) != len(strings.Split(format, ":")) {
numExpected int
x/x.go, line 1118 at r2 (raw file):
x, err := strconv.Atoi(percent) if err != nil { log.Fatalf("ERROR: ERROR: unable to parse cache percentage(%s)", percent)
don't need to do fatals here. Return the error and let the caller deal with this.
NamanJain8
left a comment
There was a problem hiding this comment.
Reviewable status: 7 of 12 files reviewed, 17 unresolved discussions (waiting on @jarifibrahim, @manishrjain, @martinmr, and @vvbalaji-dgraph)
go.mod, line 9 at r1 (raw file):
Previously, jarifibrahim (Ibrahim Jarif) wrote…
Run
go mod tidyand it should remove this.
Done.
dgraph/cmd/alpha/run.go, line 209 at r1 (raw file):
Previously, jarifibrahim (Ibrahim Jarif) wrote…
to be used in alpha
Done.
dgraph/cmd/alpha/run.go, line 210 at r1 (raw file):
Previously, jarifibrahim (Ibrahim Jarif) wrote…
100 characters.
cache-percentage => cache_percentage.
Add a detailed explanation about the ratios.
pblocksizewould make no sense to the user.
Done.
dgraph/cmd/alpha/run.go, line 592 at r1 (raw file):
Previously, jarifibrahim (Ibrahim Jarif) wrote…
Here as well. Instead of calling it
pblockSizelet's call itPstoreBlockCacheandPstoreIndexCache, etc. We want user to be able to understand what these flags actually mean.
Done.
dgraph/cmd/alpha/run.go, line 610 at r1 (raw file):
Previously, jarifibrahim (Ibrahim Jarif) wrote…
Add the percentages here. w+x+y+z is not equal to 100.
Done.
dgraph/cmd/alpha/run.go, line 615 at r1 (raw file):
Previously, jarifibrahim (Ibrahim Jarif) wrote…
All these are cache sizes. Let's call them cache sizes.
Done.
dgraph/cmd/alpha/run.go, line 210 at r2 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
Order it in the biggest to smallest expected caches.
posting list cache : p store block cache : p store index cache : w store block cache : w store index cache
Use commas.
0,65,25,0,10
Done.
dgraph/cmd/debug/run.go, line 787 at r1 (raw file):
Previously, jarifibrahim (Ibrahim Jarif) wrote…
Add a comment
// Not using posting list cache
Done.
dgraph/cmd/zero/run.go, line 112 at r1 (raw file):
Previously, jarifibrahim (Ibrahim Jarif) wrote…
to be used in zero
Done.
dgraph/cmd/zero/run.go, line 113 at r1 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
100 chars. Use commas.
Done.
dgraph/cmd/zero/run.go, line 230 at r1 (raw file):
Previously, jarifibrahim (Ibrahim Jarif) wrote…
You could move the code below to
x.gofile (or if we have a util file inx). The code is duplicated.
Done.
dgraph/cmd/zero/run.go, line 242 at r1 (raw file):
Previously, jarifibrahim (Ibrahim Jarif) wrote…
Print the
percenthere.
Done.
dgraph/cmd/zero/run.go, line 245 at r1 (raw file):
Previously, jarifibrahim (Ibrahim Jarif) wrote…
Print the
percenthere as well.
Done.
x/x.go, line 1106 at r2 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
numExpected int
don't need format.
Done.
x/x.go, line 1107 at r2 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
comma.
Done.
x/x.go, line 1109 at r2 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
numExpected int
Done.
x/x.go, line 1118 at r2 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
don't need to do fatals here. Return the error and let the caller deal with this.
Done.
jarifibrahim
left a comment
There was a problem hiding this comment.
Reviewable status: 4 of 12 files reviewed, 10 unresolved discussions (waiting on @jarifibrahim, @manishrjain, @martinmr, @NamanJain8, and @vvbalaji-dgraph)
dgraph/cmd/bulk/reduce.go, line 133 at r3 (raw file):
opt := badger.DefaultOptions(dir).WithSyncWrites(false). WithTableLoadingMode(bo.MemoryMap).WithValueThreshold(1 << 10 /* 1 KB */). WithLogger(nil).WithBlockCacheSize(1 << 20).
This isn't related to this PR but this cache is too small.
worker/config.go, line 68 at r3 (raw file):
// PIndexCacheSize is the size of index cache for pstore PIndexCacheSize int64 // PBlockCacheSize is the size of block cache for wstore
wblockcachesize
worker/config.go, line 70 at r3 (raw file):
// PBlockCacheSize is the size of block cache for wstore WBlockCacheSize int64 // PIndexCacheSize is the size of index cache for wstore
windexcachesize
x/x.go, line 1109 at r3 (raw file):
// Sanity checks if len(cp) != numExpected { return nil, errors.Errorf("ERROR: expected %d cache percentages, got %d", numExpected, len(cp))
100 chars
NamanJain8
left a comment
There was a problem hiding this comment.
Reviewable status: 4 of 12 files reviewed, 10 unresolved discussions (waiting on @jarifibrahim, @manishrjain, @martinmr, @NamanJain8, and @vvbalaji-dgraph)
worker/config.go, line 68 at r3 (raw file):
Previously, jarifibrahim (Ibrahim Jarif) wrote…
wblockcachesize
Done.
worker/config.go, line 70 at r3 (raw file):
Previously, jarifibrahim (Ibrahim Jarif) wrote…
windexcachesize
Done.
x/x.go, line 1109 at r3 (raw file):
Previously, jarifibrahim (Ibrahim Jarif) wrote…
100 chars
Done.
parasssh
left a comment
There was a problem hiding this comment.
minor comments. also address comments we discussed on zoom.
LGTM otherwise.
| opt := badger.DefaultOptions(dir).WithSyncWrites(false). | ||
| WithTableLoadingMode(bo.MemoryMap).WithValueThreshold(1 << 10 /* 1 KB */). | ||
| WithLogger(nil).WithMaxCacheSize(1 << 20). | ||
| WithLogger(nil).WithBlockCacheSize(1 << 20). |
There was a problem hiding this comment.
shouldn't we use the flag configured instead of 1 << 20 ?
There was a problem hiding this comment.
I think we should. That would require adding a flag in dgraph bulk. @jarifibrahim What do you say?
There was a problem hiding this comment.
Bulk loader shouldn't even have a cache. I'll create a ticket for this. This Pr doesn't need to block on this.
There was a problem hiding this comment.
I see that those tmpDBs are being read. We need a cache here but it has to be configurable. I have created DGRAPH-2352
This PR closes DGRAPH-2344.
Badger now has 2 separate caches blockCache and indexCache (see dgraph-io/badger#1476).
This PR adds --cache_mb and --cache_percentage flags for alpha and zero. The
total cache is split among various caches used by zero and alpha based on
percentages defined. Cache size is in MBs and format of caches is as follows:
For alpha:
PostingListCache,PstoreBlockCache,PstoreIndexCache,WstoreBlockCache,WstoreIndexCache
For zero:
blockCache,indexCache
This change is