Skip to content

More optimizations to avoid calling Index.Lookup()#1549

Merged
fd0 merged 6 commits intorestic:masterfrom
MJDSys:more_index_lookup_avoids
Jan 24, 2018
Merged

More optimizations to avoid calling Index.Lookup()#1549
fd0 merged 6 commits intorestic:masterfrom
MJDSys:more_index_lookup_avoids

Conversation

@MJDSys
Copy link
Copy Markdown
Contributor

@MJDSys MJDSys commented Jan 12, 2018

What is the purpose of this change? What does it change?

Similar to #1538, avoid calling Index.Lookup in a couple more cases. This doesn't have the same performance benefit as #1538, but it does reduce cpu usage some on my desktop.

The first commit changes the way MasterIndex preforms lookups in its child indices, by first checking if the index has the given blob. On repositories with large number of indices (my two backups have 445 and 763 each), the creation of the error for each attempted lookup easily shows up in the cpu profile. I don't think this change has any practical threading related issues, even though the lock for the index is lost and regained between the Has and Lookup call.

The second commit changes the archiver to use Has over Lookup. The archiver was using Lookup as a Has function call, so this was a straightforward change. I don't have a test/benchmark for it, as the function is unexposed and I wasn't sure how you wanted to handle that case. Should I make a separate test file (archiver_internal_test.go) for it, or just use the Save function and benchmark that? If a test/benchmark is wanted, let me know which way you prefer and I can whip one up.

Was the change discussed in an issue or in the forum before?

No

Checklist

  • I have read the Contribution Guidelines
  • I have added tests for all changes in this PR
  • I have added documentation for the changes (in the manual)
  • There's a new file in a subdir of changelog/x.y.z that describe the changes for our users (template here)
  • I have run gofmt on the code in all commits
  • All commit messages are formatted in the same style as the other commits in the repo
  • I'm done, this Pull Request is ready for review

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change won't cause any issues, since blobs are only added to an index, so it cannot happen that Has() succeeds and subsequent Lookup() fails. But now you have the map lookup twice.

I think we can revert this change and do better: Since the only error Lookup() ever returns is not found, how about we change the function signature like like this:

func (idx *Index) Lookup(id restic.ID, tpe restic.BlobType) (blobs []restic.PackedBlob, found bool) {

That would mean touching every place where Lookup() is called, but shouldn't be too hard to do.

Now the lookup is done only once, and there's no error built in any case.

What do you think?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good to me, I'll take a look into doing that this weekend. Hopefully I'll have some good news/better benchmarks!

Copy link
Copy Markdown
Member

@fd0 fd0 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And I'd strongly suggest installing the fmt-check pre-commit hook from here: https://github.com/edsrzf/gofmt-git-hook

Helped me a lot :)

@MJDSys MJDSys force-pushed the more_index_lookup_avoids branch from 3e9c9a5 to 4baa151 Compare January 15, 2018 22:15
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

exported method ID.DebugStr should have comment or be unexported

@MJDSys MJDSys force-pushed the more_index_lookup_avoids branch 3 times, most recently from 65530c5 to 27bfdb4 Compare January 15, 2018 22:19
@MJDSys
Copy link
Copy Markdown
Contributor Author

MJDSys commented Jan 15, 2018

Hey @fd0,

I've implemented your suggestion. For the most part, the performance didn't change. However, I did notice with your suggestion that the debug logging doubled the time taken and dramatically increased the allocations, even when disabled. I tracked it down the the ID.Str() function, which allocates a string on call even if not used.

I made a second version that delays the calculation until used, for use with debug printing. I replaced the version used in Lookup, which helped the micro benchmark. I didn't see it show up in the backup profiling, so I'm not sure how much you want to take that change.

I originally tried replacing Str with DebugStr, but it broke various for loops in the frontend. If you wish, I can go back to that design and audit/test all those loops. But I'm not sure the performance advantage will beat out the risk of breakage (though it should be relatively visible). Alternately, I could just put those statements in a if statement, and have that be used only when the binary is compiled for debugging. Let me know what you prefer.

I also tried that script to check formatting, but it appears to not work well with my git habits. Thanks for pointing it out though!

@codecov-io
Copy link
Copy Markdown

codecov-io commented Jan 16, 2018

Codecov Report

Merging #1549 into master will decrease coverage by 5.25%.
The diff coverage is 48.07%.

Impacted file tree graph

@@            Coverage Diff            @@
##           master   #1549      +/-   ##
=========================================
- Coverage   51.55%   46.3%   -5.26%     
=========================================
  Files         138     138              
  Lines       11081   11085       +4     
=========================================
- Hits         5713    5133     -580     
- Misses       4503    5141     +638     
+ Partials      865     811      -54
Impacted Files Coverage Δ
cmd/restic/cmd_diff.go 2.15% <0%> (ø) ⬆️
internal/fuse/file.go 67.01% <0%> (-0.7%) ⬇️
cmd/restic/cmd_dump.go 6.38% <0%> (ø) ⬆️
cmd/restic/cmd_cat.go 1.86% <0%> (ø) ⬆️
internal/archiver/archiver.go 64.95% <100%> (+0.13%) ⬆️
internal/repository/repository.go 56.3% <27.27%> (ø) ⬆️
internal/restic/node.go 40.93% <33.33%> (ø) ⬆️
internal/restic/id.go 71.92% <50%> (-1.66%) ⬇️
internal/repository/index.go 67.82% <80%> (ø) ⬆️
internal/repository/master_index.go 45.86% <90.9%> (ø) ⬆️
... and 7 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 02f4f5d...27bfdb4. Read the comment docs.

@fd0
Copy link
Copy Markdown
Member

fd0 commented Jan 18, 2018

However, I did notice with your suggestion that the debug logging doubled the time taken and dramatically increased the allocations, even when disabled. I tracked it down the the ID.Str() function, which allocates a string on call even if not used.

Hm, is that behavior new with the changes you made? That shouldn't be the case in my opinion, it probably was already present before.

Is it also there when compiled without the debug tag? If it isn't I don't really care, as the debug logs will also cost performance in other places. If it is also present without the debug tag, then remove the debug messages and move them up the call stack to some higher function (or just remove them, we'll add them back if needed).

@MJDSys
Copy link
Copy Markdown
Contributor Author

MJDSys commented Jan 18, 2018

Ugh, my sleepy brain wasn't clear with what was happening. So the changes in this PR (with or without the Id.DebugStr change) are faster then the previous version. However, when I changed the Lookup function to return a boolean, that caused the benchmarks to slow down slightly (but still generally faster then the previous version). The difference was that Index.Lookup has logging, which slowed down the not found case compared to the Has function which has no logging. Also this is all without the debug tag, so the rest of the logging is disabled.

I'll work to remove those log calls, but insert a few statements in some useful cases up the call stack. I just want to avoid these calls in the usual case, as you still pay the price.

Just to be clear, the reason this is an issue is that when Str() is called, there is some cpu time/memory allocations to return the string immediately. This happens regardless of whether or not the logging framework is enabled or does anything. I think currently that overhead is small enough to generally ignore, but in a very hot path there is some measurable overhead. I brought it up here since it showed up in my profiling.

@fd0
Copy link
Copy Markdown
Member

fd0 commented Jan 19, 2018

However, when I changed the Lookup function to return a boolean, that caused the benchmarks to slow down slightly (but still generally faster then the previous version).

Thanks for the explanation. Can you show me the benchmarks?

I've thrown together a quick benchmark for debug.Log():

$ go test -v -bench . -benchmem
goos: linux
goarch: amd64
pkg: github.com/restic/restic/internal/debug
BenchmarkLogStatic-4     	2000000000	         1.77 ns/op	       0 B/op	       0 allocs/op
BenchmarkLogIDString-4   	300000000	         5.58 ns/op	       0 B/op	       0 allocs/op
BenchmarkLogIDStr-4      	100000000	        12.5 ns/op	       0 B/op	       0 allocs/op
PASS
ok  	github.com/restic/restic/internal/debug	7.242s

$ go test -v -tags debug -bench . -benchmem

debug enabled
goos: linux
goarch: amd64
pkg: github.com/restic/restic/internal/debug
BenchmarkLogStatic-4     	  200000	      8567 ns/op	     456 B/op	      20 allocs/op
BenchmarkLogIDString-4   	  200000	      8626 ns/op	     504 B/op	      21 allocs/op
BenchmarkLogIDStr-4      	  200000	      8644 ns/op	     488 B/op	      21 allocs/op
PASS
ok  	github.com/restic/restic/internal/debug	5.441s

We can see that the debug logging costs something, but there aren't any allocations when the debug tag isn't set. And since it's just a few nanoseconds (without debug) I wonder if it's worth optimizing further.

I've also tried changing debug.Log() to use reflection to call Str() if it is available (so it's really only executed when debug is active), then the cost of the debug.Log() call is about 6ns instead of 13ns for non-debug calls.

If you find good places up the call stack that's fine, otherwise just throw the calls to debug.Log() away for now.

@MJDSys MJDSys force-pushed the more_index_lookup_avoids branch from 27bfdb4 to d5b0fa6 Compare January 22, 2018 04:27
@MJDSys
Copy link
Copy Markdown
Contributor Author

MJDSys commented Jan 22, 2018

Thanks for the explanation. Can you show me the benchmarks?
I added a commit containing some benchmarks. The results I get are noted below, and match about what I expect considering the speeds I got with my commits dealing with the ID.Str. I'm not sure why you're benchmarks report no difference between the three, that seems really odd to me. I do see the allocations and time associated when calling with .Str, and see a huge difference when using the debug tag.

If you don't want these benchmarks to stay with the code, please feel free to just pop the commit off this branch and before merging.

$ go test -run xx -bench . -benchmem -v
goos: linux
goarch: amd64
pkg: github.com/restic/restic/internal/debug
BenchmarkLogStatic-6            1000000000               2.41 ns/op            0 B/op          0 allocs/op
BenchmarkLogIDStr-6             20000000                94.7 ns/op            16 B/op          2 allocs/op
BenchmarkLogIDString-6          10000000               219 ns/op             128 B/op          2 allocs/op
PASS
ok      github.com/restic/restic/internal/debug 7.089s
$ go test -run xx -bench . -benchmem -v -tags debug
debug enabled
goos: linux
goarch: amd64
pkg: github.com/restic/restic/internal/debug
BenchmarkLogStatic-6              100000             12826 ns/op             432 B/op         20 allocs/op
BenchmarkLogIDStr-6               100000             14069 ns/op             440 B/op         23 allocs/op
BenchmarkLogIDString-6            100000             14058 ns/op             552 B/op         23 allocs/op
PASS
ok      github.com/restic/restic/internal/debug 4.523s

I also took a quick look at places where .Lookup is called (just doing a quick manual check with grep). Any interesting calls seem to already have a Log call if needed, so I just removed the calls instead. I also removed the whole commit adding DebugStr from this PR.

@fd0
Copy link
Copy Markdown
Member

fd0 commented Jan 23, 2018

Nah, leave the benchmarks in. Can you please fix the code, there seems to be something wrong when run on Travis? Maybe you need to rebase on the latest master branch?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo: "Enhancement". That'll also fail on Travis when calens is run to generate the changelog file:

$ calens
file changelog/0.8.2/pull-1549: entry type "Enchancement" is invalid, valid types: map[Enhancement:4 Security:1 Bugfix:2 Change:3]

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, the title is too long, max is 80 chars. I'd suggest removing the second "speed up":

Enhancement: Speed up querying across indices and scanning existing files

@fd0
Copy link
Copy Markdown
Member

fd0 commented Jan 23, 2018

Ah, in master, the errors package was removed from several files, especially in fuse/, but this code reintroduces it, so the merged commit does not compile.

I suggest rebasing on the master branch, and then running goimports -w on the files.

When setting up the index used for benchmarking, use math/rand instead of
crypto/rand since the generated ids don't need to be evenly distributed,
and not be secure against guessing.  As such, use a different random id
function (only available during tests) that uses math/rand instead.
When looking up a blob in the master index, with several
indexes present in the master index, a significant amount of time
is spent generating errors for each failed lookup.  However, these
errors are often used to check if a blob is present, but the contents
are not inspected making the overhead of the error not useful.

Instead, change Index.Lookup (and Index.LookupSize) to instead return
a boolean denoting if the blob was found instead of an error.  Also change
all the calls to these functions to handle the new function signature.

benchmark                                            old ns/op     new ns/op     delta
BenchmarkMasterIndexLookupSingleIndex-6              820           897           +9.39%
BenchmarkMasterIndexLookupMultipleIndex-6            12821         2001          -84.39%
BenchmarkMasterIndexLookupSingleIndexUnknown-6       5378          492           -90.85%
BenchmarkMasterIndexLookupMultipleIndexUnknown-6     17026         1649          -90.31%

benchmark                                            old allocs     new allocs     delta
BenchmarkMasterIndexLookupSingleIndex-6              9              9              +0.00%
BenchmarkMasterIndexLookupMultipleIndex-6            59             19             -67.80%
BenchmarkMasterIndexLookupSingleIndexUnknown-6       22             6              -72.73%
BenchmarkMasterIndexLookupMultipleIndexUnknown-6     72             16             -77.78%

benchmark                                            old bytes     new bytes     delta
BenchmarkMasterIndexLookupSingleIndex-6              160           160           +0.00%
BenchmarkMasterIndexLookupMultipleIndex-6            3200          240           -92.50%
BenchmarkMasterIndexLookupSingleIndexUnknown-6       1232          48            -96.10%
BenchmarkMasterIndexLookupMultipleIndexUnknown-6     4272          128           -97.00%
…wnBlob

Index.Has() is a faster then Index.Lookup() for checking if a blob exists
in the index.  As the returned data is never used, this avoids a ton
of allocations.
The logging in these functions double the time they take to execute.
However, it is only really useful on failures, which are better
reported by the calling functions.

benchmark                                            old ns/op     new ns/op     delta
BenchmarkMasterIndexLookupSingleIndex-6              897           395           -55.96%
BenchmarkMasterIndexLookupMultipleIndex-6            2001          1090          -45.53%
BenchmarkMasterIndexLookupSingleIndexUnknown-6       492           215           -56.30%
BenchmarkMasterIndexLookupMultipleIndexUnknown-6     1649          912           -44.69%

benchmark                                            old allocs     new allocs     delta
BenchmarkMasterIndexLookupSingleIndex-6              9              1              -88.89%
BenchmarkMasterIndexLookupMultipleIndex-6            19             1              -94.74%
BenchmarkMasterIndexLookupSingleIndexUnknown-6       6              0              -100.00%
BenchmarkMasterIndexLookupMultipleIndexUnknown-6     16             0              -100.00%

benchmark                                            old bytes     new bytes     delta
BenchmarkMasterIndexLookupSingleIndex-6              160           96            -40.00%
BenchmarkMasterIndexLookupMultipleIndex-6            240           96            -60.00%
BenchmarkMasterIndexLookupSingleIndexUnknown-6       48            0             -100.00%
BenchmarkMasterIndexLookupMultipleIndexUnknown-6     128           0             -100.00%
Add some benchmarks for calling Log, both with a static string
along with calling the ID.Str and ID.String functions.
@MJDSys MJDSys force-pushed the more_index_lookup_avoids branch from d5b0fa6 to fe33c05 Compare January 24, 2018 03:31
@MJDSys
Copy link
Copy Markdown
Contributor Author

MJDSys commented Jan 24, 2018

Code rebased and changelog updated! Hopefully that's it :)

Out of curiosity, do you see benchmark results similar to mine with my benchmark code? I'm just curious why you don't see the difference in yours.

Copy link
Copy Markdown
Member

@fd0 fd0 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@fd0
Copy link
Copy Markdown
Member

fd0 commented Jan 24, 2018

FYI, Travis is currently broken and does not run our tests on Darwin, so it looks like the tests failed (but they did not).


https://github.com/restic/restic/issues/1234
https://github.com/restic/restic/pull/55555
https://forum.restic/.net/foo/bar/baz
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've corrected the changelog entry, somehow the default URLs were still in it.

@fd0 fd0 merged commit fe33c05 into restic:master Jan 24, 2018
fd0 added a commit that referenced this pull request Jan 24, 2018
More optimizations to avoid calling Index.Lookup()
@fd0
Copy link
Copy Markdown
Member

fd0 commented Jan 25, 2018

Ah, I forgot to run the benchmark again, sorry. I'll try to do that later today (also curious).

@fd0
Copy link
Copy Markdown
Member

fd0 commented Jan 25, 2018

Hm, I think I did something wrong in my benchmarks (I don't have them any more), I'm seeing the same results as you did.

@fd0 fd0 mentioned this pull request Jan 25, 2018
@fd0
Copy link
Copy Markdown
Member

fd0 commented Jan 25, 2018

I've opened #1582 which removes all manual calls of id.Str() in debug.Log() and calls it afterwards in Log() via reflection.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants