Skip to content

improvements in filtering image#1652

Merged
jhump merged 5 commits intomainfrom
jh/leaner-filtered-image
Dec 8, 2022
Merged

improvements in filtering image#1652
jhump merged 5 commits intomainfrom
jh/leaner-filtered-image

Conversation

@jhump
Copy link
Member

@jhump jhump commented Dec 7, 2022

This addresses a TODO about using the protosource model. The assumption in that TODO was that it was heavy-weight.

So this PR first adds a benchmark (see first commit), to evaluate if the change actually improves performance.

The second commit is the change that addresses the TODO (which did have an impact on performance). It adds more information to the imageIndex so that the descriptor protos can be used, instead of having to first wrap the whole hierarchy in protosource types.

The third commit is another change that IMO makes the code easier to read, but also improves performance a little further by keeping accumulator variables in a new transitiveClosure struct, instead of each recursive call having to repeatedly concatenate slices. This is also a pre-factor for addressing other TODOs, which will require tracking new state, which is much easier to add with this new approach, instead of having to change the return values and then update the accumulation logic at many recursive call sites. It also removes some redundant tracking: we were tracking the visited elements in both the returned descriptorAndDependents type and in the seen map. Now it's just a map in the transitiveClosure struct, and imports are tracked differently.

Benchmark results at the start (i.e. for the current implementation on main):

BenchmarkFilterImage-10    	   15546	     76453 ns/op	   83676 B/op	    2222 allocs/op

After the first commit (to just use descriptor protos instead of protosource):

BenchmarkFilterImage-10    	   21099	     55813 ns/op	   55538 B/op	     306 allocs/op

And after introducing the transitiveClosure accumulator type:

BenchmarkFilterImage-10    	   28183	     45259 ns/op	   50090 B/op	     187 allocs/op

So in the end, it's 1.7x as fast, uses 60% as much memory, with less than 1/10th as many allocations.

Copy link
Contributor

@robbertvanginkel robbertvanginkel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice, those perf numbers look great! I can definitely see how this will be easier to extend, reads better now!

Small comment on the TestMain, but other than that looks good.

fileName := file.Path()
fileDescriptorProto := file.Proto()
index.Files[fileName] = fileDescriptorProto
err := walk.DescriptorProtos(fileDescriptorProto, func(name protoreflect.FullName, msg proto.Message) error {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:nice:

@jhump jhump merged commit 1289d7d into main Dec 8, 2022
@jhump jhump deleted the jh/leaner-filtered-image branch December 8, 2022 02:08
Monirul1 pushed a commit to Monirul1/buf that referenced this pull request Apr 30, 2023
This adds a performance benchmark for the image filtering logic, to make
sure the changes actually improve performance.

Performance is improved in two key ways:
1. Shedding the `protosource` model wrappers and instead directly using
   descriptor protos.
2. Accumulating results in one place, instead of repeated allocation and then
   concatenation of slices. (This is also useful pre-factoring for adding more
   improvements in the future.)

The net result is that it's now 1.7x as fast, uses 60% as much memory, and
performs less than 1/10th as many allocations.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants