stats: faster versions of SymbolTableImpl::lessThan and StatName::startsWith by jmarantz · Pull Request #19563 · envoyproxy/envoy

jmarantz · 2022-01-16T17:07:25Z

Commit Message: Improvements in the admin panel (#19546 #18670) put greater pressure on sorting of stats via StatName, as opposed to saving elaborated strings for every stat and sorting by that.

This new implementation provides an iterator-like interface for decoding the tokens in StatName, enabling us to early-exit when comparing stat names with multiple tokens. For example, if you are comparing "a.b.c.d" against "a.x.y.z" we can abort out after the "b" vs "x" comparison, and there is no need to compare "c" to "y". Of course it's not as fast as comparing strings, but it saves having to hold the elaborated strings in memory.
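As a rough illustration of the early exit described above, here is a minimal sketch. It is not Envoy's code: `ToyTable`, `SymbolVec`, and the `decodes` counter are hypothetical stand-ins, and stat names are modeled as plain vectors of symbol ids rather than the real byte encoding.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical model: a stat name is a sequence of symbol ids, and the table
// maps each id back to its token text. This is an assumption for illustration.
using Symbol = uint32_t;
using SymbolVec = std::vector<Symbol>;

struct ToyTable {
  std::vector<std::string> tokens; // index is the symbol id; tokens are assumed unique

  // Compares two encoded names token by token and stops at the first token
  // that differs. `decodes` counts how many tokens were looked up as strings.
  bool lessThan(const SymbolVec& a, const SymbolVec& b, int& decodes) const {
    for (size_t i = 0; i < a.size() && i < b.size(); ++i) {
      if (a[i] == b[i]) {
        continue; // same symbol means same token, so no string lookup is needed
      }
      decodes += 2;
      return tokens[a[i]] < tokens[b[i]]; // early exit: later tokens are never examined
    }
    return a.size() < b.size(); // one name is a prefix of the other, or they are equal
  }
};
```

With the tokens `a b c d x y z` mapped to ids 0 through 6, comparing `{0,1,2,3}` ("a.b.c.d") against `{0,4,5,6}` ("a.x.y.z") looks up only "b" and "x" before returning.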

It also adds a new sortByStatNames free function which is more efficient than calling std::sort directly because it can take the symbol-table lock once for the whole sort, rather than re-taking the lock on each comparison. Taking uncontended locks is fast, but as the benchmark shows, it's not as fast as not taking locks.
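The locking difference can be sketched as follows. This is a simplified model under stated assumptions, not the actual `sortByStatNames` signature: `ToyTable`, `sortIds`, and `lessThanLockHeld` are hypothetical names, and names are reduced to single token ids.

```cpp
#include <algorithm>
#include <cstdint>
#include <mutex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical table that guards its token storage with a mutex.
class ToyTable {
public:
  explicit ToyTable(std::vector<std::string> tokens) : tokens_(std::move(tokens)) {}

  // Per-comparison locking: usable as a std::sort comparator, but every
  // single compare acquires and releases the mutex.
  bool lessThan(uint32_t a, uint32_t b) const {
    std::lock_guard<std::mutex> lock(mutex_);
    return lessThanLockHeld(a, b);
  }

  // Sort-wide locking: take the mutex once, then run the whole sort with a
  // comparator that assumes the lock is already held.
  void sortIds(std::vector<uint32_t>& ids) const {
    std::lock_guard<std::mutex> lock(mutex_);
    std::sort(ids.begin(), ids.end(),
              [this](uint32_t a, uint32_t b) { return lessThanLockHeld(a, b); });
  }

private:
  // Caller must hold mutex_.
  bool lessThanLockHeld(uint32_t a, uint32_t b) const { return tokens_[a] < tokens_[b]; }

  mutable std::mutex mutex_;
  std::vector<std::string> tokens_;
};
```

Both paths produce the same order; the second simply avoids re-taking the lock on each comparison, which is the effect the benchmark below measures.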

Additional Description:
A benchmark for this is added in this PR:
OLD:

------------------------------------------------------------
Benchmark                  Time             CPU   Iterations
------------------------------------------------------------
bmCompareElements        174 ns          174 ns      4028127
bmStdSort          318591243 ns    318515288 ns            2

NEW:

------------------------------------------------------------
Benchmark                  Time             CPU   Iterations
------------------------------------------------------------
bmCompareElements       55.9 ns         55.9 ns     12515696
bmSortByStatNames   61342554 ns     61329512 ns           11
bmStdSort           80578659 ns     80562977 ns            9

The CompareElements numbers show the raw speed improvement from the early exit and from no longer creating temp vectors; keep in mind that sorting 1M stats takes on the order of 20M compares.
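The 20M figure is consistent with the usual n·log2(n) rule of thumb for comparison sorts (1M × log2(1M) ≈ 20M), though the PR does not say how it was derived. The sketch below, which is not part of the PR, counts the comparisons `std::sort` makes on shuffled integers so the estimate can be checked empirically; `countSortCompares` and `estimateCompares` are hypothetical helper names.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <random>
#include <vector>

// Counts the comparisons std::sort performs on n shuffled integers.
uint64_t countSortCompares(size_t n) {
  std::vector<uint32_t> v(n);
  std::iota(v.begin(), v.end(), 0u);
  std::mt19937 rng(12345); // fixed seed so runs are repeatable
  std::shuffle(v.begin(), v.end(), rng);
  uint64_t compares = 0;
  std::sort(v.begin(), v.end(), [&compares](uint32_t a, uint32_t b) {
    ++compares;
    return a < b;
  });
  return compares;
}

// The n*log2(n) rule-of-thumb estimate for the number of comparisons.
double estimateCompares(size_t n) {
  return static_cast<double>(n) * std::log2(static_cast<double>(n));
}
```

The exact count depends on the standard library's sort implementation and on the input order, so the estimate should be read as an order of magnitude rather than a prediction.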

The sort numbers provide more context. In the old code, sorting 100k stats (from 1k clusters) takes 318ms (manually testing 1M stats takes around 2.8 seconds, but I didn't want CI to run such a long test each time). Doing the same sort using std::sort with the new comparison algorithm takes 81ms, and with the optimized sortByStatNames it takes 61ms.

The burst of main-thread activity due to an admin /stats request may impact long-tail data-plane latency on a heavily loaded system, which is why in #18670 we segment into scopes. In the meantime, this reduces the speed impact of sorting.

Risk Level: medium -- hopefully the existing unit tests covered all interesting corner cases
Testing: //test/common/stats/... plus the new benchmark
Docs Changes: n/a
Release Notes: n/a
Platform Specific Features: n/a

Signed-off-by: Joshua Marantz <jmarantz@google.com>

jmarantz · 2022-01-16T18:25:51Z

/retest

repokitteh-read-only · 2022-01-16T18:25:54Z

Retrying Azure Pipelines:
Check envoy-presubmit isn't fully completed, but will still attempt retrying.
Retried failed jobs in: envoy-presubmit


pradeepcrao

It will be interesting to see what the benchmarks say once the class hierarchy is removed, eliminating vtable lookups.


jmarantz

thanks for the review!


pradeepcrao · 2022-01-22T00:35:18Z

LGTM

snowp

Thanks, this looks reasonable to me; a few comments.

snowp · 2022-01-26T14:42:33Z

source/common/stats/symbol_table_impl.cc

+    TokenIter::TokenType prefix_type = prefix_iter.next();
+    TokenIter::TokenType this_type = this_iter.next();
+    if (prefix_type == TokenIter::TokenType::End) {
+      break; // "a.b.c" starts with "a.b" or "a.b.c"


Maybe just use return here? It's a bit confusing having both break and return control flows here imo

You raise a good point. It's a bit annoying though to have a while(true) loop inside a function that returns a value, without a break, because I don't know how to get both gcc and clang to compile without complaint.

I was trying to use gcc to make debugging with gdb better (though I failed), but other people might want to compile with gcc without warnings.

Now I can't reproduce the problem I thought this solved, so I went with all 'returns' in the loop, and an unreachable 'return true' outside it (with a comment).
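The shape described in this reply (every exit from the loop is a `return`, plus an unreachable `return true` after it) might look roughly like the sketch below. It is a simplified model, not the merged code: `ToyTokenIter` is a hypothetical iterator over plain strings with only two token types, whereas the real `TokenIter` decodes an encoded `StatName`.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, simplified token iterator over an already-split name.
class ToyTokenIter {
public:
  enum class TokenType { Token, End };

  explicit ToyTokenIter(const std::vector<std::string>& tokens) : tokens_(tokens) {}

  // Advances to the next token, or reports End when none remain.
  TokenType next() {
    if (pos_ >= tokens_.size()) {
      return TokenType::End;
    }
    current_ = &tokens_[pos_++];
    return TokenType::Token;
  }

  const std::string& token() const { return *current_; }

private:
  const std::vector<std::string>& tokens_;
  size_t pos_ = 0;
  const std::string* current_ = nullptr;
};

bool startsWith(const std::vector<std::string>& name, const std::vector<std::string>& prefix) {
  ToyTokenIter prefix_iter(prefix);
  ToyTokenIter this_iter(name);
  while (true) {
    ToyTokenIter::TokenType prefix_type = prefix_iter.next();
    ToyTokenIter::TokenType this_type = this_iter.next();
    if (prefix_type == ToyTokenIter::TokenType::End) {
      return true; // "a.b.c" starts with "a.b" or "a.b.c"
    }
    if (this_type == ToyTokenIter::TokenType::End) {
      return false; // "a.b" does not start with "a.b.c"
    }
    if (prefix_iter.token() != this_iter.token()) {
      return false; // first differing token decides the answer
    }
  }
  return true; // Not reached; kept in case a compiler expects a return after the loop.
}
```

Using only `return` inside the loop keeps a single control-flow style, which is what the review comment asked for.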


snowp

LGTM, thanks!

We should probably have another maintainer sign off on this, any suggestions for who?

jmarantz · 2022-01-28T15:48:58Z

Agreed - I think @ggreenway might have the most context - I think the Prometheus formatter he wrote might be the first beneficiary of this.

snowp · 2022-02-01T13:25:17Z

@ggreenway ping

ggreenway

Makes sense to me

Commit Message: Removes the pure interface for symbol tables, as there is now only one implementation. It was originally added to enable a fake implementation while the impl and its usage were being refined, but now the interface layer just gets in the way. For example, it's not possible to add a virtual template method to SymbolTable, which would have been useful in #19563.
Additional Description:
Risk Level: low
Testing: //test/...
Docs Changes: n/a
Release Notes: n/a
Platform Specific Features: n/a

Signed-off-by: Joshua Marantz <jmarantz@google.com>


jmarantz added 3 commits January 16, 2022 11:59

faster versions of SymbolTableImpl::lessThan and StatName::StartsWith

545de6c

Signed-off-by: Joshua Marantz <jmarantz@google.com>

remove #if'd out code

11fcb3a

Signed-off-by: Joshua Marantz <jmarantz@google.com>

tweak comments

6f8845c

Signed-off-by: Joshua Marantz <jmarantz@google.com>

jmarantz changed the title ~~stats: faster versions of SymbolTableImpl::lessThan and StatName::StartsWith~~ stats: faster versions of SymbolTableImpl::lessThan and StatName::startsWith Jan 16, 2022

jmarantz added 2 commits January 16, 2022 19:35

Merge branch 'main' into symtab-compare-speed

fd26046

Signed-off-by: Joshua Marantz <jmarantz@google.com>

Merge branch 'main' into symtab-compare-speed

316afa3

Signed-off-by: Joshua Marantz <jmarantz@google.com>

jmarantz assigned snowp Jan 17, 2022

jmarantz added 4 commits January 17, 2022 23:50

Add a sorting API to symbol table that avoids taking locks on each comparison.

8fffb23

Signed-off-by: Joshua Marantz <jmarantz@google.com>

cleanup comments

31abd0d

Signed-off-by: Joshua Marantz <jmarantz@google.com>

remove extraneous 'using' declaration and add some TODOs.

bfb9df8

Signed-off-by: Joshua Marantz <jmarantz@google.com>

adds benchmarks for sorting.

f7ed328

Signed-off-by: Joshua Marantz <jmarantz@google.com>

This was referenced Jan 18, 2022

admin: Richer HTML home page with forms for params #19546

Merged

stats: remove symbol table interface #19597

Merged

jmarantz added 3 commits January 18, 2022 19:10

Merge branch 'main' into symtab-compare-speed

5f30014

Signed-off-by: Joshua Marantz <jmarantz@google.com>

add string-sort for comparison.

e914bb0

Signed-off-by: Joshua Marantz <jmarantz@google.com>

tidy

9f50b44

Signed-off-by: Joshua Marantz <jmarantz@google.com>

jmarantz assigned pradeepcrao Jan 20, 2022

Merge branch 'main' into symtab-compare-speed

daae1b4

Signed-off-by: Joshua Marantz <jmarantz@google.com>

pradeepcrao reviewed Jan 21, 2022

jmarantz added 3 commits January 21, 2022 16:39

review comments

e92b534

Signed-off-by: Joshua Marantz <jmarantz@google.com>

add StatName ctor for TokenIter

64237e9

Signed-off-by: Joshua Marantz <jmarantz@google.com>

simplify TokenIter constructor and decode APIs by taking StatName rather than array+size

1352e97

Signed-off-by: Joshua Marantz <jmarantz@google.com>

jmarantz commented Jan 21, 2022

add debug-only type retention for assertions

db468ac

Signed-off-by: Joshua Marantz <jmarantz@google.com>

snowp suggested changes Jan 26, 2022

mattklein123 added the waiting label Jan 26, 2022

Merge branch 'main' into symtab-compare-speed

348b9f2

Signed-off-by: Joshua Marantz <jmarantz@google.com>

use return consistantly in while/true loop

fff52d3

Signed-off-by: Joshua Marantz <jmarantz@google.com>

repokitteh-read-only bot removed the waiting label Jan 26, 2022

snowp approved these changes Jan 28, 2022

jmarantz assigned ggreenway Jan 28, 2022

ggreenway approved these changes Feb 1, 2022

jmarantz merged commit aded9c3 into envoyproxy:main Feb 1, 2022

jmarantz deleted the symtab-compare-speed branch February 1, 2022 20:38
