Rewriting HeaderString to use absl::variant. See #9593#9952
Rewriting HeaderString to use absl::variant. See #9593#9952mattklein123 merged 10 commits intoenvoyproxy:masterfrom
Conversation
Signed-off-by: Adi Suissa-Peleg <adip@google.com>
|
Thanks for working on this. Haven't reviewed yet, but just wanted to say that I'm not surprised there is some perf delta, though it would be worth it to understand where the delta is maybe by taking a look at a cachegrind run? IMO some perf delta is probably worth it given the code simplification but very happy to hear what others think on this. |
Signed-off-by: Adi Suissa-Peleg <adip@google.com>
|
I'll look into the detailed results of cachegrind and submit the fixed PR tomorrow morning. |
|
|
||
| void HeaderString::clear() { | ||
| switch (type_) { | ||
| char* HeaderString::buffer() { |
There was a problem hiding this comment.
I'm assuming this is for the safety of not modifying a const reference string, correct? We might just want to get rid of this non-const accessor entirely.
I think the only call-site of this method is in the H/1 codec where we access the buffer and lower-case it. Can we just append lower-cased data to the HeaderString instead of lower-casing the appended data AND GET RID OF THIS METHOD FOREVER?!
There was a problem hiding this comment.
IIUC it is also being used by HeaderMapImpl::addViaMove
envoy/source/common/http/header_map_impl.cc
Lines 394 to 395 in e176b30
I think the next step is to refactor header_map, so I'll remember this comment, and see if we can remove the clear method.
There was a problem hiding this comment.
+1 please kill it if possible.
There was a problem hiding this comment.
I've removed the buffer() method.
When thinking how to add only lowercase strings, I was faced with 2 options, as the given data is const (
envoy/source/common/http/http1/codec_impl.cc
Line 473 in e176b30
- Allocate a new buffer, copy
data, apply the lowercase function, and append the outcome to the HeaderString (=> will copy the data twice). - Append the original data (that may require lowercasing), and apply a "transform" function on the HeaderString (=> will copy the data once, and then lowercase in-place).
I've decided to use option 2, and introduce a new method calledinlineTransform, that calls std::transform on the data. It should use less memory.
Let me know if you think I should use option 1, or if you have a better idea.
There was a problem hiding this comment.
Ugh. Option (1) is pretty bad. It bothers me that there are at least 3 ways we tolower strings, including (1) LowerCaseString with ctype.h tolower, (2) the fast lookup ToLowerTable like here, and (3) StringUtil::toLower using absl::ascii_tolower (and also the absl::AsciiStrToLower string version)
Curious, is (3) fast? It seems to something similar to our lower table. CAN WE GET RID OF THE LOWER TABLE TOO AND JUST USE absl::ascii_tolower AS THE UNARY_OP?
There was a problem hiding this comment.
Welcome to leave as TODO since (2) gives the option of switching out the unary op anyway
There was a problem hiding this comment.
Good point. I'll change it to use absl::ascii_tolower, and get rid of the lower table.
… See envoyproxy#9593 Signed-off-by: Adi Suissa-Peleg <adip@google.com>
mattklein123
left a comment
There was a problem hiding this comment.
Awesome! Really great to see this happening. Love the red lines. Flushing out some comments from a first round and then I will take another pas. Thank you!
/wait
include/envoy/http/header_map.h
Outdated
| @@ -89,12 +90,12 @@ using LowerCaseStrPairVector = | |||
| * This is a string implementation for use in header processing. It is heavily optimized for | |||
| * performance. It supports 3 different types of storage and can switch between them: | |||
include/envoy/http/header_map.h
Outdated
| * @return the type of backing storage for the string. | ||
| */ | ||
| Type type() const { return type_; } | ||
| Type type() const { return Type(buffer_.index()); } |
There was a problem hiding this comment.
I thought you mentioned that we can get rid of this now?
There was a problem hiding this comment.
Sorry, I meant that we can get rid of the "Dynamic" type.
I'll look more closely if we can remove the type() method, as it seems to only be used at:
envoy/source/common/http/http2/codec_impl.cc
Lines 64 to 71 in d1db564
There was a problem hiding this comment.
Removed access to type() and the Type enum is now private.
Added the isReference method that is used by the codec.
| static_assert(sizeof(inline_buffer_) >= MaxIntegerLength, ""); | ||
| // Initialize as a Type::Inline | ||
| HeaderString::HeaderString() : buffer_(absl::InlinedVector<char, 128>()), string_length_(0) { | ||
| static_assert(sizeof(buffer_) >= MaxIntegerLength, ""); |
There was a problem hiding this comment.
Does this make sense anymore? Should you be checking the size of the inline storage somehow? Effectively the minimum size of the vector backing data?
| } | ||
| } | ||
| HeaderString::HeaderString(HeaderString&& move_value) noexcept | ||
| : buffer_(move_value.buffer_), string_length_(move_value.string_length_) { |
There was a problem hiding this comment.
I think this might copy. I would store the right hand side string length in a local variable and then do an explicit std::move?
There was a problem hiding this comment.
Also removed string_length_, as it is not really necessary anymore.
| #define BUFFER_STR_VIEW_GET absl::get<absl::string_view>(buffer_) | ||
| #define BUFFER_IN_VEC_GET absl::get<absl::InlinedVector<char, 128>>(buffer_) |
There was a problem hiding this comment.
nit: prefer functions over macros.
| uint64_t new_capacity = static_cast<uint64_t>(size) + string_length_; | ||
| if (new_capacity < MinDynamicCapacity) { | ||
| new_capacity = MinDynamicCapacity; | ||
| } | ||
| validateCapacity(new_capacity); |
There was a problem hiding this comment.
I don't think any of this should be needed anymore?
There was a problem hiding this comment.
Kept validateCapacity to avoid OOM when appending.
Let me know if you think it is unnecessary.
|
|
||
| void HeaderString::clear() { | ||
| switch (type_) { | ||
| char* HeaderString::buffer() { |
There was a problem hiding this comment.
+1 please kill it if possible.
| buffer_.dynamic_ = inline_buffer_; | ||
| const uint32_t max_buffer_length = 32; | ||
| char inner_buffer[max_buffer_length]; | ||
| string_length_ = StringUtil::itoa(inner_buffer, max_buffer_length, value); |
There was a problem hiding this comment.
You should be able to itoa directly into the buffer after reserving it and avoid the copy?
There was a problem hiding this comment.
I've attempted the following:
BUFFER_IN_VEC_GET.resize(MaxIntegerLength);
const uint32_t int_length = StringUtil::itoa(BUFFER_IN_VEC_GET.data(), MaxIntegerLength, value);
BUFFER_IN_VEC_GET.resize(int_length);
but unfortunately it has ~100ns perf penalty (the first resize also initializes the 32 cells, and the second one "destroys" the unused ones).
If you think of a better way, let me know and I'll implement it and test it out.
… header_map_simplify Signed-off-by: Adi Suissa-Peleg <adip@google.com>
… See envoyproxy#9593 Cleaning up unneeded members, hiding Type enum, removing type() and data() methods, and introducing the inlineTransform method. Signed-off-by: Adi Suissa-Peleg <adip@google.com>
… See envoyproxy#9593 Cleaning up unneeded members, hiding Type enum, removing type() and data() methods, and introducing the inlineTransform method. Signed-off-by: Adi Suissa-Peleg <adip@google.com>
|
After removing some unnecessary things, fixing some small issues, and running the performance tests (test/common/http:header_map_impl_speed_test), we've got the following results (new version called "w/ absl::variant 2"): Seems that most of the functions are a bit faster than the original implementation. |
|
Sorry do you mind merging master and I will take another pass? /wait |
Signed-off-by: Adi Suissa-Peleg <adip@google.com>
Signed-off-by: Adi Suissa-Peleg <adip@google.com>
|
Let me know when you are done addressing @asraa comments and I will take a look. Thank you! /wait |
|
@mattklein123 I think it would be better to remove to_lower_table and call ascii_tolower instead, in a different PR, as it has implications on other parts of the code. |
|
Sounds good I will review. |
include/envoy/http/header_map.h
Outdated
| /** | ||
| * @return the type of backing storage for the string. | ||
| */ | ||
| Type type() const { return Type(buffer_.index()); } |
There was a problem hiding this comment.
Can you add a comment above near Type/buffer_ that this function relies on the order of the enum matching the variant order? Potentially fragile? Potentially add some assert around this? Up to you on the assert.
There was a problem hiding this comment.
I like the idea of the assert, should make it more robust. Will add it.
|
|
||
| } // namespace | ||
| absl::string_view | ||
| get_str_view(const absl::variant<absl::string_view, absl::InlinedVector<char, 128>>& buffer) { |
There was a problem hiding this comment.
nit: maybe add a using declaration to alias absl::variant<absl::string_view, absl::InlinedVector<char, 128>>? You could also sub alias absl::InlinedVector<char, 128>
There was a problem hiding this comment.
I used a macro before. Do you suggest a typedef or something else?
There was a problem hiding this comment.
using InlineHeaderVector = absl::InlinedVector<char, 128>;
using VariantHeader = absl::variant<absl::string_view, InlineHeaderVector>;
Etc.
There was a problem hiding this comment.
Will do. Thanks for the guidance.
| ASSERT(valid()); | ||
| } | ||
|
|
||
| HeaderString::~HeaderString() {} |
| buffer_.dynamic_ = buf; | ||
| // Rather than be too clever and optimize this uncommon case, we switch to | ||
| // Inline mode and copy. | ||
| absl::string_view prev = get_str_view(buffer_); |
There was a problem hiding this comment.
nit: const, elsewhere where applicable if possible.
| // Initialize the size to the max length, copy the actual data, and then | ||
| // reduce the size (but not the capacity) as needed |
There was a problem hiding this comment.
Can you add a small note here about your perf experiment so that someone doesn't come along and ask the same question as I had? It's not intuitive to me that this is after versus an inline write, but I certainly believe your results.
|
|
||
| if (!current_header_field_.empty()) { | ||
| toLowerTable().toLowerCase(current_header_field_.buffer(), current_header_field_.size()); | ||
| current_header_field_.inlineTransform([](char c) { return static_cast<uint8_t>(tolower(c)); }); |
There was a problem hiding this comment.
Does this change have a perf implication? Should we do the absl thing here now?
There was a problem hiding this comment.
Changed to abseil's tolower version (ascii_tolower) now.
Compared performance of absl::to_lower to toLowerCase (not going to commit the test, just adding the results, to save these datapoints). absl is a bit faster, probably due to optimizations, so we should use it.
ToLowerTableEnvoyLowerCase/0 2.11 ns 2.11 ns 280473610
ToLowerTableEnvoyLowerCase/1 4.17 ns 4.17 ns 122310238
ToLowerTableEnvoyLowerCase/10 8.98 ns 8.98 ns 78490730
ToLowerTableEnvoyLowerCase/50 26.3 ns 26.3 ns 26710790
ToLowerTableEnvoyLowerCase/200 81.3 ns 81.3 ns 8631624
ToLowerTableAbseilLowerCase/0 1.91 ns 1.91 ns 365400798
ToLowerTableAbseilLowerCase/1 3.68 ns 3.68 ns 190285175
ToLowerTableAbseilLowerCase/10 8.94 ns 8.94 ns 77216159
ToLowerTableAbseilLowerCase/50 24.1 ns 24.1 ns 28907364
ToLowerTableAbseilLowerCase/200 72.6 ns 72.6 ns 9664473
…stead of tolower Signed-off-by: Adi Suissa-Peleg <adip@google.com>
Signed-off-by: Adi Suissa-Peleg <adip@google.com>
|
Fuzz approved, thanks for adding coverage for both inline/reference header fields and |
mattklein123
left a comment
There was a problem hiding this comment.
Awesome work!!! Love the simplification here.
|
Looks like a side effect of this PR is the building on RHEL 7 now breaks. Error is below building on RHEL 7 with gcc 8 and |
|
Thanks.
Interesting, it seems that the field might not be initialized in these
settings.
I'll try to see if the error can be reproduced or if the member can be
explicitly initialized.
…On Tue, Feb 18, 2020, 04:22 moderation ***@***.***> wrote:
Looks like a side effect of this PR is the building on RHEL 7 now breaks.
Error is below building on RHEL 7 with gcc 8 and bazel build -c opt
--copt=-DENVOY_IGNORE_GLIBCXX_USE_CXX11_ABI_ERROR=1
//source/exe:envoy-static.stripped. Any workarounds possible? /cc @asraa
<https://github.com/asraa>
ERROR: /home/<USERNAME>/Library/envoyproxy/envoy/source/common/http/BUILD:256:1: C++ compilation of rule '//source/common/http:header_map_lib' failed (Exit 1) gcc failed: error executing command /opt/rh/devtoolset-8/root/usr/bin/gcc -U_FORTIFY_SOURCE -fstack-protector -Wall -Wunused-but-set-parameter -Wno-free-nonheap-object -fno-omit-frame-pointer -g0 -O2 '-D_FORTIFY_SOURCE=1' -DNDEBUG ... (remaining 86 argument(s) skipped)
Use --sandbox_debug to see verbose messages from the sandbox
In file included from external/com_google_absl/absl/container/inlined_vector.h:53,
from bazel-out/k8-opt/bin/include/envoy/http/_virtual_includes/header_map_interface/envoy/http/header_map.h:18,
from bazel-out/k8-opt/bin/source/common/http/_virtual_includes/header_map_lib/common/http/header_map_impl.h:9,
from source/common/http/header_map_impl.cc:1:
external/com_google_absl/absl/container/internal/inlined_vector.h: In constructor 'Envoy::Http::HeaderString::HeaderString()':
external/com_google_absl/absl/container/internal/inlined_vector.h:437:5: error: '<anonymous>.absl::inlined_vector_internal::Storage<char, 128, std::allocator<char> >::data_' is used uninitialized in this function [-Werror=uninitialized]
data_ = other_storage.data_;
^~~~~
external/com_google_absl/absl/container/internal/inlined_vector.h: In member function 'void Envoy::Http::HeaderString::setInteger(uint64_t)':
external/com_google_absl/absl/container/internal/inlined_vector.h:437:5: error: '<anonymous>.absl::inlined_vector_internal::Storage<char, 128, std::allocator<char> >::data_' may be used uninitialized in this function [-Werror=maybe-uninitialized]
data_ = other_storage.data_;
^~~~~
external/com_google_absl/absl/container/internal/inlined_vector.h:437:5: error: '<anonymous>.absl::inlined_vector_internal::Storage<char, 128, std::allocator<char> >::data_' may be used uninitialized in this function [-Werror=maybe-uninitialized]
data_ = other_storage.data_;
^~~~~
external/com_google_absl/absl/container/internal/inlined_vector.h:437:5: error: '<anonymous>.absl::inlined_vector_internal::Storage<char, 128, std::allocator<char> >::data_' may be used uninitialized in this function [-Werror=maybe-uninitialized]
data_ = other_storage.data_;
^~~~~
external/com_google_absl/absl/container/internal/inlined_vector.h: In member function 'void Envoy::Http::HeaderString::setCopy(const char*, uint32_t)':
external/com_google_absl/absl/container/internal/inlined_vector.h:437:5: error: '<anonymous>.absl::inlined_vector_internal::Storage<char, 128, std::allocator<char> >::data_' may be used uninitialized in this function [-Werror=maybe-uninitialized]
data_ = other_storage.data_;
^~~~~
external/com_google_absl/absl/container/internal/inlined_vector.h:437:5: error: '<anonymous>.absl::inlined_vector_internal::Storage<char, 128, std::allocator<char> >::data_' may be used uninitialized in this function [-Werror=maybe-uninitialized]
data_ = other_storage.data_;
^~~~~
external/com_google_absl/absl/container/internal/inlined_vector.h: In member function 'void Envoy::Http::HeaderString::append(const char*, uint32_t)':
external/com_google_absl/absl/container/internal/inlined_vector.h:437:5: error: '<anonymous>.absl::inlined_vector_internal::Storage<char, 128, std::allocator<char> >::data_' may be used uninitialized in this function [-Werror=maybe-uninitialized]
data_ = other_storage.data_;
^~~~~
cc1plus: all warnings being treated as errors
Target //source/exe:envoy-static.stripped failed to build
—
You are receiving this because you authored the thread.
Reply to this email directly, view it on GitHub
<#9952?email_source=notifications&email_token=ABEQZZZCRXCTQN5NPXCC3V3RDOSFHA5CNFSM4KRETFU2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEMBGL6Y#issuecomment-587359739>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/ABEQZZYVSMI4GTUSR5O4NWDRDOSFHANCNFSM4KRETFUQ>
.
|
Do you get the same error running |
|
@moderation I've attempted to compile it with GCC, but couldn't reproduce the problem. |
|
@adisuissa I think the issue is specific to RHEL 7 (and possibly Centos) due to the ancient glibc. I suspect using GCC 8 on something like Ubuntu will trigger the issue. @asraa can you give me a hint on how to run that specific test? What is the Bazel label that I would use as outlined at https://github.com/envoyproxy/envoy/tree/master/bazel#testing-envoy-with-bazel ? |
|
It might be hackish, but I cd'd in to my bazel cache, in to the |
|
That worked @asraa ! I confirmed the test passed on my Xubuntu 19.10 with Clang 9. Output from work RHEL 7 is below: |
|
Oh jeez. I was hoping to reproduce the same error frm the test, but it looks like building the absl test fails in itself. Maybe @adisuissa can write a simple inlinedvector test (no headermap stuff, just replicating the initialization and usage of variant and inlined vector in headermap) for you to add to an Envoy test and run in your environment? |
|
Big thanks to @asraa and @adisuissa for helping out on this one. I was able to install GCC 9.1.1 in my Red Hat 7 environment and now I'm able to compile without errors using |
|
FWIW: I am still getting this error in debian-bullseye (gcc 9.x), I have to use the following options:
and in debian-buster (gcc 8.x):
|


Signed-off-by: Adi Suissa-Peleg adip@google.com
Description: Rewriting HeaderString to use absl::variant instead of union. Removed Type::Dynamic, updated the implementation, and tests.
Risk Level: Low/Medium - being used by the header_map, et al.
Testing: Kept and updated most unit tests as before. Removed one unit test that accessed a region of memory that was out of scope.
Docs Changes:
Release Notes:
Fixes #9593