Make the total number of stats in shared memory, and the maximum length#1817
Conversation
of a stat name, configurable at compile-time and run-time. fixes envoyproxy#1761 Signed-off-by: Greg Greenway <ggreenway@apple.com>
|
This may be out of the scope of this PR, but with this change, we can also vary the maximum cluster name length: envoy/source/common/json/config_schemas.cc Line 1352 in 8a09d86 |
|
I'm not opposed to fixing that as well; how much will it upset things to make the json schema non-constant? It might be best to make that a separate PR if it requires much work. But the biggest reason for making the name length configurable is to allow longer cluster names, so it's certainly not unrelated. |
Signed-off-by: Greg Greenway <ggreenway@apple.com>
|
@ggreenway +1 making the size of cluster name dynamic in code. However I would recommend we do it in a follow up. In actuality, we need to dynamically limit the size of all user supplied things (listener name, route table name, etc.) so I think it would be great to do that in an isolated follow up. |
mattklein123
left a comment
There was a problem hiding this comment.
This is great. Some random comments. Also you probably saw but the coverage test is failing.
|
|
||
| void RawStatData::configure(Server::Options& options) { | ||
| const size_t configured = options.maxStatNameLength(); | ||
| RELEASE_ASSERT(configured > 0); |
There was a problem hiding this comment.
Can we gracefully fail this in options parsing that this is maybe >= the current default? Otherwise I think usability will be very bad.
| RawStatData* HeapRawStatDataAllocator::alloc(const std::string& name) { | ||
| RawStatData* data = new RawStatData(); | ||
| memset(data, 0, sizeof(RawStatData)); | ||
| RawStatData* data = static_cast<RawStatData*>(::calloc(RawStatData::size(), 1)); |
There was a problem hiding this comment.
nit: any reason to not just use malloc?
There was a problem hiding this comment.
It needs to be zero-initialized. I've always used calloc for that, even when I don't need it to multiply by an element-count.
There was a problem hiding this comment.
Ah OK. I would just add comment about zero initialize.
source/exe/main.cc
Outdated
| #ifdef ENVOY_HOT_RESTART | ||
| // Enabled by default, except on OS X. Control with "bazel --define=hot_restart=disabled" | ||
| const std::string shared_mem_version = Envoy::Server::SharedMemory::version(); | ||
| const Envoy::OptionsImpl::HotRestartVersionCB hot_restart_version_cb = |
There was a problem hiding this comment.
Or just auto, it should be inferable from its use in the call below.
| } | ||
|
|
||
| sockaddr_un HotRestartImpl::createDomainSocketAddress(uint64_t id) { | ||
| // Right now we only allow a maximum of 3 concurrent envoy processes to be running. When the third |
There was a problem hiding this comment.
any reason to reformat these comments?
There was a problem hiding this comment.
I think clang-format hit this; not sure why it got reformatted. I'll revert.
source/server/hot_restart_impl.cc
Outdated
| // Right now we only allow a maximum of 3 concurrent envoy processes to be running. When the third | ||
| // stats up it will kill the oldest parent. | ||
| // Right now we only allow a maximum of 3 concurrent envoy processes to be running. When the | ||
| // third stats up it will kill the oldest parent. |
There was a problem hiding this comment.
While you are here: s/stats/starts
source/server/hot_restart_impl.cc
Outdated
| // 3) P2 proceeds to the parent admin shutdown phase. | ||
| // 4) This races with P1 fetching parent stats from P0. | ||
| // 5) Calling receiveTypedRpc() below picks up the wrong message. | ||
| // There exists a race condition during hot restart involving fetching parent stats. It looks |
There was a problem hiding this comment.
same can we not reformat unless necessary?
There was a problem hiding this comment.
Again, clang-format. I think I'm on the exact same version as in the docker image; has this file not been touched since clang-format-3.6 was used?
There was a problem hiding this comment.
all the files get checked, so some other mismatch maybe?
| #include "tclap/CmdLine.h" | ||
|
|
||
| #ifndef ENVOY_DEFAULT_MAX_STATS | ||
| #define ENVOY_DEFAULT_MAX_STATS 16384 |
There was a problem hiding this comment.
nit: C++ const class values for both of these.
There was a problem hiding this comment.
This was so that it can be overridden at compile-time. Do you want this define, and then a const class value using these #defines for the actual value?
There was a problem hiding this comment.
Ah OK. Please just add comment that this is for compile time override. We will need some docs on this in the bazel/README.md file.
|
|
||
| RawStatData* alloc(const std::string& name) override { | ||
| std::unique_ptr<RawStatData>& stat_ref = stats_[name]; | ||
| CSmartPtr<RawStatData, freeAdapter>& stat_ref = stats_[name]; |
There was a problem hiding this comment.
you can't just pass :free directly here?
There was a problem hiding this comment.
It wouldn't compile because ::free takes a void*, CSmartPtr wants a function that takes RawStatData*
|
Performance note: I did some quick tests on how long it took to allocate all the stats at various sizes. At the default 16k it was 1 second or so, at 100k it was 120 seconds. So as is, it will be painful to increase the number of stats too much beyond the default. |
htuch
left a comment
There was a problem hiding this comment.
Thanks for this, it will be useful to us at Google as well.
| // Round val up to the next multiple of the natural alignment. | ||
| // Note: this implementation only works because 8 is a power of 2. | ||
| size_t roundUpMultipleNaturalAlignment(size_t val) { | ||
| const size_t multiple = alignof(RawStatData); |
There was a problem hiding this comment.
Maybe add an ASSERT here that multiple is 8 or some other power-of-two. It's probably true for every architecture/compiler, but I have a slight preference for documentation via assertions.
There was a problem hiding this comment.
I agree with documentation by assertion 👍
source/common/stats/stats_impl.cc
Outdated
| return roundUpMultipleNaturalAlignment(sizeof(RawStatData) + nameSize()); | ||
| } | ||
|
|
||
| size_t& RawStatData::getMaxNameLength(size_t configured_size) { |
There was a problem hiding this comment.
Nit: it's a bit hard to parse this method name. It would at first appear to be a getter but it has side effects.
source/exe/main.cc
Outdated
| #ifdef ENVOY_HOT_RESTART | ||
| // Enabled by default, except on OS X. Control with "bazel --define=hot_restart=disabled" | ||
| const std::string shared_mem_version = Envoy::Server::SharedMemory::version(); | ||
| const Envoy::OptionsImpl::HotRestartVersionCB hot_restart_version_cb = |
There was a problem hiding this comment.
Or just auto, it should be inferable from its use in the call below.
| } // namespace | ||
|
|
||
| size_t RawStatData::size() { | ||
| return roundUpMultipleNaturalAlignment(sizeof(RawStatData) + nameSize()); |
There was a problem hiding this comment.
Can you document the rationale for natural alignment for size? AFAICT, this has to do with how stats are allocated in hot restart shmem, where they are packed together. This isn't obvious when reading the code in the file though.
rshriram
left a comment
There was a problem hiding this comment.
One request: would it be possible to benchmark the startup time with 128 characters as the cluster name length, 256 char length for entire stat name, and retaining the existing 16K stats?
If that works, it would really cleanup the stats story for istio :).
| "that can be allocated in shared memory.", | ||
| false, ENVOY_DEFAULT_MAX_STATS, "uint64_t", cmd); | ||
| TCLAP::ValueArg<uint64_t> max_stat_name_len("", "max-stat-name-len", | ||
| "Maximum name length for a stat", false, |
There was a problem hiding this comment.
I am missing something here. We are not making the cluster Name configurable (its 60). So why bother making this name configurable since the current 128byte length seems to handle all the stats? Unless you happen to have custom filters that generate stats exceeding this length.
There was a problem hiding this comment.
In a follow up change we will make all the user supplied name max lengths dynamic based on this parameter.
You should be able to do your own benchmarking with various parameters. There are known solutions for making the allocator faster, but I would like those to be done in a follow up if they are needed. |
|
I haven't benchmarked, but I'm pretty sure that changing the max stat name len will have minimal effect on performance of allocation; adding more maximum stats affects it because we have to look through every entry to see if the name exists already, so more entries == more work. It is O(max-number-stats) to allocate one stat. |
Signed-off-by: Greg Greenway <ggreenway@apple.com>
|
Filed an issue to track performance: #1824 |
mattklein123
left a comment
There was a problem hiding this comment.
looks good, small test request.
| } | ||
|
|
||
| if (max_stat_name_len.getValue() < 127) { | ||
| std::cerr << "error: the 'max-stat-name-len' value specified (" << max_stat_name_len.getValue() |
There was a problem hiding this comment.
Can you add an options_impl death test for this?
| #define ENVOY_DEFAULT_MAX_STAT_NAME_LENGTH 127 | ||
| #endif | ||
|
|
||
| #if ENVOY_DEFAULT_MAX_STAT_NAME_LENGTH < 127 |
There was a problem hiding this comment.
Don't feel that strongly about it but not sure if we need this check here, since it will fail in the options check below.
There was a problem hiding this comment.
I thought it would be annoying if you can compile a build where the default value is an error. I can take this back out though if you want.
There was a problem hiding this comment.
Nah I'm ambivalent. That's fine.
Signed-off-by: Greg Greenway <ggreenway@apple.com>
| }; | ||
| #else | ||
| const std::string shared_mem_version = "disabled"; | ||
| const Envoy::OptionsImpl::HotRestartVersionCB hot_restart_version_cb = [](uint64_t, uint64_t) { |
There was a problem hiding this comment.
This broke the MacOS build. HotRestartVersionCB -> HotRestartVersionCb
There was a problem hiding this comment.
Sorry. I'll post a fix.
Signed-off-by: JP Simard <jp@jpsim.com>
Signed-off-by: JP Simard <jp@jpsim.com>
**Description** This use debugLogEnabled flag to avoid unnecessary function/interface calls in the request handling path. --------- Signed-off-by: Takeshi Yoneda <t.y.mathetake@gmail.com>
of a stat name, configurable at compile-time and run-time.
fixes #1761
Signed-off-by: Greg Greenway ggreenway@apple.com