Skip to content

Make the total number of stats in shared memory, and the maximum length#1817

Merged
mattklein123 merged 5 commits intoenvoyproxy:masterfrom
ggreenway:stats-variable-size
Oct 6, 2017
Merged

Make the total number of stats in shared memory, and the maximum length#1817
mattklein123 merged 5 commits intoenvoyproxy:masterfrom
ggreenway:stats-variable-size

Conversation

@ggreenway
Copy link
Copy Markdown
Member

of a stat name, configurable at compile-time and run-time.

fixes #1761

Signed-off-by: Greg Greenway ggreenway@apple.com

of a stat name, configurable at compile-time and run-time.

fixes envoyproxy#1761

Signed-off-by: Greg Greenway <ggreenway@apple.com>
@hennna
Copy link
Copy Markdown
Contributor

hennna commented Oct 5, 2017

This may be out of the scope of this PR, but with this change, we can also vary the maximum cluster name length:

"maxLength" : 60

@ggreenway
Copy link
Copy Markdown
Member Author

I'm not opposed to fixing that as well; how much will it upset things to make the json schema non-constant? It might be best to make that a separate PR if it requires much work. But the biggest reason for making the name length configurable is to allow longer cluster names, so it's certainly not unrelated.

Signed-off-by: Greg Greenway <ggreenway@apple.com>
@mattklein123
Copy link
Copy Markdown
Member

@ggreenway +1 making the size of cluster name dynamic in code. However I would recommend we do it in a follow up. In actuality, we need to dynamically limit the size of all user supplied things (listener name, route table name, etc.) so I think it would be great to do that in an isolated follow up.

Copy link
Copy Markdown
Member

@mattklein123 mattklein123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great. Some random comments. Also you probably saw but the coverage test is failing.


void RawStatData::configure(Server::Options& options) {
const size_t configured = options.maxStatNameLength();
RELEASE_ASSERT(configured > 0);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we gracefully fail this in options parsing that this is maybe >= the current default? Otherwise I think usability will be very bad.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

RawStatData* HeapRawStatDataAllocator::alloc(const std::string& name) {
RawStatData* data = new RawStatData();
memset(data, 0, sizeof(RawStatData));
RawStatData* data = static_cast<RawStatData*>(::calloc(RawStatData::size(), 1));
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: any reason to not just use malloc?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It needs to be zero-initialized. I've always used calloc for that, even when I don't need it to multiply by an element-count.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah OK. I would just add comment about zero initialize.

#ifdef ENVOY_HOT_RESTART
// Enabled by default, except on OS X. Control with "bazel --define=hot_restart=disabled"
const std::string shared_mem_version = Envoy::Server::SharedMemory::version();
const Envoy::OptionsImpl::HotRestartVersionCB hot_restart_version_cb =
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: HotRestartVersionCb

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or just auto, it should be inferable from its use in the call below.

}

sockaddr_un HotRestartImpl::createDomainSocketAddress(uint64_t id) {
// Right now we only allow a maximum of 3 concurrent envoy processes to be running. When the third
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

any reason to reformat these comments?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think clang-format hit this; not sure why it got reformatted. I'll revert.

// Right now we only allow a maximum of 3 concurrent envoy processes to be running. When the third
// stats up it will kill the oldest parent.
// Right now we only allow a maximum of 3 concurrent envoy processes to be running. When the
// third stats up it will kill the oldest parent.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While you are here: s/stats/starts

// 3) P2 proceeds to the parent admin shutdown phase.
// 4) This races with P1 fetching parent stats from P0.
// 5) Calling receiveTypedRpc() below picks up the wrong message.
// There exists a race condition during hot restart involving fetching parent stats. It looks
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same can we not reformat unless necessary?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, clang-format. I think I'm on the exact same version as in the docker image; has this file not been touched since clang-format-3.6 was used?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

all the files get checked, so some other mismatch maybe?

#include "tclap/CmdLine.h"

#ifndef ENVOY_DEFAULT_MAX_STATS
#define ENVOY_DEFAULT_MAX_STATS 16384
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: C++ const class values for both of these.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was so that it can be overridden at compile-time. Do you want this define, and then a const class value using these #defines for the actual value?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah OK. Please just add comment that this is for compile time override. We will need some docs on this in the bazel/README.md file.


RawStatData* alloc(const std::string& name) override {
std::unique_ptr<RawStatData>& stat_ref = stats_[name];
CSmartPtr<RawStatData, freeAdapter>& stat_ref = stats_[name];
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can't just pass :free directly here?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It wouldn't compile because ::free takes a void*, CSmartPtr wants a function that takes RawStatData*

@ggreenway
Copy link
Copy Markdown
Member Author

Performance note: I did some quick tests on how long it took to allocate all the stats at various sizes. At the default 16k it was 1 second or so, at 100k it was 120 seconds. So as is, it will be painful to increase the number of stats too much beyond the default.

Copy link
Copy Markdown
Member

@htuch htuch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this, it will be useful to us at Google as well.

// Round val up to the next multiple of the natural alignment.
// Note: this implementation only works because 8 is a power of 2.
size_t roundUpMultipleNaturalAlignment(size_t val) {
const size_t multiple = alignof(RawStatData);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe add an ASSERT here that multiple is 8 or some other power-of-two. It's probably true for every architecture/compiler, but I have a slight preference for documentation via assertions.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with documentation by assertion 👍

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

return roundUpMultipleNaturalAlignment(sizeof(RawStatData) + nameSize());
}

size_t& RawStatData::getMaxNameLength(size_t configured_size) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: it's a bit hard to parse this method name. It would at first appear to be a getter but it has side effects.

#ifdef ENVOY_HOT_RESTART
// Enabled by default, except on OS X. Control with "bazel --define=hot_restart=disabled"
const std::string shared_mem_version = Envoy::Server::SharedMemory::version();
const Envoy::OptionsImpl::HotRestartVersionCB hot_restart_version_cb =
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or just auto, it should be inferable from its use in the call below.

} // namespace

size_t RawStatData::size() {
return roundUpMultipleNaturalAlignment(sizeof(RawStatData) + nameSize());
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you document the rationale for natural alignment for size? AFAICT, this has to do with how stats are allocated in hot restart shmem, where they are packed together. This isn't obvious when reading the code in the file though.

Copy link
Copy Markdown
Member

@rshriram rshriram left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One request: would it be possible to benchmark the startup time with 128 characters as the cluster name length, 256 char length for entire stat name, and retaining the existing 16K stats?

If that works, it would really cleanup the stats story for istio :).

"that can be allocated in shared memory.",
false, ENVOY_DEFAULT_MAX_STATS, "uint64_t", cmd);
TCLAP::ValueArg<uint64_t> max_stat_name_len("", "max-stat-name-len",
"Maximum name length for a stat", false,
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am missing something here. We are not making the cluster Name configurable (its 60). So why bother making this name configurable since the current 128byte length seems to handle all the stats? Unless you happen to have custom filters that generate stats exceeding this length.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In a follow up change we will make all the user supplied name max lengths dynamic based on this parameter.

@mattklein123
Copy link
Copy Markdown
Member

One request: would it be possible to benchmark the startup time with 128 characters as the cluster name length, 256 char length for entire stat name, and retaining the existing 16K stats?

You should be able to do your own benchmarking with various parameters. There are known solutions for making the allocator faster, but I would like those to be done in a follow up if they are needed.

@ggreenway
Copy link
Copy Markdown
Member Author

I haven't benchmarked, but I'm pretty sure that changing the max stat name len will have minimal effect on performance of allocation; adding more maximum stats affects it because we have to look through every entry to see if the name exists already, so more entries == more work. It is O(max-number-stats) to allocate one stat.

Signed-off-by: Greg Greenway <ggreenway@apple.com>
@ggreenway
Copy link
Copy Markdown
Member Author

Filed an issue to track performance: #1824

htuch
htuch previously approved these changes Oct 6, 2017
Copy link
Copy Markdown
Member

@mattklein123 mattklein123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good, small test request.

}

if (max_stat_name_len.getValue() < 127) {
std::cerr << "error: the 'max-stat-name-len' value specified (" << max_stat_name_len.getValue()
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add an options_impl death test for this?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

test added

#define ENVOY_DEFAULT_MAX_STAT_NAME_LENGTH 127
#endif

#if ENVOY_DEFAULT_MAX_STAT_NAME_LENGTH < 127
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't feel that strongly about it but not sure if we need this check here, since it will fail in the options check below.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought it would be annoying if you can compile a build where the default value is an error. I can take this back out though if you want.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nah I'm ambivalent. That's fine.

Signed-off-by: Greg Greenway <ggreenway@apple.com>
@mattklein123 mattklein123 merged commit 547bad4 into envoyproxy:master Oct 6, 2017
@ggreenway ggreenway deleted the stats-variable-size branch October 6, 2017 22:31
};
#else
const std::string shared_mem_version = "disabled";
const Envoy::OptionsImpl::HotRestartVersionCB hot_restart_version_cb = [](uint64_t, uint64_t) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This broke the MacOS build. HotRestartVersionCB -> HotRestartVersionCb

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry. I'll post a fix.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

jpsim pushed a commit that referenced this pull request Nov 28, 2022
Signed-off-by: JP Simard <jp@jpsim.com>
jpsim pushed a commit that referenced this pull request Nov 29, 2022
Signed-off-by: JP Simard <jp@jpsim.com>
mathetake added a commit that referenced this pull request Mar 3, 2026
**Description**

This use debugLogEnabled flag to avoid unnecessary function/interface
calls in the request handling path.

---------

Signed-off-by: Takeshi Yoneda <t.y.mathetake@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Number of stats and length of stat name should be configurable

6 participants