lb: gentle failover across load balancing priority levels. #2290
alyssawilk merged 16 commits into envoyproxy:master from
Conversation
Signed-off-by: Alyssa Wilk <alyssar@chromium.org>
mattklein123 left a comment
Very cool stuff. Some comments to get started. I haven't looked at the tests yet.
    if (!host_set->healthyHosts().empty()) {
      return host_set.get();
    }
    uint32_t sumEntries(std::vector<uint32_t>& vector) {
nit: const std::vector<uint32_t>&
      }
    }
    // The percentages should always add up to 100 but we have to have a return for the compiler.
    ASSERT(0);
nit: I would use NOT_REACHED here.
    // Determine the health of the newly modified priority level.
    // Health ranges from 0-100, and is the ratio of healthy hosts to total hosts, modified by the
    // overprovision factor of 1.4
Can you add some more detail on where 1.4 comes from? Also, nit, period end of sentence.
From a decade working somewhere where we over-provision roughly 20%. I can make it configurable from the get go but I figured we could leave it until someone actually cared.
I'll comment both that it's arbitrary and can eventually be configurable unless you'd prefer I make it a config option now.
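The health calculation being discussed can be sketched as a small stand-alone function. This is a hypothetical helper (the name `computePriorityHealth` is not Envoy's API), assuming the 1.4 overprovision factor from the quoted comment and integer arithmetic:

```cpp
#include <algorithm>
#include <cstdint>

// Sketch of the per-priority health calculation described above (name and
// signature are illustrative, not Envoy's actual code). Health is the
// healthy/total host ratio scaled by the overprovision factor of 1.4,
// capped at 100.
uint32_t computePriorityHealth(uint32_t healthy_hosts, uint32_t total_hosts) {
  if (total_hosts == 0) {
    return 0;
  }
  // 140 == 100 * 1.4 overprovision factor, kept in integer arithmetic.
  return std::min<uint32_t>(100, healthy_hosts * 140 / total_hosts);
}
```

With this, a level with 20 healthy hosts out of 100 would report health 28 rather than 20, reflecting the assumption that roughly 20% over-provisioned capacity can absorb the extra load.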
    // First, determine if the load needs to be scaled relative to health. For example if there are
    // 3 host sets with 20% / 20% / 10% health they will get 40% / 40% / 20% load to ensure total load
    // adds up to 100.
    uint32_t total_health = std::min<uint32_t>(sumEntries(per_priority_health_), 100);
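The scaling described in the quoted comment can be sketched as follows. This is a simplified illustration, not Envoy's implementation; the helper name and the choice to hand leftover integer-division remainder to the first priority are assumptions:

```cpp
#include <algorithm>
#include <cstdint>
#include <numeric>
#include <vector>

// Illustrative sketch: scale per-priority load so it sums to 100 when total
// health is below 100, and assign load greedily from the highest priority
// down when health is plentiful. Returns all zeros if nothing is healthy
// (a real balancer would enter panic mode in that case).
std::vector<uint32_t> scaleLoad(const std::vector<uint32_t>& per_priority_health) {
  uint32_t total_health = std::min<uint32_t>(
      std::accumulate(per_priority_health.begin(), per_priority_health.end(), 0u), 100);
  std::vector<uint32_t> load(per_priority_health.size(), 0);
  if (total_health == 0) {
    return load;
  }
  uint32_t remaining = 100;
  for (size_t i = 0; i < per_priority_health.size(); ++i) {
    load[i] = std::min(remaining, per_priority_health[i] * 100 / total_health);
    remaining -= load[i];
  }
  // Integer division can leave a remainder; give it to the first priority so
  // the loads add up to exactly 100.
  load[0] += remaining;
  return load;
}
```

For the example in the comment, health of 20/20/10 yields load of 40/40/20; a fully healthy first priority keeps 100% of the load, matching hard-failover behavior.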
    const HostSet& LoadBalancerBase::chooseHostSet() {
      uint32_t priority = choosePriority(random_.random(), per_priority_load_);
    }
    const uint64_t h = hash.valid() ? hash.value() : random_.random();
    uint32_t priority = LoadBalancerBase::choosePriority(h, per_priority_load_);
      stats_.lb_healthy_panic_.inc();
    }
    return ring_->chooseHost(context, random_);
    RingConstSharedPtr ring = per_priority_state_[priority]->current_ring_;
the compiler will probably optimize this, but I would avoid the local variable so we don't inc/decref the shared pointer.
    RingConstSharedPtr ring_to_use;
    bool global_panic_to_use;
    {
      std::shared_lock<std::shared_timed_mutex> lock(mutex_);
I might be missing something here but I don't think we can lose the locking?
    std::unique_lock<std::shared_timed_mutex> lock(factory_->mutex_);
    factory_->current_ring_ = new_ring;
    factory_->global_panic_ = new_global_panic;
    std::unique_lock<std::shared_timed_mutex> lock(factory_->mutex_);
This happens on the main thread. You don't need the read lock here I don't think, you just need the write lock below when you swap it all in. You still need the read lock on the factory create call though.
Above was a clear error, but this one confuses me.
You were locking factory_->mutex_ before changing factory_->current_ring_ and factory_->global_panic_. Why then don't I need to grab the same lock when changing factory_->per_priority_state_?
Sorry I read this too quickly. The issue here I think is that you might have inconsistent state since you release the lock several times during this code. I think the correct way to do this is:
- Do the first loop to generate state, etc.
- Acquire lock
- Do 2nd loop to store state.
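The "generate first, then store under one lock" pattern suggested above can be sketched as follows. The `Ring`, `PerPriorityState`, and `Factory` types here are placeholders, and a plain `std::mutex` stands in for the shared mutex, so this is a shape illustration rather than Envoy's actual code:

```cpp
#include <memory>
#include <mutex>
#include <vector>

// Placeholder types standing in for the PR's ring hash state.
struct Ring {};
struct PerPriorityState {
  std::shared_ptr<const Ring> current_ring_;
  bool global_panic_{false};
};

struct Factory {
  std::mutex mutex_;
  std::vector<std::unique_ptr<PerPriorityState>> per_priority_state_;
};

void rebuild(Factory& factory, size_t num_priorities) {
  // First loop: generate all new state with no lock held, so the expensive
  // work never blocks readers.
  std::vector<std::unique_ptr<PerPriorityState>> new_state;
  for (size_t i = 0; i < num_priorities; ++i) {
    auto state = std::make_unique<PerPriorityState>();
    state->current_ring_ = std::make_shared<Ring>();
    new_state.push_back(std::move(state));
  }
  // Second step: acquire the lock once and store everything, so readers can
  // never observe a partially updated set of priorities.
  std::lock_guard<std::mutex> lock(factory.mutex_);
  factory.per_priority_state_ = std::move(new_state);
}
```

The point of the pattern is that the lock is taken exactly once, for the duration of a single swap, instead of being released and re-acquired between per-priority updates.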
    Runtime::RandomGenerator& random_;
    const RingConstSharedPtr ring_;
    const bool global_panic_;
    std::vector<PerPriorityStatePtr> per_priority_state_;
Instead of copying the vectors can we just use shared_ptr to constant vector? This will make lock hold time smaller also. (I think per other comments the locking is not quite right unless I am missing something).
I can, but it's a perf hit for everything else since they then can't do in-place edits.
If we're worried about lock hold time, how about a shared pointer for the ring hash where we refresh, which can be shared by all the threads?
nit: I would either do shared_ptr for both per_priority_state_ and per_priority_load_, or for neither. It's fine with me either way (doing it out of lock with swap is fine). Any reason not to be consistent?
      sum += entry;
    }
    return priority_set->hostSetsPerPriority()[0].get();
    return sum;
Nit: could use http://en.cppreference.com/w/cpp/algorithm/accumulate
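As a quick illustration of that nit, the hand-rolled sum loop quoted above collapses to a one-liner with `std::accumulate` (this mirrors the quoted `sumEntries` helper, with the `const` fix from the earlier nit applied):

```cpp
#include <cstdint>
#include <numeric>
#include <vector>

// std::accumulate replaces the explicit sum loop; 0u keeps the
// accumulation in unsigned arithmetic.
uint32_t sumEntries(const std::vector<uint32_t>& vector) {
  return std::accumulate(vector.begin(), vector.end(), 0u);
}
```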
    std::shared_lock<std::shared_timed_mutex> lock(mutex_);
    for (size_t i = 0; i < per_priority_state_.size(); ++i) {
      lb->per_priority_state_.push_back(PerPriorityStatePtr{new PerPriorityState});
Can you build and allocate the vector outside of the lock? Then you can just acquire the lock and copy the data values in.
mattklein123 left a comment
Looks solid. Thanks for all the locking changes. A few more random comments. Also I think the OSX build failure is legit, can you take a look?
    return priority_set->hostSetsPerPriority()[0].get();
    // The percentages should always add up to 100 but we have to have a return for the compiler.
    NOT_REACHED;
    return 0;
nit: I don't think the compiler will require the return if you have NOT_REACHED.
…ters Signed-off-by: Alyssa Wilk <alyssar@chromium.org>
ea6d818 to b37b396
mattklein123 left a comment
LGTM other than small nit and OSX breakage.
    }
    const uint64_t h = hash.valid() ? hash.value() : random_.random();
    uint32_t priority = LoadBalancerBase::choosePriority(h, *per_priority_load_);
Changes Envoy load balancing across priority levels from a hard failover to trickling data based on the health percentage of each priority level.
Risk Level: Medium
Testing:
Added thorough unit testing for the lb failover code, as well as fixing up ring hash failover testing and adding ring-hash specific tests.
Docs Changes:
envoyproxy/data-plane-api#359
Release Notes: n/a: falls under existing note
Fixes #1929