Implement downstream IP hashing for HTTP ketama routing#1727
Implement downstream IP hashing for HTTP ketama routing#1727mattklein123 merged 11 commits intoenvoyproxy:masterfrom
Conversation
The downstream address information is necessary for source-IP-based hashing. Signed-off-by: Alex Konradi <akonradi@google.com>
Add support for hashing the downstream IP address for use in HTTP ketama routing. This complements the existing scheme of using one of the HTTP headers as a hash value. Signed-off-by: Alex Konradi <akonradi@google.com>
|
This code implements the desired functionality, and includes unit tests. An integration test might also be a good idea but I don't know how to implement it without depending on the choice of hashing scheme. For instance, it would be easy to check that requests from a downstream client C get routed to the same upstream host X, but if another upstream host Y joins, depending on where it ends up being placed in the hash ring, it might start receiving requests from C. Thoughts? |
|
@akonradi I think unit test only is fine for this change. (Haven't reviewed yet). |
htuch
left a comment
There was a problem hiding this comment.
Looks good. I agree that we don't need an integration test for this (and adding one will be difficult, since we don't have a naive way of having multiple source addresses in IPv6-only setups).
include/envoy/router/router.h
Outdated
| * exist. | ||
| */ | ||
| virtual Optional<uint64_t> generateHash(const Http::HeaderMap& headers) const PURE; | ||
| virtual Optional<uint64_t> generateHash(const Network::Address::Instance* downstream_address, |
There was a problem hiding this comment.
Can you add Doxygen comments for @param?
source/common/router/config_impl.cc
Outdated
| if (header) { | ||
| hash.value(HashUtil::xxHash64(header->value().c_str())); | ||
| } | ||
| } else if (hash_ip_ && downstream_addr) { |
There was a problem hiding this comment.
What about the idea of these being cumulative? I.e. I can make the hash a function of both header and downstream IP rather than either/or? I think this was the idea of making the policy a repeated.
There was a problem hiding this comment.
Yeah this is the one part of this change that I find a little strange.
- Cumulative also makes sense to me.
- It seems odd to trigger header matching based on the empty string. When we translate the repeated is there a way to translate in such a way that we don't need to check on empty string and perhaps just translate to generic interface with an evaluate() function like we do in a few other places? (Not sure of details here but just a thought).
There was a problem hiding this comment.
That's #1422, though it wouldn't be too bad to implement as part of this PR.
source/common/router/router.h
Outdated
| auto hash_policy = route_entry_->hashPolicy(); | ||
| if (hash_policy) { | ||
| return hash_policy->generateHash(*downstream_headers_); | ||
| const Network::Connection* connection = this->downstreamConnection(); |
There was a problem hiding this comment.
In what situations do we now have a downstreamConnection()? Should this a be an ASSERT and/or reference instead?
There was a problem hiding this comment.
IIRC AsyncClient does not have a downstream connection. This actually confused me also and I had to think about it so I think worth a comment.
There was a problem hiding this comment.
also nit: in general we don't use this-> in this type of context.
| TEST(RouterMatcherTest, HashPolicyIp) { | ||
| NiceMock<Runtime::MockLoader> runtime; | ||
| NiceMock<Upstream::MockClusterManager> cm; | ||
| auto routeConfig = baseHashPolicyRouteConfig(); |
There was a problem hiding this comment.
I'd probably use a test fixture here.
| } | ||
| { | ||
| Http::TestHeaderMapImpl headers = genHeaders("www.lyft.com", "/foo", "GET"); | ||
| Network::Address::Ipv4Instance addr("1.2.3.4"); |
There was a problem hiding this comment.
What about a test to show that different IPs hash to different addresses?
Bugfix: the previous version of this code used the address of the directly connected downstream host. This is not the same as the address of the client making the request unless the client is directly connected to the Envoy instance doing the routing. To fix this, use the downstream address of the routed stream instead. Signed-off-by: Alex Konradi <akonradi@google.com>
Signed-off-by: Alex Konradi <akonradi@google.com>
Signed-off-by: Alex Konradi <akonradi@google.com>
Signed-off-by: Alex Konradi <akonradi@google.com>
htuch
left a comment
There was a problem hiding this comment.
LGTM, just some one more thing..
include/envoy/router/router.h
Outdated
| * A hash value might not be returned if for example the specified HTTP header does not | ||
| * exist. In the future we might add additional support for hashing on origin address, | ||
| * etc. | ||
| * @param const downstream_address the address of the connected client host |
There was a problem hiding this comment.
@param should have the name of the parameter, not the type.
| { | ||
| Http::TestHeaderMapImpl headers = genHeaders("www.lyft.com", "/foo", "GET"); | ||
| const auto hash_policy = config.route(headers, 0)->routeEntry()->hashPolicy(); | ||
| uint64_t hash_1 = hash_policy->generateHash("1.2.3.4", headers).value(); |
| } | ||
|
|
||
| Optional<uint64_t> HashPolicyImpl::generateHash(const Http::HeaderMap& headers) const { | ||
| Optional<uint64_t> HashPolicyImpl::generateHash(const std::string& downstream_addr, |
There was a problem hiding this comment.
According to @mattklein123:
IIRC AsyncClient does not have a downstream connection. This actually confused me also and I had to think about it so I think worth a comment.
Should we at least add a check that a downstream IP based route config is not used alongside this?
There was a problem hiding this comment.
I don't understand your question, but a comment was added to the docs for generateHash that addresses @mattklein123's comment. downstream_addr is an empty string for requests originated by AsyncClient.
There was a problem hiding this comment.
OK, as long as it's empty it's fine.
There was a problem hiding this comment.
OK cool. I wasn't sure. If it's empty string that's fine.
Signed-off-by: Alex Konradi <akonradi@google.com>
Signed-off-by: Alex Konradi <akonradi@google.com>
Signed-off-by: Alex Konradi <akonradi@google.com>
include/envoy/router/router.h
Outdated
| * @param const headers the HTTP headers for the stream | ||
| * @return Optional<uint64_t> an optional hash value to route on. A hash value might not be | ||
| * returned if for example the specified HTTP header does not exist. | ||
| * @return Optional<uint64_t> an optional hash value to route on. |
There was a problem hiding this comment.
Nit: reorder @return after @param.
include/envoy/router/router.h
Outdated
| * returned if for example the specified HTTP header does not exist. | ||
| * @return Optional<uint64_t> an optional hash value to route on. | ||
| * @param downstream_address contains the address of the connected client host, or an | ||
| * empty string if the request is initiated from within this host, and |
There was a problem hiding this comment.
Nit: no need for , and between @param
| const Http::HeaderEntry* header = headers.get(header_name_); | ||
| if (header) { | ||
| hash.value(HashUtil::xxHash64(header->value().c_str())); | ||
| if (!header_name_.get().empty()) { |
There was a problem hiding this comment.
What about the comment thread here re: pivoting on empty string and hashing on multiple things? Did we decide to defer that to a follow up?
There was a problem hiding this comment.
Yup, that'll be my next PR.
…-hashing Signed-off-by: Alex Konradi <akonradi@google.com>
Signed-off-by: Alex Konradi <akonradi@google.com>
|
The coverage test failed because the JVM bugged out, lovely. |
|
@akonradi I kicked it for you |
) The v2 API supports using the downstream (source) IP address to route HTTP requests to upstream hosts. This PR implements that functionality, as specified in envoyproxy#1296. Signed-off-by: Alex Konradi <akonradi@google.com>
Signed-off-by: Jose Nino <jnino@lyft.com> Signed-off-by: JP Simard <jp@jpsim.com>
Signed-off-by: Jose Nino <jnino@lyft.com> Signed-off-by: JP Simard <jp@jpsim.com>
**Description** The reason for the 401 response could also be token is missing in the request. Signed-off-by: Huabing Zhao <zhaohuabing@gmail.com>
The v2 API supports using the downstream (source) IP address to route HTTP requests to upstream hosts. This PR implements that functionality, as specified in #1296.