Skip to content

listeners: filter chain matches for source_type. Partially fixes #4535#4682

Merged
htuch merged 27 commits intoenvoyproxy:masterfrom
nickolaev:fix_4535
Nov 27, 2018
Merged

listeners: filter chain matches for source_type. Partially fixes #4535#4682
htuch merged 27 commits intoenvoyproxy:masterfrom
nickolaev:fix_4535

Conversation

@nickolaev
Copy link
Copy Markdown

Description:
Add filter chain match for source_type. Possulbe options are
ANY_HOST(default)
SAME_HOST
EXTERNAL_HOST

This allows for explicitly specifying local connectivity detection, which is needed in specific use cases.

Risk Level: Low
Testing:
Unit tests.
Docs Changes:
Inline proto comments
Release Notes:
Optional Fixes #4535 (partially)

@nickolaev nickolaev force-pushed the fix_4535 branch 2 times, most recently from 9f47b72 to 1adef6b Compare October 10, 2018 21:54
Copy link
Copy Markdown
Member

@rshriram rshriram left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a brittle model. Local and non local traffic can be distinguished using a simple not condition in the source match and matching against 127.*..

It’s also unclear how this will interact when the source ip match is implemented.

@rshriram
Copy link
Copy Markdown
Member

May be I am missing something elementary here.
I have an application that binds itself to 127.255.0.1. And an Envoy listening on 10.0.0.1 on the same machine. if this app calls Envoy, will the socket's remote address be 127.255.0.1 or will it be 10.0.0.1? If the kernel is going to automatically set the socket's remote address to 10.0.0.1, then what you have is great.

What happens on machines that have multiple network interfaces, and envoy is bound to 0.0.0.0:80 and caller binds to one of network interfaces on the machine? Same as above?

What happens if envoy binds to one of the network interfaces (say 10.0.0.1) and the caller binds itself to another interface (eth2 on 11.11.11.11) on same machine, and makes the call?

@rshriram
Copy link
Copy Markdown
Member

This feature will work perfectly fine with the source ip match when implemented. It will be able to narrow down on the SAME_HOST and EXTERNAL_HOST. It will will allow to distinguish for example (SAME_HOST on 10.0.0.1) from (EXTERNAL_HOST on 10.0.0.1) from (SAME_HOST on 127.**)

Can you give an example?
If I say source type same host and source IP as 192.158.1.1 (an external host), will it not match/will it match?
If I say source type is external host and source IP as 127.0.0.1, what happens?

And if you see where I am going, my point was not about making the code work. The code can always be made to work :). My point was about causing additional confusion to end users, with two conflicting configurations (same host in type and a random IP in source IP).
So at the very least, make it a oneof between the source type and the source IP.

@htuch
Copy link
Copy Markdown
Member

htuch commented Oct 11, 2018

Yeah, I guess at the very least we might want same subnet semantics etc. Couldn't we create a bunch of FilterChains with source==dest IP ranges to achieve this today?

@georgi-d
Copy link
Copy Markdown
Contributor

@rshriram

May be I am missing something elementary here.
I have an application that binds itself to 127.255.0.1. And an Envoy listening on 10.0.0.1 on the same machine. if this app calls Envoy, will the socket's remote address be 127.255.0.1 or will it be 10.0.0.1? If the kernel is going to automatically set the socket's remote address to 10.0.0.1, then what you have is great.

If a client specifically binds to one network interface address and connects to Envoy bound on a different interface then SAME_HOST check will not catch it but I have almost never seen clients explicitly binding to an interface before connection. And this would mean that the client has explicitly bound to the wrong interface. The connection would need to be routed in this case.

What happens on machines that have multiple network interfaces, and envoy is bound to 0.0.0.0:80 and caller binds to one of network interfaces on the machine? Same as above?

Same as above this means the client has explicitly bound to the wrong interface before opening the connection. The network stack auto bind would always bind to the same interface and IP as the destination.

What happens if envoy binds to one of the network interfaces (say 10.0.0.1) and the caller binds itself to another interface (eth2 on 11.11.11.11) on same machine, and makes the call?

Same as above. All the examples cover the case where the client has explicitly bound to an interface/IP before connecting which almost never happen in the TCP world. This might happen over UDP.

All the above cases can be covered when the source IP match is added on top of this functionality. But they need to be handled explicitly.

This functionality cover the common dynamic case where the client resolves the destination IP using a discovery service (DNS or similar) and then connects to the IP normally leaving the OS to auto bind the client side of the socket. If the destination service happens to be running on the same machine as the client then the SAME_HOST check will be true and a different routing config can be applied. In our case we want such connections to remain over HTTP instead of redirecting them to HTTPS.

@georgi-d
Copy link
Copy Markdown
Contributor

This feature will work perfectly fine with the source ip match when implemented. It will be able to narrow down on the SAME_HOST and EXTERNAL_HOST. It will will allow to distinguish for example (SAME_HOST on 10.0.0.1) from (EXTERNAL_HOST on 10.0.0.1) from (SAME_HOST on 127.**)

Can you give an example?
If I say source type same host and source IP as 192.158.1.1 (an external host), will it not match/will it match?

SAME_HOST matches if the source and destination addresses are the same which can happen only if the client is on the same machine. If you specify SAME_HOST and source_ip which does not exist on the machine then the match will always fail. Which is expected and should not be confusing.

If I say source type is external host and source IP as 127.0.0.1, what happens?

If you specify EXTERNAL_HOST and source IP 127.0.0.1 the match will succeed for the case when the client has bound explicitly to 127.0.0.1 and connected to the 10.0.0.1 local address. (one of the cases you mentioned in the other comment)

And if you see where I am going, my point was not about making the code work. The code can always be made to work :). My point was about causing additional confusion to end users, with two conflicting configurations (same host in type and a random IP in source IP).
So at the very least, make it a oneof between the source type and the source IP.

If we make it oneof then the case just above will not be possible to express. When the client has bound to 127.0.0.1 and connected to 10.0.0.1. I do not feel to strongly about this case but it might be better to have it for completeness.

If you feel it would be less confusing I am fine with renaming SAME_HOST to something else (SAME_INTERFACE). We might need to have longer explanation on when this would be useful.

@georgi-d
Copy link
Copy Markdown
Contributor

@htuch

Yeah, I guess at the very least we might want same subnet semantics etc. Couldn't we create a bunch of FilterChains with source==dest IP ranges to achieve this today?

We cannot cover the case with a bunch of source == dest IP ranges because in most cases you do not know all the interfaces and IP addresses on the machine and how they can change over time. And there is no reliable way to monitor IP and interface changes on the machine.

@rshriram
Copy link
Copy Markdown
Member

May be change this to same_interface or same_network_namespace, external_network, any_network ? Host implies to me that it’s from same machine which will be confusing when used in non vm scenarios like kubernetes.

With all the nice responses you provided above (specifically the bit about auto binding).

@louiscryan with the same_host thing we can get rid of the annoying container port thing in istio as we can restrict all outbound listeners in a pod to same_host (ie same network namespace, app to Envoy) and inbound listeners to external host. Cc @PiotrSikora

@nickolaev nickolaev force-pushed the fix_4535 branch 2 times, most recently from 7a803b0 to fbdfe31 Compare October 15, 2018 22:29
Copy link
Copy Markdown
Member

@htuch htuch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, let's move forward with this.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you elaborate on how this works with source_prefix_ranges (and source_ports if we decided to keep it, see #4457.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at the source in this PR #4466, I'm trying to figure out how these two can merge. On one hand, it makes sense to move source_type match in findFilterChain as the PR does for the source_port match. But on the other, this approach suggests that you have either destination (preferred) or source matching but not both. Would it make sense if source_prefix_ranges match is implemented in SourceTypesArray? Something like:

typedef std::array<SourceIPsTriePtr, 3> SourceTypesArray;
typedef std::unordered_map<std::string, SourceTypesArray> ApplicationProtocolsMap;
...

typedef std::unordered_map<uint16_t, Network::FilterChainSharedPtr> SourcePortsMap;
typedef Network::LcTrie::LcTrie<SourcePortsMap> SourceIPsTriePtr;
typedef std::unique_ptr<SourceIPsTrie> SourceIPsTriePtr;

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@PiotrSikora might want to weigh in, as he was originally defining the filter chain match semantics. I think you want to have source type as higher precedence than source CIDR range and that we can drop source port entirely. It also might make sense to give destination higher precedence than source CIDR range, since VIP and VIP ranges are the usual high order bit when reasoning about services.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nickolaev the changes in #4466 are wrong, since intermixing source and destination matching doesn't follow any rules. Source IPs and ports should be matched last (see #4457).

Due to the way the filter chain matching is implemented (i.e. matches on higher levels of the preference tree are binding even if there are no matches on the lower levels), I think that source_type should be the very last matching criteria, so that it doesn't take priority over source IP and/or port matching, although I imagine that people will use one or the other, so the ordering between source IP/port/type shouldn't matter too much.

dmitankin and others added 2 commits October 19, 2018 11:30
Signed-off-by: Nikolay Nikolaev <nnikolay@vmware.com>
Partially implements envoyproxy#4535. It also extends MockConnectionSocket
to allow SAME_HOST matching.

Signed-off-by: Nikolay Nikolaev <nnikolay@vmware.com>
}

bool Utility::isLocalConnection(const Address::Instance& local_address,
const Address::Instance& remote_address) {
Copy link
Copy Markdown
Contributor

@PiotrSikora PiotrSikora Oct 20, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't at all check whether the connection is local or not, it checks if the IP address is the same in both arguments. You should pass const Network::ConnectionSocket& socket as the only argument and verify that IP address is the same in socket.localAddress() and socket.remoteAddress().

Having said that, this check works only for one particular scenario (connection terminated at the local proxy), and doesn't work if the outgoing connection is bound to an interface and/or an IP address, nor when the connection is intercepted using iptables.

You should really check if the IP address from socket.remoteAddress() matches IP address of one of local network interfaces.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@PiotrSikora You are correct that the name of the method is not reflecting what it actually does.
The closest to the truth would be to rename the config option to source_destination_match with possible enum values {SAME, DISTINCT, ANY} which would cover a lot of the use cases and be easily implemented in platform independent way

You should really check if the IP address from socket.remoteAddress() matches IP address of one of local network interfaces.

This cannot be implemented in platform independent way. I found how it can be implemented on linux:
https://stackoverflow.com/questions/1160963/how-to-enumerate-all-ip-addresses-attached-to-a-machine-in-posix-c/1161018#1161018

There is a sample code which would need to be extended to support IPv6 if we want to go that way. This option can really be called isLocalConnecton. I am not sure what are the performance characteristics of this ioctl call.

I am fine with both solutions as well as having both of them together.

Thoughts?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The closest to the truth would be to rename the config option to source_destination_match with possible enum values {SAME, DISTINCT, ANY} which would cover a lot of the use cases and be easily implemented in platform independent way

But then you're limiting this feature by implementation and not by intent.

Most importantly, the current implementation doesn't work with iptables interception, which is a pretty big deal-breaker, IMHO.

This cannot be implemented in platform independent way. I found how it can be implemented on linux:
https://stackoverflow.com/questions/1160963/how-to-enumerate-all-ip-addresses-attached-to-a-machine-in-posix-c/1161018#1161018

That response is from 2009.

This can be easily done in a semi-portable way using getifaddrs, which we already use in Envoy (see: Network::Utility::getLocalAddress).

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@PiotrSikora I agree that it is better not to limit by implementation and have the more general solution.

Lets switch to the implementation as in Network::Utility::getLocalAddress
Then we can rename the configuration parameter to source_type : {LOCAL, EXTERNAL, ANY}

Do we have any concerns about the performance of calling getifaddrs on every accepted connection?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know the cost of getifaddrs(), so it's hard to tell, but we could always cache the results for X seconds if it turns out to be expensive.

In any case, we should only call it if either LOCAL or EXTERNAL are configured, and provide fast-path that doesn't do the check if only ANY (default) is configured.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@PiotrSikora

I agree with the potential optimization but for our use cаse we will always be in the bad case.

There is one further optimization which is when src == dst we know it has to be local so we can do the interface search only when src != dst and LOCAL or EXTERNAL is configured.

I am strongly against caching this information because it would lead to random failures which are very hard to debug. I would rather have both solutions in and clearly explain the difference in functionality and performance. One is source to destination matching and the other is matching source against the list of local interfaces.

I am fine though with implementing the interface matching first and we can benchmark the performance impact of it. Then we can decide if there is a need for the `source == destination mаtching option.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nickolaev the changes in #4466 are wrong, since intermixing source and destination matching doesn't follow any rules. Source IPs and ports should be matched last (see #4457).

Due to the way the filter chain matching is implemented (i.e. matches on higher levels of the preference tree are binding even if there are no matches on the lower levels), I think that source_type should be the very last matching criteria, so that it doesn't take priority over source IP and/or port matching, although I imagine that people will use one or the other, so the ordering between source IP/port/type shouldn't matter too much.

Nikolay Nikolaev added 2 commits October 30, 2018 09:59
@stale
Copy link
Copy Markdown

stale bot commented Nov 6, 2018

This pull request has been automatically marked as stale because it has not had activity in the last 7 days. It will be closed in 7 days if no further activity occurs. Please feel free to give a status update now, ping for review, or re-open when it's ready. Thank you for your contributions!

@stale stale bot added the stale stalebot believes this issue/PR has not been touched recently label Nov 6, 2018
@cmluciano
Copy link
Copy Markdown
Member

@nickolaev Are you still working on this and do you plan to also support the change for source_ips to work ala #4457

@stale stale bot removed the stale stalebot believes this issue/PR has not been touched recently label Nov 6, 2018
Copy link
Copy Markdown
Contributor

@PiotrSikora PiotrSikora left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are a few places where you're passing expect_source_type_match=true where it should have been false (when filter chain selection hits empty branch and doesn't even reach source type rules).

Looks great otherwise, thanks!

// IPv4 client connects to unknown port - no match.
auto filter_chain = findFilterChain(1234, false, "127.0.0.1", false, "", false, "tls", false, {});
auto filter_chain = findFilterChain(1234, false, "127.0.0.1", false, "", false, "tls", false, {},
false, "8.8.8.8", false, true);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

expect_source_type_match should be false here.

// UDS client - no match.
filter_chain = findFilterChain(0, false, "/tmp/test.sock", false, "", false, "tls", false, {});
filter_chain = findFilterChain(0, false, "/tmp/test.sock", false, "", false, "tls", false, {},
false, "/tmp/test.sock", false, true);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

expect_source_type_match should be false here.

// IPv4 client connects to unknown IP - no match.
auto filter_chain = findFilterChain(1234, true, "1.2.3.4", false, "", false, "tls", false, {});
auto filter_chain = findFilterChain(1234, true, "1.2.3.4", false, "", false, "tls", false, {},
false, "8.8.8.8", false, true);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

expect_source_type_match should be false here.

// UDS client - no match.
filter_chain = findFilterChain(0, true, "/tmp/test.sock", false, "", false, "tls", false, {});
filter_chain = findFilterChain(0, true, "/tmp/test.sock", false, "", false, "tls", false, {},
false, "/tmp/test.sock", false, true);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

expect_source_type_match should be false here.

// TLS client without SNI - no match.
auto filter_chain = findFilterChain(1234, true, "127.0.0.1", true, "", false, "tls", false, {});
auto filter_chain = findFilterChain(1234, true, "127.0.0.1", true, "", false, "tls", false, {},
false, "8.8.8.8", false, true);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

expect_source_type_match should be false here.

filter_chain =
findFilterChain(1234, true, "127.0.0.1", true, "www.example.com", false, "tls", false, {});
filter_chain = findFilterChain(1234, true, "127.0.0.1", true, "www.example.com", false, "tls",
false, {}, false, "8.8.8.8", false, true);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

expect_source_type_match should be false here.

auto filter_chain =
findFilterChain(1234, true, "127.0.0.1", true, "", true, "raw_buffer", false, {});
auto filter_chain = findFilterChain(1234, true, "127.0.0.1", true, "", true, "raw_buffer", false,
{}, false, "8.8.8.8", false, true);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

expect_source_type_match should be false here.

// TLS client without ALPN - no match.
auto filter_chain = findFilterChain(1234, true, "127.0.0.1", true, "", true, "tls", true, {});
auto filter_chain = findFilterChain(1234, true, "127.0.0.1", true, "", true, "tls", true, {},
false, "8.8.8.8", false, true);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

expect_source_type_match should be false here.

filter_chain =
findFilterChain(1234, true, "127.0.0.1", true, "server1.example.com", true, "tls", true, {});
filter_chain = findFilterChain(1234, true, "127.0.0.1", true, "server1.example.com", true, "tls",
true, {}, false, "127.0.0.1", false, true);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

expect_source_type_match should be false here.

@nickolaev
Copy link
Copy Markdown
Author

@PiotrSikora thanks, all fixed.

Copy link
Copy Markdown
Contributor

@PiotrSikora PiotrSikora left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks!

Pinging @envoyproxy/maintainers for review from someone can approve it for real.

Copy link
Copy Markdown
Member

@htuch htuch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great, thanks @nickolaev for the contribution and working with us to get it to meet the wider Envoy community needs. Also, thanks @PiotrSikora for the awesome guidance here and reviews.

@PiotrSikora
Copy link
Copy Markdown
Contributor

@htuch can we merge this or are we waiting for another review?

@htuch htuch merged commit 2c764e7 into envoyproxy:master Nov 27, 2018
htuch added a commit to htuch/envoy that referenced this pull request Nov 27, 2018
Merge of stale envoyproxy#4682 caused this.

Signed-off-by: Harvey Tuch <htuch@google.com>
@htuch htuch mentioned this pull request Nov 27, 2018
htuch added a commit that referenced this pull request Nov 27, 2018
Merge of stale #4682 caused this.

Signed-off-by: Harvey Tuch <htuch@google.com>
jrajahalme added a commit to jrajahalme/envoy that referenced this pull request Dec 20, 2018
Add test case for LocalConnection test where local and remote
addresses have the same IP address. Fix the test code to actually use
the intended local address. Fix the utility implementation to compare
the actual addresses rather than the pointers to them.

Fixes: envoyproxy#4682
Signed-off-by: Jarno Rajahalme <jarno@covalent.io>
jrajahalme added a commit to jrajahalme/envoy that referenced this pull request Dec 20, 2018
Address comparisons in Utility::isLocalConnection() should be
performed on the IP addresses, ignoring any port numbers.  Skip the
temporary Instance in the comparison loop.  Add test cases with port
numbers.

Fixes: envoyproxy#4682
Signed-off-by: Jarno Rajahalme <jarno@covalent.io>
fredlas pushed a commit to fredlas/envoy that referenced this pull request Mar 5, 2019
…yproxy#4535 (envoyproxy#4682)

Add filter chain match for source_type. Possible options are

ANY (default)
LOCAL
EXTERNAL

This allows for explicitly specifying local connectivity detection, which is needed in specific use cases.

Risk Level: Low
Docs Changes: Inline proto comments

Related to envoyproxy#4535.

Signed-off-by: Nikolay Nikolaev <nnikolay@vmware.com>
Signed-off-by: Fred Douglas <fredlas@google.com>
fredlas pushed a commit to fredlas/envoy that referenced this pull request Mar 5, 2019
Merge of stale envoyproxy#4682 caused this.

Signed-off-by: Harvey Tuch <htuch@google.com>
Signed-off-by: Fred Douglas <fredlas@google.com>
@nickolaev nickolaev deleted the fix_4535 branch March 22, 2021 08:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants