Skip to content

xff: add support for configuring a list of trusted CIDRs#31831

Merged
wbpcode merged 51 commits intoenvoyproxy:mainfrom
jamesog:xff_trusted_cidrs
Sep 9, 2024
Merged

xff: add support for configuring a list of trusted CIDRs#31831
wbpcode merged 51 commits intoenvoyproxy:mainfrom
jamesog:xff_trusted_cidrs

Conversation

@jamesog
Copy link
Copy Markdown
Contributor

@jamesog jamesog commented Jan 15, 2024

Commit Message: xff: add support for configuring a list of trusted CIDRs

The original client IP address can be determined from the x-forwarded-for header either by a fixed number of trusted hops, or by evaluating the client IP address against a list of trusted addresses.

This adds support for configuring a list of CIDRs in the xff original IP detection extension. The remote IP address is evaluated against these, and optionally recurses through XFF to find the last non-trusted address.

Additional Description:
This feature is generally used by people with a CDN in front of their edge proxy to ensure that XFF is only parsed when the remote connection comes from a CDN server.

The behaviour of the new functionality should be the same as Nginx's realip module.

Disclaimer: This is my first time writing C++ so I'm not certain my changes are completely idiomatic, but I've tried to stick with existing style in the codebase. Feedback very welcome!

Risk Level: Medium
Testing: Unit tests, manual tests
Docs Changes: Updates to HTTP Connection Manager header manipulation docs, and proto docs.
Release Notes: Added to changelogs/current.yaml
Platform Specific Features: None
Fixes #21639
Relates to #31296

X-Forwarded-For parsing can either trust a number of hops, as Envoy does
today, or it can trust by IP address.

This enables configuration to add a set of CIDRs to the XFF config, and
a recurse option.

When recurse is not enabled, HCM should trust a remote IP that is
covered by one of the configured CIDRs and use the last IP in XFF as the
remote IP.

When recurse is enabled, if the remote IP is covered by one of the
CIDRs, HCM should also walk backwards through XFF to find the first IP
not covered by the trusted CIDR list.

Signed-off-by: James O'Gorman <james@netinertia.co.uk>
remoteAddressIsTrustedProxy checks the remote IP against a list of
CIDRs. If it is covered by any CIDR the remote is considered a trusted
proxy.

getLastNonTrustedAddressFromXFF walks backwards through XFF to find the
first IP that is not covered by a list of trusted CIDRs. This IP is
considered the "true" remote IP.

Signed-off-by: James O'Gorman <james@netinertia.co.uk>
If a list of trusted CIDRs is set, rather than checking for a set number
of trusted hops, instead check the remote IP against the trusted CIDR
list.

If recurse is set, XFF is also evaluated to find the last IP in the list
that is not covered by the trusted CIDRs.

Signed-off-by: James O'Gorman <james@netinertia.co.uk>
Signed-off-by: James O'Gorman <james@netinertia.co.uk>
Signed-off-by: James O'Gorman <james@netinertia.co.uk>
@repokitteh-read-only
Copy link
Copy Markdown

Hi @jamesog, welcome and thank you for your contribution.

We will try to review your Pull Request as quickly as possible.

In the meantime, please take a look at the contribution guidelines if you have not done so already.

🐱

Caused by: #31831 was opened by jamesog.

see: more, trace.

@repokitteh-read-only
Copy link
Copy Markdown

CC @envoyproxy/api-shepherds: Your approval is needed for changes made to (api/envoy/|docs/root/api-docs/).
envoyproxy/api-shepherds assignee is @markdroth
CC @envoyproxy/api-watchers: FYI only for changes made to (api/envoy/|docs/root/api-docs/).

🐱

Caused by: #31831 was opened by jamesog.

see: more, trace.

Signed-off-by: James O'Gorman <james@netinertia.co.uk>
Signed-off-by: James O'Gorman <james@netinertia.co.uk>
Override xff_num_trusted_hops in the constructor instead of validating
in the proto message when xff_trusted_cidrs is set.

Signed-off-by: James O'Gorman <james@netinertia.co.uk>
Signed-off-by: James O'Gorman <james@netinertia.co.uk>
Signed-off-by: James O'Gorman <james@netinertia.co.uk>
Signed-off-by: James O'Gorman <james@netinertia.co.uk>
Signed-off-by: James O'Gorman <james@netinertia.co.uk>
Signed-off-by: James O'Gorman <james@netinertia.co.uk>
@markdroth
Copy link
Copy Markdown
Contributor

/lgtm api

Copy link
Copy Markdown
Contributor

@alyssawilk alyssawilk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So sorry for the review delay. Thanks for the well documented and tested first pass - my thoughts below.
/wait

// is not specified. See the documentation for
// :ref:`config_http_conn_man_headers_x-forwarded-for` for more information.
//
// If ``xff_trusted_cidrs`` is configured, this field is not used.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we disallow it being set and reject config if it is?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I originally had this change as using a oneof encapsulating the original xff_num_trusted_hops and the new xff_trusted_cidrs fields, but Mark told me that would be a breaking API change and to do it this way instead. Do you know if there's another way to handle rejecting it without it being a breaking change?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we want to disallow if both the old and new fields are set, since that makes it hard for control planes to start using this new feature.

If we reject when both fields are set, then a control plane must know whether any given client supports the new field. If it sets only the new field, old clients will not work properly, and if it sets both, new clients will break. So the only way to transition from the old field to the new field is to either (a) wait for all clients to be upgraded before starting to use the new field or (b) have some way for the control plane to know which client supports the new field and which clients support the old field so that it can send the right config to each one -- and doing that in a way that does not break cacheability requires some dances with dynamic parameter support, which means that the clients have to be configured to tell the control plane whether or not they support the new field.

In contrast, if we allow both fields to be set and specify the precedence order, then this becomes much easier. Control planes can simply set both fields. Old clients will use the old field, and new clients will prefer the new field.

I think the latter approach is much cleaner and should be preferred any time we add this kind of new field in xDS.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How do you feel about the change I made since Alyssa's feedback? d976354

She suggested it should either log or reject; I chose to reject by throwing an error with a message that clearly informs that only one field can be set.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What I'm saying is that I don't think that's the behavior we want. I think the way you had this originally was the right thing to do.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In that case I'm getting conflicting advice from two different maintainers :-)

@alyssawilk perhaps you can help decide which way I should take this based on Mark's feedback?

//
// If ``recurse`` is set the :ref:`config_http_conn_man_headers_x-forwarded-for`
// header is also evaluated against the trusted CIDR list, and the last non-trusted
// address is used as the original client address.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ooc when would we not want to recurse?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a good question! I added some commentary to the docs in 1f76584.

}
return {addr, true};
}
return {nullptr, false};
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if every address in the XFF is trusted don't we want to return the last address in the list?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one made me think! Originally I was assuming this might be some kind of undefined behaviour and so it was safer to return null, but after talking about this with a colleague, and looking over what other proxy servers do, I implemented your suggestion in a016a6c. Thanks for questioning my logic :-)

When parsing XFF using a list of trusted CIDRs, if all IP in XFF
match the trusted CIDRs (there are no untrusted IPs), return the first
IP in XFF (the last evaluated) - and a test to confirm the behaviour.

Signed-off-by: James O'Gorman <james@netinertia.co.uk>
Signed-off-by: James O'Gorman <james@netinertia.co.uk>
Signed-off-by: James O'Gorman <james@netinertia.co.uk>
@wbpcode
Copy link
Copy Markdown
Member

wbpcode commented Aug 21, 2024

/wait

Signed-off-by: James O'Gorman <james@netinertia.co.uk>
Signed-off-by: James O'Gorman <james@netinertia.co.uk>
@jamesog
Copy link
Copy Markdown
Contributor Author

jamesog commented Aug 23, 2024

/retest

@wbpcode
Copy link
Copy Markdown
Member

wbpcode commented Aug 31, 2024

Sorry for the delay. Some times I may ignore the notification from tools. It's OK to ping me directly on slack.

@wbpcode
Copy link
Copy Markdown
Member

wbpcode commented Aug 31, 2024

/wait

The API should provide a consistent default value, and we cannot set
this to false without causing a breaking change. Instead we have a
warning in the API docs recommending to disable this option.

Signed-off-by: James O'Gorman <james@netinertia.co.uk>
Signed-off-by: James O'Gorman <james@netinertia.co.uk>
This reverts c69df1c and changes to use
throwEnvoyExceptionOrPanic, which will cause a config error in Envoy,
and a panic in Envoy Mobile.

Signed-off-by: James O'Gorman <james@netinertia.co.uk>
Signed-off-by: James O'Gorman <james@netinertia.co.uk>
Signed-off-by: James O'Gorman <james@netinertia.co.uk>
Copy link
Copy Markdown
Member

@wbpcode wbpcode left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the great contribution. Only last comment, I think.

Comment on lines +760 to +761
bool Utility::remoteAddressIsTrustedProxy(
const Envoy::Network::Address::InstanceConstSharedPtr remote,
Copy link
Copy Markdown
Member

@wbpcode wbpcode Sep 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See, basically we only use the smart point as input parameter when we wants to change the ownership. But in this case, if we ensure this remote always be valid, then const Envoy::Network::Address::Instance& should be used.

If we can ensure this remote always be valid, then OptRef<const Envoy::Network::Address::Instance> should be used.

I think there should be the first case.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/wait

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I understood you correctly. This is getting beyond my knowledge here so I wasn't quite sure. Please see the latest commit that now uses const Network::Address:Instance&.

Signed-off-by: James O'Gorman <james@netinertia.co.uk>
Copy link
Copy Markdown
Member

@wbpcode wbpcode left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Configurable filter of trusted proxies to handle x-forwarded-for header

8 participants