Skip to content

Plumb header validator into HTTP/1 server codec#21974

Merged
yanavlasov merged 11 commits intoenvoyproxy:mainfrom
yanavlasov:uhv-hcm-plumbing
Jul 28, 2022
Merged

Plumb header validator into HTTP/1 server codec#21974
yanavlasov merged 11 commits intoenvoyproxy:mainfrom
yanavlasov:uhv-hcm-plumbing

Conversation

@yanavlasov
Copy link
Copy Markdown
Contributor

Additional Description:
This PR plumbs header validator into HTTP/1 codec. The code is disabled in normal builds and is behind the ENVOY_ENABLE_UHV flag, which is tested during compile_time_options build only.

H/1 client and H/2 codecs are to be done in a follow up PR.

Risk Level: Low, no changes for normal Envoy builds.
Testing: Unit Tests
Docs Changes: N/A
Release Notes: N/A
Platform Specific Features: N/A

Signed-off-by: Yan Avlasov yavlasov@google.com

Signed-off-by: Yan Avlasov <yavlasov@google.com>
Signed-off-by: Yan Avlasov <yavlasov@google.com>
Signed-off-by: Yan Avlasov <yavlasov@google.com>
mattklein123
mattklein123 previously approved these changes Jul 1, 2022
Copy link
Copy Markdown
Member

@mattklein123 mattklein123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks. Very excited to see this work progressing.

@yanavlasov
Copy link
Copy Markdown
Contributor Author

/retest

@repokitteh-read-only
Copy link
Copy Markdown

Retrying Azure Pipelines:
Retried failed jobs in: envoy-presubmit

🐱

Caused by: a #21974 (comment) was created by @yanavlasov.

see: more, trace.

ggreenway
ggreenway previously approved these changes Jul 5, 2022
Copy link
Copy Markdown
Member

@ggreenway ggreenway left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Copy Markdown
Contributor

@alyssawilk alyssawilk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Exciting to see this underway!
/wait

const absl::string_view InvalidMethod = "http1.invalid_method";
const absl::string_view InvalidStatus = "http1.invalid_status";
const absl::string_view InvalidHeader = "http1.invalid_header";
const absl::string_view InvalidHeaders = "http1.invalid_headers";
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is there value in disambiguating 1 from 1+?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed after refactor

@@ -507,6 +515,9 @@ Status ConnectionImpl::completeLastHeader() {
ASSERT(dispatching_);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

optional as it's not your code, but consider renaming completeLastHeader as I read it as completing the last header, not completing the most recent header to be processed

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will do in a separate PR

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in PR #22408

// CONNECT "urls" are actually host:port so look like absolute URLs to the above checks.
// Absolute URLS in CONNECT requests will be rejected below by the URL class validation.
if (!codec_settings_.allow_absolute_url_ && !is_connect) {
RETURN_IF_ERROR(checkHeaderEntryAndSendError(path, active_request_->request_url_,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

don't we need to know if the request is CONNECT to validate if the "path" is legal in this case?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Refactored

// request-target. A proxy that forwards such a request MUST generate a
// new Host field-value based on the received request-target rather than
// forward the received Host field-value.
RETURN_IF_ERROR(checkHeaderEntryAndSendError(Http::HeaderString(header_values.Host),
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can imagine these errors being pretty hard to debug without trace logs - I think previously the headers would land in the access logs, so you could debug what the invalid host was, where now we're protecting access logs and response filters from bad header data. I think the loss of debugability is probably worse though so I'd be inclined to add the header and then check validity so we can debug or come up with some other workaround

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Refactored

result = codecProtocolError("http/1.1 protocol error: invalid header map");
std::function<void(ResponseHeaderMap & headers)> modify_headers;
if (validation_result == Http::HeaderValidator::RequestHeaderMapValidationResult::Redirect &&
!Grpc::Common::hasGrpcContentType(header_map)) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also I think before we mostly did checks of each header after processing all, so if we had a gRPC request and a bad header before gRPC content type, response would follow gRPC path, and now we'll send a standard error reply for headers we would have done "in envoy" (same path for prior http_parser errors). I don't think it's deal breaker but we should consider documenting this wherever we document differences between the old and the new.

It's also a lot more work to do this in all three codecs. Given we're probably going to have to have a "check these headers for errors" function for headers transformed by WASM or ext_authz and such, I wonder if we should simply do a one time pass on all headers after we read them and before the HCM processes them, rather than doing all this invasive plumbing into client and server codecs for HTTP/1, HTTP/2 and HTTP/3. We're going to have to do a pass against all headers once we have all headers to make sure we catch things like 'no content length and transfer encoding' so I wonder if we should just do character validation there and skip all the work along the way. Thoughts?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have moved header validation into the HCM and moved it out of the codec. I think if we need to do checks for individual headers as they are being parsed, I can plumb it back into codecs. It is however performance optimization only, that allows header parsing to be stopped early once bad header is found.

Signed-off-by: Yan Avlasov <yavlasov@google.com>
@yanavlasov yanavlasov dismissed stale reviews from ggreenway and mattklein123 via 24982fd July 25, 2022 20:15
Signed-off-by: Yan Avlasov <yavlasov@google.com>
Signed-off-by: Yan Avlasov <yavlasov@google.com>
Signed-off-by: Yan Avlasov <yavlasov@google.com>
Copy link
Copy Markdown
Contributor

@alyssawilk alyssawilk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, I think this will be much simpler.
/wait

auto response_details_opt = filter_manager_.streamInfo().responseCodeDetails();

sendLocalReply(response_code, "", modify_headers, grpc_status,
response_details_opt
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO details should not be optional, and we should ENVOY_BUG it's not empty.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry missed this but it still looks optional. We could add details to the function signature and ENVOY_BUG that it's not empty?

request_header_timer_.reset();
}

if (!validateHeaders()) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

optional: move logging up
ENVOY_STREAM_LOG(debug, "request headers complete
so we get "request headers complete" then "sending local reply"

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

request_header_timer_.reset();
}

if (!validateHeaders()) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

one of the gains of moving this to the HCM is making sure we handle connection close properly for say HTTP/1.0 header only requests. I encourage best-effort checks of the shouldCloseconnection before the validation and an integration test which future-proofs since apparently nothing failed.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You right, I missed that. Added tests

}

if (!validateHeaders()) {
ENVOY_STREAM_LOG(debug, "request headers validation failed:\n{}", *this, *request_headers_);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

encouraged follow-up (TODO?) we could have a default header validator, and move most of Envoy's current HCM checks there since they should be redundant with UHV
if (!request_headers_->Host())
if ((!HeaderUtility::isConnect(*request_headers_) || request_headers_->Path()) &&
...

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is on my TODO list and PR with these changes will be out soon.

return *tracing_custom_tags_;
}

bool validateHeaders();
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

comment, especially as today it doesn't always validate

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

[[maybe_unused]] Server::Configuration::FactoryContext& context) {

Http::HeaderValidatorFactoryPtr header_validator_factory;
#ifdef ENVOY_ENABLE_UHV
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is only ifdef'd out until we have runtime controls of underlying header validation correct?

Should we throw an exception if it's configured on but not supported due to compile time options?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

return proxy_status_config_.get();
}
Http::HeaderValidatorPtr makeHeaderValidator(Http::Protocol, StreamInfo::StreamInfo&) override {
// TODO(yanavlasov): admin interface should use the default validator
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should this ENVOY_BUG? seems like it's pretty unsafe to do admin requests until this is fixed

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Had to undo ENVOY_BUG as it crashed all integration tests with UHV enabled. This item is on my internal OKR tracker and will not be dropped.

Signed-off-by: Yan Avlasov <yavlasov@google.com>
Signed-off-by: Yan Avlasov <yavlasov@google.com>
Copy link
Copy Markdown
Contributor

@alyssawilk alyssawilk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM modulo one last nit and CI failure
/wait
/retest

auto response_details_opt = filter_manager_.streamInfo().responseCodeDetails();

sendLocalReply(response_code, "", modify_headers, grpc_status,
response_details_opt
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry missed this but it still looks optional. We could add details to the function signature and ENVOY_BUG that it's not empty?

@repokitteh-read-only
Copy link
Copy Markdown

Retrying Azure Pipelines:
Retried failed jobs in: envoy-presubmit

🐱

Caused by: a #21974 (review) was submitted by @alyssawilk.

see: more, trace.

alyssawilk
alyssawilk previously approved these changes Jul 27, 2022
Copy link
Copy Markdown
Contributor

@alyssawilk alyssawilk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

making that last comment optional :-)

Signed-off-by: Yan Avlasov <yavlasov@google.com>
Signed-off-by: Yan Avlasov <yavlasov@google.com>
@yanavlasov
Copy link
Copy Markdown
Contributor Author

Changed the return type of header validator to be a tuple of action and details string and enforced non empty string at construction.

using Action = ActionType;

// Helper for constructing successful results
static Result success() { return Result(ActionType::Accept, absl::string_view()); }
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs a TODO for removal since it encourages empty string views, no?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We only need details if header map was invalid, i.e. when result was either reject or redirect. If header map passed, the details are empty, which is what this method is used for.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ugh sorry yes, my bad.

@yanavlasov yanavlasov merged commit e2c2a77 into envoyproxy:main Jul 28, 2022
@yanavlasov yanavlasov deleted the uhv-hcm-plumbing branch July 28, 2022 14:44
phlax pushed a commit to phlax/envoy that referenced this pull request Jun 26, 2023
Signed-off-by: Yan Avlasov <yavlasov@google.com>

Signed-off-by: yanavlasov <yavlasov@google.com>

Signed-off-by: Ryan Northey <ryan@synca.io>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants