Plumb header validator into HTTP/1 server codec#21974
Plumb header validator into HTTP/1 server codec#21974yanavlasov merged 11 commits intoenvoyproxy:mainfrom
Conversation
Signed-off-by: Yan Avlasov <yavlasov@google.com>
Signed-off-by: Yan Avlasov <yavlasov@google.com>
Signed-off-by: Yan Avlasov <yavlasov@google.com>
mattklein123
left a comment
There was a problem hiding this comment.
LGTM, thanks. Very excited to see this work progressing.
|
/retest |
|
Retrying Azure Pipelines: |
alyssawilk
left a comment
There was a problem hiding this comment.
Exciting to see this underway!
/wait
| const absl::string_view InvalidMethod = "http1.invalid_method"; | ||
| const absl::string_view InvalidStatus = "http1.invalid_status"; | ||
| const absl::string_view InvalidHeader = "http1.invalid_header"; | ||
| const absl::string_view InvalidHeaders = "http1.invalid_headers"; |
There was a problem hiding this comment.
is there value in disambiguating 1 from 1+?
There was a problem hiding this comment.
Removed after refactor
| @@ -507,6 +515,9 @@ Status ConnectionImpl::completeLastHeader() { | |||
| ASSERT(dispatching_); | |||
There was a problem hiding this comment.
optional as it's not your code, but consider renaming completeLastHeader as I read it as completing the last header, not completing the most recent header to be processed
There was a problem hiding this comment.
Will do in a separate PR
| // CONNECT "urls" are actually host:port so look like absolute URLs to the above checks. | ||
| // Absolute URLS in CONNECT requests will be rejected below by the URL class validation. | ||
| if (!codec_settings_.allow_absolute_url_ && !is_connect) { | ||
| RETURN_IF_ERROR(checkHeaderEntryAndSendError(path, active_request_->request_url_, |
There was a problem hiding this comment.
don't we need to know if the request is CONNECT to validate if the "path" is legal in this case?
| // request-target. A proxy that forwards such a request MUST generate a | ||
| // new Host field-value based on the received request-target rather than | ||
| // forward the received Host field-value. | ||
| RETURN_IF_ERROR(checkHeaderEntryAndSendError(Http::HeaderString(header_values.Host), |
There was a problem hiding this comment.
I can imagine these errors being pretty hard to debug without trace logs - I think previously the headers would land in the access logs, so you could debug what the invalid host was, where now we're protecting access logs and response filters from bad header data. I think the loss of debugability is probably worse though so I'd be inclined to add the header and then check validity so we can debug or come up with some other workaround
| result = codecProtocolError("http/1.1 protocol error: invalid header map"); | ||
| std::function<void(ResponseHeaderMap & headers)> modify_headers; | ||
| if (validation_result == Http::HeaderValidator::RequestHeaderMapValidationResult::Redirect && | ||
| !Grpc::Common::hasGrpcContentType(header_map)) { |
There was a problem hiding this comment.
Also I think before we mostly did checks of each header after processing all, so if we had a gRPC request and a bad header before gRPC content type, response would follow gRPC path, and now we'll send a standard error reply for headers we would have done "in envoy" (same path for prior http_parser errors). I don't think it's deal breaker but we should consider documenting this wherever we document differences between the old and the new.
It's also a lot more work to do this in all three codecs. Given we're probably going to have to have a "check these headers for errors" function for headers transformed by WASM or ext_authz and such, I wonder if we should simply do a one time pass on all headers after we read them and before the HCM processes them, rather than doing all this invasive plumbing into client and server codecs for HTTP/1, HTTP/2 and HTTP/3. We're going to have to do a pass against all headers once we have all headers to make sure we catch things like 'no content length and transfer encoding' so I wonder if we should just do character validation there and skip all the work along the way. Thoughts?
There was a problem hiding this comment.
I have moved header validation into the HCM and moved it out of the codec. I think if we need to do checks for individual headers as they are being parsed, I can plumb it back into codecs. It is however performance optimization only, that allows header parsing to be stopped early once bad header is found.
Signed-off-by: Yan Avlasov <yavlasov@google.com>
Signed-off-by: Yan Avlasov <yavlasov@google.com>
Signed-off-by: Yan Avlasov <yavlasov@google.com>
Signed-off-by: Yan Avlasov <yavlasov@google.com>
alyssawilk
left a comment
There was a problem hiding this comment.
Nice, I think this will be much simpler.
/wait
| auto response_details_opt = filter_manager_.streamInfo().responseCodeDetails(); | ||
|
|
||
| sendLocalReply(response_code, "", modify_headers, grpc_status, | ||
| response_details_opt |
There was a problem hiding this comment.
IMO details should not be optional, and we should ENVOY_BUG it's not empty.
There was a problem hiding this comment.
sorry missed this but it still looks optional. We could add details to the function signature and ENVOY_BUG that it's not empty?
| request_header_timer_.reset(); | ||
| } | ||
|
|
||
| if (!validateHeaders()) { |
There was a problem hiding this comment.
optional: move logging up
ENVOY_STREAM_LOG(debug, "request headers complete
so we get "request headers complete" then "sending local reply"
| request_header_timer_.reset(); | ||
| } | ||
|
|
||
| if (!validateHeaders()) { |
There was a problem hiding this comment.
one of the gains of moving this to the HCM is making sure we handle connection close properly for say HTTP/1.0 header only requests. I encourage best-effort checks of the shouldCloseconnection before the validation and an integration test which future-proofs since apparently nothing failed.
There was a problem hiding this comment.
You right, I missed that. Added tests
| } | ||
|
|
||
| if (!validateHeaders()) { | ||
| ENVOY_STREAM_LOG(debug, "request headers validation failed:\n{}", *this, *request_headers_); |
There was a problem hiding this comment.
encouraged follow-up (TODO?) we could have a default header validator, and move most of Envoy's current HCM checks there since they should be redundant with UHV
if (!request_headers_->Host())
if ((!HeaderUtility::isConnect(*request_headers_) || request_headers_->Path()) &&
...
There was a problem hiding this comment.
This is on my TODO list and PR with these changes will be out soon.
| return *tracing_custom_tags_; | ||
| } | ||
|
|
||
| bool validateHeaders(); |
There was a problem hiding this comment.
comment, especially as today it doesn't always validate
| [[maybe_unused]] Server::Configuration::FactoryContext& context) { | ||
|
|
||
| Http::HeaderValidatorFactoryPtr header_validator_factory; | ||
| #ifdef ENVOY_ENABLE_UHV |
There was a problem hiding this comment.
this is only ifdef'd out until we have runtime controls of underlying header validation correct?
Should we throw an exception if it's configured on but not supported due to compile time options?
| return proxy_status_config_.get(); | ||
| } | ||
| Http::HeaderValidatorPtr makeHeaderValidator(Http::Protocol, StreamInfo::StreamInfo&) override { | ||
| // TODO(yanavlasov): admin interface should use the default validator |
There was a problem hiding this comment.
should this ENVOY_BUG? seems like it's pretty unsafe to do admin requests until this is fixed
There was a problem hiding this comment.
Had to undo ENVOY_BUG as it crashed all integration tests with UHV enabled. This item is on my internal OKR tracker and will not be dropped.
Signed-off-by: Yan Avlasov <yavlasov@google.com>
Signed-off-by: Yan Avlasov <yavlasov@google.com>
alyssawilk
left a comment
There was a problem hiding this comment.
LGTM modulo one last nit and CI failure
/wait
/retest
| auto response_details_opt = filter_manager_.streamInfo().responseCodeDetails(); | ||
|
|
||
| sendLocalReply(response_code, "", modify_headers, grpc_status, | ||
| response_details_opt |
There was a problem hiding this comment.
sorry missed this but it still looks optional. We could add details to the function signature and ENVOY_BUG that it's not empty?
|
Retrying Azure Pipelines: |
alyssawilk
left a comment
There was a problem hiding this comment.
making that last comment optional :-)
Signed-off-by: Yan Avlasov <yavlasov@google.com>
Signed-off-by: Yan Avlasov <yavlasov@google.com>
|
Changed the return type of header validator to be a tuple of action and details string and enforced non empty string at construction. |
| using Action = ActionType; | ||
|
|
||
| // Helper for constructing successful results | ||
| static Result success() { return Result(ActionType::Accept, absl::string_view()); } |
There was a problem hiding this comment.
This needs a TODO for removal since it encourages empty string views, no?
There was a problem hiding this comment.
We only need details if header map was invalid, i.e. when result was either reject or redirect. If header map passed, the details are empty, which is what this method is used for.
Signed-off-by: Yan Avlasov <yavlasov@google.com> Signed-off-by: yanavlasov <yavlasov@google.com> Signed-off-by: Ryan Northey <ryan@synca.io>
Additional Description:
This PR plumbs header validator into HTTP/1 codec. The code is disabled in normal builds and is behind the ENVOY_ENABLE_UHV flag, which is tested during compile_time_options build only.
H/1 client and H/2 codecs are to be done in a follow up PR.
Risk Level: Low, no changes for normal Envoy builds.
Testing: Unit Tests
Docs Changes: N/A
Release Notes: N/A
Platform Specific Features: N/A
Signed-off-by: Yan Avlasov yavlasov@google.com