Skip to content

http1: allow configuring format of encoded http1 response headers#8499

Merged
snowp merged 25 commits intoenvoyproxy:masterfrom
snowp:h1-casing
Oct 24, 2019
Merged

http1: allow configuring format of encoded http1 response headers#8499
snowp merged 25 commits intoenvoyproxy:masterfrom
snowp:h1-casing

Conversation

@snowp
Copy link
Copy Markdown
Contributor

@snowp snowp commented Oct 4, 2019

Adds a configuration option that will convert all header keys into
Train-Case. This is useful to allow Envoy to respond with headers
that match the casing of other systems, to ensure that the wire format
of responses is unchanged when migrating to Envoy.

Signed-off-by: Snow Pettersen snowp@squareup.com

Risk Level: Mediun
Testing: UTs
Docs Changes: Proto comments
Release Notes: n/a
Fixes #8463

Adds a configuration option that will convert all header keys into
Train-Case. This is useful to allow Envoy to respond with headers
that match the casing of other systems, to ensure that the wire format
of responses is unchanged when migrating to Envoy.

Fixes envoyproxy#8463

Signed-off-by: Snow Pettersen <snowp@squareup.com>
@repokitteh-read-only
Copy link
Copy Markdown

CC @envoyproxy/api-shepherds: Your approval is needed for changes made to api/.

🐱

Caused by: #8499 was opened by snowp.

see: more, trace.

@snowp snowp changed the title WIP http1: allow configuring format of encoded response headers WIP http1: allow configuring format of encoded http1 response headers Oct 5, 2019
Signed-off-by: Snow Pettersen <snowp@squareup.com>
@snowp
Copy link
Copy Markdown
Contributor Author

snowp commented Oct 5, 2019

Mainly looking for API review at this point: is this too specific? Should we be aiming for something more flexible like arbitrary regex transformations with the ability to uppercase matches?

For example, this could be represented by a series of regex matches: { match: /^[a-z]/, uppercase: true } + { match: /[^0-9a-zA-Z][a-z]/, uppercase: true }. This would come at some perf cost and code complexity but would make this much more flexible.

Open to ideas for how to best express this so that it can be useful for as many people as possible.

@htuch @alyssawilk

Copy link
Copy Markdown
Member

@mattklein123 mattklein123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just flushing out a drive by API comment. Nice to see this being worked on!

string default_host_for_http_10 = 3;

message HeaderKeyFormat {
enum HeaderFormat {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

enums are pretty fragile for stuff like this. I would recommend just making this a oneof with a required value and a single TrainCase message that is empty. In the future we can easily add configuration to the TrainCase message or add more complicated message types.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's probably a good idea.

@htuch
Copy link
Copy Markdown
Member

htuch commented Oct 7, 2019

@snowp would pass-thru of client casing work to support this use case? @jmarantz has done some research on this and it seems doable.

@snowp
Copy link
Copy Markdown
Contributor Author

snowp commented Oct 7, 2019

Pass-through would work in certain scenarios (h/1.1 all the way), but I had some concerns about supporting intermediate h2 hops, as we tend to use h2 wherever possible. Seems like @alyssawilk shared some of those concerns based on the conversation in #8463.

Signed-off-by: Snow Pettersen <snowp@squareup.com>
string default_host_for_http_10 = 3;

message HeaderKeyFormat {
enum HeaderFormat {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's probably a good idea.

// Formats the header using Train-Case: the first character and any character following a
// special character will be capitalized if it's an alpha character. For example,
// "content-type" becomes "Content-Type", and "foo$b#$ar" becomes "Foo$B#$Ar".
TRAIN_CASE = 1;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is "train case" a standard term? I haven't heard it before.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i hadn't heard about it either before it was used in the issue that was filed. Googling around I can find some references to it (https://docs.rs/Inflector/0.7.0/inflector/cases/traincase/index.html, https://medium.com/infancyit/letter-case-special-styles-7fa369bf173d), though other sources have it mean something else (https://en.wikipedia.org/wiki/Letter_case#Special_case_styles says TRAIN-CASE, https://www.fluentcpp.com/2019/01/08/restmycase-a-c-library-for-formatting-string-cases/ says train_Case).

Open to naming suggestions here, given that this doesn't seem to be a very common term

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One option might be rfc7230_casing since we're basically trying to present the headers the way they are cased in the HTTP/1.1 RFCs.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that sounds good to me. @PiotrSikora any concerns about taking the name of the RFC in vain?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"TrainCase" is called PascalCase (aka Upper Camel Case).

We should also have the DONT_MODIFY case (probably a good default, and in theory it's backwards compatible, since HTTP headers are case-insensitive), because not all standard headers use PascalCase (e.g. Content-MD5 and TE), and use transcoding to PascalCase or lowercase only when transforming headers between HTTP/1.1 and HTTP/2.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nevermind, PascalCase is without hyphens.

Copy link
Copy Markdown
Contributor

@PiotrSikora PiotrSikora Oct 8, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@htuch using RFC number here isn't appropriate, IMHO, because that would imply that the casing in defined in that RFC, whereas that's not the case (as far as I can tell).

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about propercase_words? then we define a word as any sequence of alpha characters


// Describes how the keys for response headers should be formatted. By default, all header keys
// are lower cased.
HeaderKeyFormat response_header_key_format = 4;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this only for response? What about request headers forwarded to the upstream?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i can add it for the request path too, but that seems to be a bit more involved since the http1 codec settings aren't currently being passed through to the connection pool/upstream http/1.1 codec (since all the current options apply only to the HCM).

How about i add the proto def /w not-implemented and leave that as a TODO?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it makes sense to have it just be header_format, and keep it hidden until we have it implemented on both sides, because being allowd to set response header format where the settings are used for requests is weird.
WDYT?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah i like that, will go with that

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Arguably though clients and backends might have different case senstivity concerns. E.g. one wants Rfc7230-Casing, the other wants 1337Spe4k-k451nG.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, but we have codec options in both places.
AFIK when there's protocol options on an HCM, this will affect how we serialize the response headers and when we set the protocol options in CDS it'll affect how we serialize request headers. Since we're not doing enforcement of request headers at the listener or response headers in the connection pool, I don't know what it would mean to have response headers in the connection pool Http1 options config.

I suggest we just have one name, and note it affects how we serialize the headers out at the codec.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I think the fact that the configuration for request and response headers wouldn't occur in the same location (HCM vs Cluster) makes having it be a single field make more sense than two fields with only one ever being relevant nicer

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, a single name makes sense given the multiplicity of codec options on downstream/upstream.

@htuch
Copy link
Copy Markdown
Member

htuch commented Oct 8, 2019

@snowp ack, let's continue with explicit capitalization support then.

Snow Pettersen added 6 commits October 10, 2019 17:25
Signed-off-by: Snow Pettersen <snowp@squareup.com>
Signed-off-by: Snow Pettersen <snowp@squareup.com>
Signed-off-by: Snow Pettersen <snowp@squareup.com>
Signed-off-by: Snow Pettersen <snowp@squareup.com>
Signed-off-by: Snow Pettersen <snowp@squareup.com>
Signed-off-by: Snow Pettersen <snowp@squareup.com>
@snowp snowp changed the title WIP http1: allow configuring format of encoded http1 response headers http1: allow configuring format of encoded http1 response headers Oct 11, 2019
@snowp snowp marked this pull request as ready for review October 11, 2019 15:40
@snowp
Copy link
Copy Markdown
Contributor Author

snowp commented Oct 11, 2019

This should be ready for review for whoever wants to pick this up

Copy link
Copy Markdown
Contributor

@alyssawilk alyssawilk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very cool! First pass thoughts


static_cast<StreamEncoderImpl*>(context)->encodeHeader(key_to_use,
header.value().getStringView());
if (header_key_formatter_ != nullptr) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd lean towards an Encode helper function which took the context, cormatter (which could be null) and string view header key

namespace Envoy {
namespace Http {
namespace Http1 {
std::string ProperCaseHeaderKeyFormatter::format(absl::string_view key) const {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at https://en.wikipedia.org/wiki/List_of_HTTP_header_fields

This won't do A-IM, TE Content-MD5, HTTP2-Setting, and about 10 others the way an HTTP/1 user might expect.

I'd lean towards saying we do standard capitalization of standard HTTP 1.1 headers that Envoy knows about (e.g. TE) with a link to headers Envoy knows about. ProperCaseExceptForRegisteredStandardizedHeaderExceptions is too wordy, but I think that's what we want?

This way if someone later on desperately needs P3P cased they awy they want, they can just add it to the list of Envoy known headers. And when we eventually add the Envoy extension mechanism for arbitrary headers we keep talking about, folks can do their own custom builds with TheirWhackyCAPs if they want.

Alternately if we want to go for the simple case, I think we should document there are known standard headers which will still look unusual to HTTP/1 consumers, and someone can implement ProperCaseByDefaultButWithOverrides later now we have an API for things.

@mattklein123 WDYT?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe StandardOrProperCase? I can go either way on whether we want to handle well known headers, it sounds a tad bit more in line with what someone might expect but it might be overkill.

Looking at Go's net/http (https://golang.org/src/net/http/header.go#L214) they do the same kind of formatting this PR does when canonicalizing headers except that they only uppercase letters after a hyphen.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have a strong opinion here as long as it is well documented. The way that we have things set up at the API layer we can always make things configurable later if needed.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A soft suggestion of having StandardOrProperCase behavior; and perhaps condensing the config name down to StandardCase (which would follow well-known casings for well-known headers, and standard casing patterns for H/1 otherwise).

Welford's
Werror
XDS
ar
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we change foo$b#$ar to foo$b#$are for all the various explantions of that transform to avoid this addition? Feels like it'd be a common typo of "are" which we should catch.

Copy link
Copy Markdown
Member

@mattklein123 mattklein123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool stuff, a few drive by comments.

bool saw_content_length = false;
headers.iterate(
[](const HeaderEntry& header, void* context) -> HeaderMap::Iterate {
[this](const HeaderEntry& header, void* context) -> HeaderMap::Iterate {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, this function used a C-style callback originally for performance reasons when I was optimizing for the horse race case. I would leave a comment here about looking at perf, or potentially just pre-create the lambda at construction time so that in the keep alive case we don't need any extra allocations.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think it's worth having multiple versions of iterate, one with C-style callback and one with std::function? afaik std::function adds some overhead compared to C-style callbacks due to the additional indirection

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Turns out I was able to do this without changing the signature, since I could just use the encoder that was already being passed as the context to resolve the formatter.

namespace Envoy {
namespace Http {
namespace Http1 {
std::string ProperCaseHeaderKeyFormatter::format(absl::string_view key) const {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have a strong opinion here as long as it is well documented. The way that we have things set up at the API layer we can always make things configurable later if needed.

oneof header_format {
// Formats the header by proper casing words: the first character and any character following
// a special character will be capitalized if it's an alpha character. For example,
// "content-type" becomes "Content-Type", and "foo$b#$ar" becomes "Foo$B#$Ar".
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@envoyproxy/api-shepherds API LGTM. I think if there are nuances around details with how this is done, e.g. whether it's - only or any punctuation, we can sort them out with ProperCaseWords fields later.

Snow Pettersen added 4 commits October 18, 2019 13:15
Signed-off-by: Snow Pettersen <snowp@squareup.com>
Signed-off-by: Snow Pettersen <snowp@squareup.com>
Signed-off-by: Snow Pettersen <snowp@squareup.com>
Signed-off-by: Snow Pettersen <snowp@squareup.com>
Snow Pettersen added 3 commits October 21, 2019 14:55
Signed-off-by: Snow Pettersen <snowp@squareup.com>
Signed-off-by: Snow Pettersen <snowp@squareup.com>
Signed-off-by: Snow Pettersen <snowp@squareup.com>
Snow Pettersen added 3 commits October 21, 2019 15:29
Signed-off-by: Snow Pettersen <snowp@squareup.com>
Signed-off-by: Snow Pettersen <snowp@squareup.com>
Signed-off-by: Snow Pettersen <snowp@squareup.com>
Copy link
Copy Markdown
Contributor

@alyssawilk alyssawilk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good! I'd be inclined to bump risk to medium as there are now some extra refactors, but your call.

// Formats the header by proper casing words: the first character and any character following
// a special character will be capitalized if it's an alpha character. For example,
// "content-type" becomes "Content-Type", and "foo$b#$are" becomes "Foo$B#$Are".
ProperCaseWords proper_case_words = 1;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Optional: Do you think it's worth calling out both that this is an effort to normalize HTTP/1 headers to what users expect, and that this will not behave "as expected" for all headers, e.g. TE? I can imagine that folks who have HTTP/1 casing issues will think this is the solution to their problem and I think documenting the simplicity vs the non-handled corner cases will probably save someone some trouble.

Headers::get().TransferEncoding.get().size(),
Headers::get().TransferEncodingValues.Chunked.c_str(),
Headers::get().TransferEncodingValues.Chunked.size());
encodeFormattedHeader(Headers::get().TransferEncoding.get().c_str(),
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we skip the c_str() here and in other places?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Snow Pettersen added 2 commits October 22, 2019 09:27
Signed-off-by: Snow Pettersen <snowp@squareup.com>
Signed-off-by: Snow Pettersen <snowp@squareup.com>
alyssawilk
alyssawilk previously approved these changes Oct 22, 2019
@snowp
Copy link
Copy Markdown
Contributor Author

snowp commented Oct 22, 2019

/azp run

@azure-pipelines
Copy link
Copy Markdown

Azure Pipelines successfully started running 3 pipeline(s).

@mattklein123
Copy link
Copy Markdown
Member

Please merge master to fix coverage.

/wait

Signed-off-by: Snow Pettersen <snowp@squareup.com>
Copy link
Copy Markdown
Member

@mattklein123 mattklein123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, LGTM with some small comments.

/wait

runHighWatermarkCallbacks();
}

encode_header_cb_ = [](const HeaderEntry& header, void* context) -> HeaderMap::Iterate {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry I think we can kill this member variable and revert this part of the change now, right?

bool is_response_to_head_request_{false};
bool is_content_length_allowed_{true};
HeaderMap::ConstIterateCb encode_header_cb_;
HeaderKeyFormatter* header_key_formatter_;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: can this be const? (Both the pointer and the object)

std::string ProperCaseHeaderKeyFormatter::format(absl::string_view key) const {
auto copy = std::string(key);

bool shouldCapitalize = true;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bump

Signed-off-by: Snow Pettersen <snowp@squareup.com>
Copy link
Copy Markdown
Member

@mattklein123 mattklein123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

@snowp snowp merged commit 7846427 into envoyproxy:master Oct 24, 2019
derekargueta pushed a commit to derekargueta/envoy that referenced this pull request Oct 24, 2019
…voyproxy#8499)

Adds a configuration option that will convert all header keys into
Proper-Case. This is useful to allow Envoy to respond with headers
that match the casing of other systems, to ensure that the wire format
of responses is unchanged when migrating to Envoy.

Fixes envoyproxy#8463

Signed-off-by: Snow Pettersen <snowp@squareup.com>
@sunnoy
Copy link
Copy Markdown

sunnoy commented Mar 15, 2021

So how to configure the custom header format, for example, I want to get this format header:

Sec-WebSocket-Accept

not 

Sec-Websocket-Accept

@snowp

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

In HTTP/1.1, envoy should pass-through header capitalization unmodified (without lower-casing)

7 participants