proposal: net/http: normalize comma-separated headers #62471

@CAFxX

Consider the following example:

	// normally the headers arrive from untrusted sources, this is just for exposition
	h := http.Header{}
	h.Add("Accept", "bar, baz")
	h.Add("Accept", "foo")

	// Intuitively this should print: []string{"bar", "baz", "foo"}
	// instead of:                    []string{"bar, baz", "foo"}
	fmt.Printf("%#v\n", h.Values("Accept"))

I am not sure we can, or want to, modify the behaviour of Values, but I would argue we should still offer a way to get a "normalized" list of values: anyone who actually receives a request with those headers will almost certainly want to perform that normalization. This is especially important because skipping it (which the current functions nudge callers towards) may lead to header-confusion issues.

Just to define "normalization", I am thinking of something like this:

var values []string
for _, s := range h.Values(key) {
	for _, v := range strings.Split(s, ",") {
		// strip optional whitespace (space and horizontal tab) around each element
		values = append(values, strings.Trim(v, " \t"))
	}
}

that, AFAICT, is compliant with RFC 7230.

Assuming we don't want to change the behavior of the existing functions, this could be exposed (the name is a strawman) as

func (h Header) ValuesParsed(k string) (values []string) {
	for _, s := range h.Values(k) {
		for _, v := range strings.Split(s, ",") {
			values = append(values, strings.Trim(v, " \t"))
		}
	}
	return
}

(a proper implementation will likely attempt to minimize overhead, so possibly something closer to this example)

As a side note, it's a bit unfortunate that Values is already taken, as RFC 7230 uses the notation #(values) explicitly to mean a comma-separated list of values (that is semantically equivalent to having those values, in order, spread across multiple headers with the same name).

A sender MUST NOT generate multiple header fields with the same field
name in a message unless either the entire field value for that
header field is defined as a comma-separated list [i.e., #(values)]
or the header field is a well-known exception (as noted below).

A recipient MAY combine multiple header fields with the same field
name into one "field-name: field-value" pair, without changing the
semantics of the message, by appending each subsequent field value to
the combined field value in order, separated by a comma. The order
in which header fields with the same field name are received is
therefore significant to the interpretation of the combined field
value; a proxy MUST NOT change the order of these field values when
forwarding a message.
