Skip to content

Add a proto definition for Access Logs.#58

Merged
htuch merged 5 commits intoenvoyproxy:stagingfrom
dnoe:accesslog
May 25, 2017
Merged

Add a proto definition for Access Logs.#58
htuch merged 5 commits intoenvoyproxy:stagingfrom
dnoe:accesslog

Conversation

@dnoe
Copy link
Copy Markdown
Contributor

@dnoe dnoe commented May 19, 2017

This commit proposes a proto definition for access logs. It's still a rough
cut and documentation needs to be improved, please don't merge yet. Comments
welcome.

This commit proposes a proto definition for access logs.  It's still a rough
cut and documentation needs to be improved, please don't merge yet.  Comments
welcome.
@kyessenov
Copy link
Copy Markdown
Contributor

cc @douglas-reid since Istio Mixer has some proto defined for the access log

@kyessenov
Copy link
Copy Markdown
Contributor

What would be the guideline to extend the access log with additional request metadata? I'm thinking of a filter that identifies API method from Swagger and assigns operation_id to each HTTP request.

Copy link
Copy Markdown
Member

@htuch htuch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great in general.

// Original Path (X-Envoy-Original-Path)
string original_path = 24;

// Opaque metadata associated with this access log entry.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd change the wording to say something like "Metadata in the envoy.http_connection_manager.access_log" filter namespace.

//
// From the X-Envoy-Upstream-Service-Time response header, this is the amount it took
// the upstream server to service the request.
google.protobuf.UInt64Value upstream_service_time = 23;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

google.protobuf.Duration

google.protobuf.Int64Value request_hdrs_size = 10;

// Size of the HTTP response headers in bytes
google.protobuf.Int64Value response_hdrs_size = 11;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're generally using unsigned in most places for values that can't be negative.

Copy link
Copy Markdown
Contributor

@kyessenov kyessenov May 19, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Curious: why do we encourage unsigned ints in protos? In Istio, @wora explicitly banned unsigned ints from Istio config (istio/api#28)

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know the details, but IIRC there is some perf benefit of doing signed. I don't know what the benefit is. IMO unless it's extreme it's much better to use unsigned for things that can't be negative.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd need to be convinced of the performance benefits. I don't buy the arguments in istio/api#28 that people can't be trusted to use signed/unsigned properly.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These shouldn't be negative so I can change it. Use of signed for this was based on an internal Google proto, where to my understanding unsigned has been discouraged in the past for reasons I don't fully grok.

// processing a request.
enum ResponseFlag {
// Local server healthcheck failed.
failed_local_healthCheck = 0;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Upper case enum constants for consistency with other enum values.

string request_method = 1;

// The time we started servicing this request
google.protobuf.UInt64Value start_time = 2;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a point time version of google.protobuf.Duration we can use here?

string user_agent = 15;

// Host header as sent by client HTTP
string host = 16;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: would probably use authority here

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah. I see you have authority below. Do we need both? In practice, Envoy translates Host -> :authority, so they will never be different.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a separate authority field. OK to combine?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes I would combine into just authority.

google.protobuf.Struct metadata = 25;

// Headers requested for logging but not covered by the specific fields above.
// TODO: The value should be repeated string, but it doesn't work.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would just do a message that contains a name/value pair, then repeat that.

@htuch
Copy link
Copy Markdown
Member

htuch commented May 19, 2017

@kyessenov I think with the new metadata structure that's built up by listener/filter chain/router, the metadata struct will be passed to the filter. Initially I was thinking this would be just as additional input (a form of late configuration), but it could also be modified, with new key/values added for example.

@kyessenov
Copy link
Copy Markdown
Contributor

I think we should not conflate the config metadata with the request path metadata. They are fundamentally not the same thing, and sharing the two is going to be confusing. I'm not sure how easy it would be to simply extend the proto with new fields, or even if it is better than adding an uninterpreted struct field to the proto for filters to populate.

@htuch
Copy link
Copy Markdown
Member

htuch commented May 19, 2017

Forget I said "late configuration". The idea is we have some metadata associated with the request. This can come from multiple sources, from metadata we pick up via route selection, or from the filters themselves. We want to log both. I'm open to the consensus from folks on whether we treat the metadata we discover via listener/filter chain/route selection in a distinct bucket to the metadata produced by the filters themselves.

@kyessenov
Copy link
Copy Markdown
Contributor

I think you're raising an important point there about being able to audit the route selected for a request. Maybe we need an issue for that? IMHO it's quite difficult to debug why proxy makes particular routing choices from the access log right now.

@htuch
Copy link
Copy Markdown
Member

htuch commented May 19, 2017

@kyessenov I've filed https://github.com/lyft/envoy-api/issues/59 to track. We can continue the conversation there. For the purpose of this review, I think we can just have the metadata be from the route selection as that's what's in the API today, we can extend later as we solidify the design in #59. Thanks for the suggestion.

Copy link
Copy Markdown
Member

@htuch htuch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM modulo comments.


message EnvoyAccessLog {
// The request method, ie, "GET", "HEAD", "POST"
string request_method = 1;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would an enum be useful here?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

enum sounds good, BUT, there is an internal proto which uses string for request_method. Let's discuss offline.

api/base.proto Outdated

// Should the value be appended? If false (default), the value overrides
// existing values [V2-API-DIFF].
google.protobuf.BoolValue append = 3;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Eek, I forgot I made that change recently. Also, the above one about -/empty was also a recent one. Might be best not share the definition in light of this.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can refactor this to share the definition of a HeaderValue consisting only of Key+Value pair, then create another message (HeaderValueConfig?) that include the HeaderValue and adds the append boolean.

Copy link
Copy Markdown
Contributor

@douglas-reid douglas-reid left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kyessenov thanks for the heads up.

This attempts to specify access logs at a much different level than the generic log descriptor Istio currently is employing. This is closer to what I might expect to be in actual aspect rule for a subject at the moment in Istio.

It'd be interesting to discuss whether these fields could be auto-packaged as attributes and sent across for access log generation on the Mixer side, as well as whether or not Istio should support an envoy access logs aspect directly (and expose a ReportAccessLog-like API).

BTW, I couldn't help myself with some nits on the actual content. Mostly they are just about including units in field names. Hope they are helpful.

string destination_host = 7;

// Size of the HTTP request body in bytes
google.protobuf.UInt64Value request_body_size = 8;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: request_body_bytes

google.protobuf.UInt64Value request_body_size = 8;

// Size of the HTTP response body in bytes
google.protobuf.UInt64Value response_body_size = 9;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: response_body_bytes

google.protobuf.UInt64Value response_body_size = 9;

// Size of the HTTP request headers in bytes
google.protobuf.UInt64Value request_hdrs_size = 10;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: request_hdrs_bytes

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fun fact: you can't actually call this request_headers_size, because of some collision with proto generated code around the request_headers map (protoc doesn't give an error, but generates noncompilable code). If it's called request_headers_bytes, no problem! So making this change also allows the slightly less awkward hdrs -> headers change - thanks.

google.protobuf.UInt64Value request_hdrs_size = 10;

// Size of the HTTP response headers in bytes
google.protobuf.UInt64Value response_hdrs_size = 11;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: response_hdrs_bytes

// User agent as sent by client HTTP
string user_agent = 15;

// Path
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

might be worth adding an example or something here that does more than repeat the field name.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Once we're happy with the fields I'm planning to go back and expand the documentation for each before this gets merged.

//
// The total duration it took to service this request from the StartTime until
// the response was written to the user.
google.protobuf.Duration duration = 22;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: i'd be tempted to call this something more descriptive, like: response_duration.

//
// From the X-Envoy-Upstream-Service-Time response header, this is the amount it took
// the upstream server to service the request.
google.protobuf.Duration upstream_service_time = 23;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: upstream_service_duration (as this is a duration, not a time).

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comes directly from the X-ENVOY-UPSTREAM-SERVICE-TIME response header so that's why "time" was used. I agree that the word duration makes more sense - it's definitely a duration - so I think I will make this change.


// The remote (upstream) host, as an IPv4 or IPv6 string.
// IPv6 addresses should be stored in canonical (compressed) format.
string upstream_host = 5;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

worth appending "ip" to name? upstream_host_ip might be clearer.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will mirror UPSTREAM_HOST as described in the Envoy doc (https://lyft.github.io/envoy/docs/configuration/http_conn_man/access_log.html) which includes a port. The doc comment needs updating. In light of this I think "ip" is not desirable since it also includes port.


// This field is the IP and port on which the request from the user was
// received, stored in ipv4:port or [ipv6]:port format.
string destination_host = 7;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

worth appending "ip" to name? destination_host_ip might be clearer.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This also will include the port so I'd rather not call it _ip.

string request_method = 1;

// The time we started servicing this request
google.protobuf.Timestamp start_time = 2;
Copy link
Copy Markdown
Contributor

@douglas-reid douglas-reid May 20, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i know that duration is included later, but is it worth it to have end_time or equivalent as well for completeness?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the end time can be easily calculated by adding the response duration to the start time I don't think I'd make it a separate field.

@htuch
Copy link
Copy Markdown
Member

htuch commented May 22, 2017

@douglas-reid Do you have some TL;DR pointers on what this Istio aspect logging stuff is?

@htuch
Copy link
Copy Markdown
Member

htuch commented May 22, 2017

@dnoe I just realized that due to the way in which approvals for merge work, while @mattklein123 is on vacation we're going to need to switch to using the non-master branch for ongoing work. Once you're ready to merge, you can switch the merge base in the UI to staging. Thanks.

@douglas-reid
Copy link
Copy Markdown
Contributor

@htuch Sure. A lot of it is in flux at the moment, but here's my short(ish) summary of the current state:

  • In Istio-land, Envoy sends a set of "Report" attributes to Istio Mixer for every request
  • Based on operator-supplied configuration, Mixer turns those attributes into data structures for processing by (pluggable) backends (say, sending logs data to aws cloudwatch, for example)
  • The supported data structure types are, essentially, the aspects.
  • There is an access logs aspect currently defined in Mixer than consists of 3 basic pieces:
    1. Name for the log stream
    2. A template to use to apply to set of attributes to generate a text string (ex: apache common logging template)
    3. A set of labels (map from string to string) to generate from attribute values (for use in structured logging)

IIUC, this envoy access log could be configured (on the Istio side) by:

  1. ensuring that the full set of desired data elements were passed as attributes to Mixer
  2. creating Mixer config that builds a labels map that contains the full set of desired attributes
  3. creating the matching template string (if that is desired)

With that, Mixer could generate the access log on every request and send it to any number of configurable backends that support the access logs aspect.

We will be refining the Mixer config story (in particular around auto-generating logs) in the next few weeks. It would be cool to support the envoy logging bits directly. I'm happy to discuss further at any point.

@kyessenov
Copy link
Copy Markdown
Contributor

@htuch there is a lot of complexity in the mixer story, but here is the simple proxy point of view:

  1. we have a filter that is essentially an "access log" filter that hooks into the same request path flow.
  2. this filter creates a list of "attribute=value" pairs where attributes correspond to the request metadata; these pairs are sent to the central mixer server.
  3. mixer then creates a global access log and pipes into backend metrics/logging systems

Update field names for clarity and add more documentation
Refactor HeaderValue to separate append boolean
Use enum for request method
@dnoe dnoe changed the base branch from master to staging May 22, 2017 18:06
@dnoe
Copy link
Copy Markdown
Contributor Author

dnoe commented May 22, 2017

Updating base for this PR to staging per logistics conversation with @htuch

@mandarjog
Copy link
Copy Markdown
Contributor

This part is perhaps covered in the metadata bit. However specifically for the Mixer filter, it is important to record

  1. Check operation response.
  2. Quota / rate limit operation response

The accesslog should support struct or Any fields keyed by the producing filter.

@htuch
Copy link
Copy Markdown
Member

htuch commented May 24, 2017

@mandarjog @douglas-reid @kyessenov #59 is now tracking the filter produced metadata. We can add support to the access log proto there. In the interest of merging this PR, I think it's good to go without it as we haven't fully fleshed out the design for #59 yet.

Based on what @douglas-reid and @kyessenov say, I think the proto in this PR is sufficiently expressive for the Mixer, as long as all fields are populated. This will be under the control of https://github.com/lyft/envoy-api/blob/master/api/filter/http_connection_manager.proto#L114 when it gets added (@dpn do you have this on your radar?). The Mixer is always free to destruct the proto into an unstructured dictionary when it receives it (by converting to JSON and back for example) and treat the log as a some keyable data source of attributes that can be combined with the aspect rule. Does that sound like it will work?

Thanks for the input, this is useful in guiding the use with Istio.

@dnoe
Copy link
Copy Markdown
Contributor Author

dnoe commented May 24, 2017

I was planning to initialize the existing format string into an object that fills out the requested fields from the string into the proto. I could also create a completely proto representation of the format fields if you think it'd be preferable but I don't think it's necessary.

@dnoe
Copy link
Copy Markdown
Contributor Author

dnoe commented May 24, 2017

Looking at http_connection_manager.proto I do want to add some fields into the AccessLog proto for the new TLS fields mentioned, I'll push that shortly.

@douglas-reid
Copy link
Copy Markdown
Contributor

@htuch that sounds reasonable to me. thanks.

These are new access log format fields from
http_connection_manager.proto new config API.
@htuch htuch merged commit fff2b30 into envoyproxy:staging May 25, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants