Conversation
|
Big 👍. Though I think separating this from GIT_CURL_VERBOSE would be a good idea.
This could be a common log format (I'm partial to grohl of course). Something simple that the Git LFS tool can produce for something else to dump into splunk, statsd, spark, etc. |
|
Yeah that's what I was getting at in that first checkbox, it gets drowned out in the I like the common log format idea, dumping right to that would probably save a bit of memory while tracing, and some more request metadata could be added in. |
|
I've done a bunch of clean up on this and I think it does enough for now to continue other benchmarking work. With the latest changes, if you set |
There was a problem hiding this comment.
It may not be necessary to dump the url, since you can get it from the response object with res.Request.URL.
Maybe it makes too little difference, but you could build the transfer object outside of the lock.
t := &transfer{requestStats: reqstats, responseStats: resstats, url: req.URL.String()}
transfersLock.Lock()
transfers[res] = t
transfersLock.Unlock()There was a problem hiding this comment.
Yeah, we should build the object outside of the lock. Not a big deal, but feels better. I'm not sure what you mean by "dump the url", though.
At present, when the GIT_CURL_VERBOSE environment variable is set to a value such as "1" or "true", Git LFS client outputs some of the details of each HTTP request and response it makes to its standard error stream. We gather the HTTP request headers to be logged in the traceRequest() method of the Client structure from our "lfshttp" package. To gather the headers, the traceRequest() method uses the DumpRequest() function from the from the "net/http/httputil" package in the Go standard library. Our use of the DumpRequest() function for this purpose dates back to commit 6fafade of PR git-lfs#366 in 2015, when it was first introduced into what was then the traceHttpRequest() function in our "lfs" package. This function gradually evolved into the current traceRequest() method through a series of PRs. First, portions of the "lfs" package were migrated into a new "httputil" package in PR git-lfs#1226. That package was later removed in PR git-lfs#1846 after a new "lfsapi" package was added in PR git-lfs#1794 with a version of the traceRequest() method very similar to the current one. Finally, in PR git-lfs#3244 we moved the method into the current "lfshttp" package. Throughout these changes, however, our use of the DumpRequest() function has remained the same. However, since the Git LFS client only acts as an HTTP client and not an HTTP server, we should instead be using the DumpRequestOut() function from the standard "httputil" package to gather the headers of an HTTP request for logging purposes. Replacing our use of the DumpRequest() function with the DumpRequestOut() function would allow us to output a number of request headers that we currently fail to log, as was noted with respect to the Accept-Encoding header in issue git-lfs#6193. Since Go version 1.6, the "httputil" package's documentation has stated that the DumpRequest() function "should only be used by servers to debug client requests", and that the DumpRequestOut() function "is like DumpRequest but for outgoing client requests": https://pkg.go.dev/net/http/httputil@go1.6#DumpRequest https://pkg.go.dev/net/http/httputil@go1.6#DumpRequestOut However, at the time we first adopted the use of the DumpRequest() function, the Go documentation did not include these directives, and merely noted that "DumpRequest returns the as-received wire representation of req, optionally including the request body, for debugging" and that "DumpRequestOut is like DumpRequest but includes headers that the standard http.Transport adds, such as User-Agent": https://pkg.go.dev/net/http/httputil@go1.4#DumpRequest https://pkg.go.dev/net/http/httputil@go1.4#DumpRequestOut It is reasonable to assume, therefore, that when PR git-lfs#366 was written its authors simply assumed that the DumpRequest() function was sufficient, given the state of the Go documentation at that time. As we would prefer to log a more complete set of client HTTP request headers, though, particular the Accept-Encoding header which the Go "net/http" package automatically adds to outgoing requests, we now update our traceRequest() method to use the DumpRequestOut() function. We then also update our Go language TestVerbose*() test functions to confirm that the Accept-Encoding request header is now captured and reported when verbose logging is enabled. Note that without the change in this PR to the use of the DumpRequestOut() function, these revised tests would fail, since the Accept-Encoding header would not appear in headers returned by the DumpRequest() function.
At present, when the GIT_CURL_VERBOSE environment variable is set to a value such as "1" or "true", the Git LFS client outputs some of the details of each HTTP request and response it makes to its standard error stream. We gather the HTTP request headers to be logged in the traceRequest() method of the Client structure from our "lfshttp" package. To gather the headers, the traceRequest() method uses the DumpRequest() function from the from the "net/http/httputil" package in the Go standard library. Our use of the DumpRequest() function for this purpose dates back to commit 6fafade of PR git-lfs#366 in 2015, when it was first introduced into what was then the traceHttpRequest() function in our "lfs" package. This function gradually evolved into the current traceRequest() method through a series of PRs. First, portions of the "lfs" package were migrated into a new "httputil" package in PR git-lfs#1226. That package was later removed in PR git-lfs#1846 after a new "lfsapi" package was added in PR git-lfs#1794 with a version of the traceRequest() method very similar to the current one. Finally, in PR git-lfs#3244 we moved the method into the current "lfshttp" package. Throughout these changes, however, our use of the DumpRequest() function has remained the same. However, since the Git LFS client only acts as an HTTP client and not an HTTP server, we should instead be using the DumpRequestOut() function from the standard "httputil" package to gather the headers of an HTTP request for logging purposes. Replacing our use of the DumpRequest() function with the DumpRequestOut() function would allow us to output a number of request headers that we currently fail to log, as was noted with respect to the Accept-Encoding header in issue git-lfs#6193. Since Go version 1.6, the "httputil" package's documentation has stated that the DumpRequest() function "should only be used by servers to debug client requests", and that the DumpRequestOut() function "is like DumpRequest but for outgoing client requests": https://pkg.go.dev/net/http/httputil@go1.6#DumpRequest https://pkg.go.dev/net/http/httputil@go1.6#DumpRequestOut However, at the time we first adopted the use of the DumpRequest() function, the Go documentation did not include these directives, and merely noted that "DumpRequest returns the as-received wire representation of req, optionally including the request body, for debugging" and that "DumpRequestOut is like DumpRequest but includes headers that the standard http.Transport adds, such as User-Agent": https://pkg.go.dev/net/http/httputil@go1.4#DumpRequest https://pkg.go.dev/net/http/httputil@go1.4#DumpRequestOut It is reasonable to assume, therefore, that when PR git-lfs#366 was written its authors simply assumed that the DumpRequest() function was sufficient, given the state of the Go documentation at that time. As we would prefer to log a more complete set of client HTTP request headers, though, particularly the Accept-Encoding header which the Go "net/http" package automatically adds to outgoing requests, we now update our traceRequest() method to use the DumpRequestOut() function. We then also update some of our Go language TestVerbose*() test functions to confirm that the Accept-Encoding request header is now captured and reported when verbose logging is enabled. Note that without the change in this PR to the use of the DumpRequestOut() function, these revised tests would fail, since the Accept-Encoding header would not appear in headers returned by the DumpRequest() function.
I was doing some measuring today and added some code that might be generally useful with respect to some analytics functionality previously discussed.
What's in this PR is a modification of the HTTP tracing code that keeps track of the sizes and timings of every http request that goes through the default client. Once you have a response you can link it to a key in a way similar to statsd (though far less capable, at this point), e.g.:
It tracks the size of both the header and body portions of the transfers, as well as the timing. The header vs body sizes can be useful for determining the point at which the HTTP overhead overtakes the file size (e.g. thousands of small files in LFS).
I'd love to be able to break down the timings into the various pieces of the http request, but that's a whole other can of worms best left for another day.
As it stands, when tracing with
GIT_CURL_VERBOSE=1, you'll get output like this:or
or
Some things to consider before this is a real thing:
Future work could further aggregate these, send them off somewhere, make more statsd-like outputs (percentiles, sums, histograms, etc).