Skip to content

Add tracking of response body-size and header-size#265

Merged
mum4k merged 81 commits intoenvoyproxy:masterfrom
oschaaf:response-size-tracking
Jan 21, 2020
Merged

Add tracking of response body-size and header-size#265
mum4k merged 81 commits intoenvoyproxy:masterfrom
oschaaf:response-size-tracking

Conversation

@oschaaf
Copy link
Copy Markdown
Member

@oschaaf oschaaf commented Jan 9, 2020

PR summary:

  • Adds tracking of response body-size and header- size using the more lightweight StreamingStatistic.
  • Wires these through to all output formats. (Including the Fortio output, which is driving this effort).
  • Adds tracking if min/max in our statistics, and wires that through to our output.
  • Some small refactoring / enhancements in our statistics implementations
  • Fix edge case in StreamingStatistic::combine
  • Some new test and test enhancments, including a refresher of gold files.

The new statistics will show in the CLI as:

Response body size in bytes (20 samples)
min: 847 | mean: 847 | max: 847 | pstdev: 0

Response header size in bytes (20 samples)
min: 47282 | mean: 47308.3 | max: 47372 | pstdev: 20.6186

Assuming we don't need full blown histograms for this, this fixes #271
Note: I observed that the fortio seems to be tracking different things for size/headersize in fortio, depending on if the fast http client is used or not. As this seems like a bug, I didn't try to reproduce that specific behavior in NH.

Signed-off-by: Otto van der Schaaf oschaaf@we-amp.com


Waiting on:

Prerequisites to merging this

  • Fortio output: add missing fields & correct some of the output fields #226 to go in
  • Address @mum4k his comments from Fortio output: add missing fields & correct some of the output fields #226 (comment) here
  • Discuss fields not used in Fortio visualization and istio/tools, but that actually are emitted in Fortio's json output:
    • SocketCount (only actually set for one of the two http clients of I remember correctly)
    • AbortOn
    • Exactly (-> target fixed amount of queries/replies, possibly could translate to NH termination
      predicates, but not shown in visualization, not used in istio/tools)
  • Fortio's RetCodes
  • Discuss that Fortios Sizes and HeaderSizes in its json output are used in istio/tools and missing in NH's output. However, they are only used to compute an error ratio (slightly different from how Fortio does that itself, and only precisely status 200 seems to count as succes), as well as compute an average response payload size for output in the csv transform.
    In this PR we add simple tracking for the sizes (min/mean/max/stdev), and not full blown histogram tracking of sizes. It is worth nothing that Fortio seems to report different size values depending:

[1]

  • With -stdclient, Sizes reports response body size and 0-valued Header Sizes.
  "Sizes": {
    ...
    "Avg": 14,
  },
  "HeaderSizes": {
    ...
    "Avg": 0,
  },
  • Without -stdclient (which I think implies using the diy fastclient), Sizes reports body+header size for Sizes and body size for HeaderSizes. It's worth noting that it seems like Fortio may internally switch to stdclient when using https.
  "Sizes": {
    ...
    "Avg": 185,
  },
  "HeaderSizes": {
    ...
    "Avg": 171,
  }

- Define NH version in the code in a canonical place.
- Update the fortio output formatter - add missing version field
- Add RELEASE_PROCEDURE.md
- Consume the new version information in various places to make
  --version  output of executables sensible.

Signed-off-by: Otto van der Schaaf <oschaaf@we-amp.com>
Signed-off-by: Otto van der Schaaf <oschaaf@we-amp.com>
Signed-off-by: Otto van der Schaaf <oschaaf@we-amp.com>
Signed-off-by: Otto van der Schaaf <oschaaf@we-amp.com>
Signed-off-by: Otto van der Schaaf <oschaaf@we-amp.com>
Signed-off-by: Otto van der Schaaf <oschaaf@we-amp.com>
Signed-off-by: Otto van der Schaaf <oschaaf@we-amp.com>
Signed-off-by: Otto van der Schaaf <oschaaf@we-amp.com>
Signed-off-by: Otto van der Schaaf <oschaaf@we-amp.com>
Signed-off-by: Otto van der Schaaf <oschaaf@we-amp.com>
Signed-off-by: Otto van der Schaaf <oschaaf@we-amp.com>
Signed-off-by: Otto van der Schaaf <oschaaf@we-amp.com>
Signed-off-by: Otto van der Schaaf <oschaaf@we-amp.com>
Signed-off-by: Otto van der Schaaf <oschaaf@we-amp.com>
Signed-off-by: Otto van der Schaaf <oschaaf@we-amp.com>
Signed-off-by: Otto van der Schaaf <oschaaf@we-amp.com>
Signed-off-by: Otto van der Schaaf <oschaaf@we-amp.com>
Signed-off-by: Otto van der Schaaf <oschaaf@we-amp.com>
Signed-off-by: Otto van der Schaaf <oschaaf@we-amp.com>
Signed-off-by: Otto van der Schaaf <oschaaf@we-amp.com>
Signed-off-by: Otto van der Schaaf <oschaaf@we-amp.com>
Signed-off-by: Otto van der Schaaf <oschaaf@we-amp.com>
Signed-off-by: Otto van der Schaaf <oschaaf@we-amp.com>
Signed-off-by: Otto van der Schaaf <oschaaf@we-amp.com>
Signed-off-by: Otto van der Schaaf <oschaaf@we-amp.com>
Signed-off-by: Otto van der Schaaf <oschaaf@we-amp.com>
Signed-off-by: Otto van der Schaaf <oschaaf@we-amp.com>
Signed-off-by: Otto van der Schaaf <oschaaf@we-amp.com>
Signed-off-by: Otto van der Schaaf <oschaaf@we-amp.com>
Signed-off-by: Otto van der Schaaf <oschaaf@we-amp.com>
@oschaaf oschaaf added waiting-for-review A PR waiting for a review. and removed waiting-for-changes A PR waiting for comments to be resolved and changes to be applied. labels Jan 14, 2020
Signed-off-by: Otto van der Schaaf <oschaaf@we-amp.com>
Signed-off-by: Otto van der Schaaf <oschaaf@we-amp.com>
…nse-size-tracking

Signed-off-by: Otto van der Schaaf <oschaaf@we-amp.com>
@mum4k
Copy link
Copy Markdown
Collaborator

mum4k commented Jan 15, 2020

Please mark this again as waiting-for-review once the merge conflicts are resolved.

@mum4k mum4k added waiting-for-changes A PR waiting for comments to be resolved and changes to be applied. and removed waiting-for-review A PR waiting for a review. blocked A PR that is blocked by prerequisites. labels Jan 15, 2020
…cking

Signed-off-by: Otto van der Schaaf <oschaaf@we-amp.com>
@oschaaf oschaaf added waiting-for-review A PR waiting for a review. and removed waiting-for-changes A PR waiting for comments to be resolved and changes to be applied. labels Jan 15, 2020
Signed-off-by: Otto van der Schaaf <oschaaf@we-amp.com>
@mum4k
Copy link
Copy Markdown
Collaborator

mum4k commented Jan 15, 2020

@eric846 please review and assign back to me when ready. Looks to be a lot of lines, but most of them are golden files.

…cking

Signed-off-by: Otto van der Schaaf <oschaaf@we-amp.com>
…cking

Signed-off-by: Otto van der Schaaf <oschaaf@we-amp.com>
…cking

Signed-off-by: Otto van der Schaaf <oschaaf@we-amp.com>
@oschaaf oschaaf requested a review from jplevyak January 20, 2020 22:49
eric846
eric846 previously approved these changes Jan 21, 2020
Copy link
Copy Markdown
Contributor

@eric846 eric846 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

I would consider all my comments optional.

result->set_name(name.data(), name.size());
for (auto& statistic : statistics) {
*(result->add_statistics()) = statistic->toProto();
// TODO(#XXX): Looking at if the statistic id ends with "_size" to determine how it should be
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Insert issue number here

const auto& b = dynamic_cast<const SimpleStatistic&>(statistic);
auto combined = std::make_unique<SimpleStatistic>();

combined->min_ = a.min() > b.min() ? b.min() : a.min();
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any reason not to use std::min() and std::max()?

// StreamingStatistic for the stats that track response sizes. Ideally we would have options
// for this to route the right stat to the right backend (HdrStatistic, SimpleStatistic,
// NullStatistic).
// TODO(#XXX): Create options and have the StatisticFactory consider those when instantiating
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Insert issue number here

if (domain == Statistic::SerializationDomain::DURATION) {
int64_t nanos;
nanos = count() == 0 ? 0 : static_cast<int64_t>(std::round(mean()));
statistic.mutable_mean()->set_seconds(nanos / 1000000000);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we change 1000000000 to (1000 * 1000 * 1000) to avoid accidents?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also since this pattern with /1000000000 and %1000000000, maybe there should be a helper method for setting a proto duration, taking mutable Duration* and int64_t nanos

uint64_t HdrStatistic::min() const {
return count() == 0 ? UINT64_MAX : hdr_value_at_percentile(histogram_, 0);
}
uint64_t HdrStatistic::max() const { return hdr_value_at_percentile(histogram_, 100); }
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will max() blow up if count()==0?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should be ok with respect to that (see TypedStatisticTest::Empty)

return streaming_stats_->resistsCatastrophicCancellation();
}
uint64_t significantDigits() const override { return streaming_stats_->significantDigits(); }
StatisticPtr createNewInstance() const override { return std::make_unique<InMemoryStatistic>(); };
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 on the clever elimination of the factory.

When I saw this being called, I assumed createNewInstance() would create a copy of the existing object -- the most obvious explanation for why it would be an instance method on an existing object. What about createNewInstanceOfSameType() to clarify the meaning?

Signed-off-by: Otto van der Schaaf <oschaaf@we-amp.com>
@oschaaf
Copy link
Copy Markdown
Member Author

oschaaf commented Jan 21, 2020

@eric846 thanks for the review, comments addressed in 9aa9e98

@mum4k mum4k self-assigned this Jan 21, 2020
@mum4k mum4k merged commit 8626464 into envoyproxy:master Jan 21, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

P0 Highest priority waiting-for-review A PR waiting for a review.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add Fortio Sizes and HeaderSizes field

4 participants