Skip to content

Better error handling using absl::StatusOr#646

Merged
dubious90 merged 34 commits intoenvoyproxy:mainfrom
Razdeep:protobuf_safe_parsing
Mar 27, 2021
Merged

Better error handling using absl::StatusOr#646
dubious90 merged 34 commits intoenvoyproxy:mainfrom
Razdeep:protobuf_safe_parsing

Conversation

@Razdeep
Copy link
Copy Markdown
Contributor

@Razdeep Razdeep commented Mar 15, 2021

Fixes #625
Replacing getJsonStringFromMessageOrDie() by getJsonStringFromMessageOrError().

Signed-off-by: Rajdeep Roy Chowdhury rrajdeeproychowdhury@gmail.com

…Error

Signed-off-by: Rajdeep Roy Chowdhury <rrajdeeproychowdhury@gmail.com>
Copy link
Copy Markdown
Member

@oschaaf oschaaf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for looking into this; a small issue is that CI flags a formatting issue, that can be resolved via running ci/do_ci.sh fix_format.

But I'm also requesting some changes because while this builds I think it effectively doesn't change much, as we don't check the status_or to see if everything is ok before consuming it. Let me try to point out some examples that show the pattern of how to best leverage absl::status_or:

Generally, I think we need to propagate the return value of getJsonStringFromMessageOrError where we consume it, except at near or at the highest level of the program, where it can be handled. So probably the return value of formatProto() needs to change.

That looks like something along the lines of this:

absl::Status validation_status = config_factory.ValidateConfig(config.typed_config());

And then, as an example, here's a place where we consume formatProto() at the highest level:

std::cout << formatter->formatProto(output);

There, instead of optimistically assuming all is well and streaming out the result, we should check for message_or_error.ok(), and handle the case where there was things ar not ok. Because if there's an error we should exit with a non-zero status code, log what the status was, and not attempt to stream the status_or to the stdout.

Transitioning from std::string to absl::StatusOr for better error
handling

Signed-off-by: Rajdeep Roy Chowdhury <rrajdeeproychowdhury@gmail.com>
Copy link
Copy Markdown
Member

@oschaaf oschaaf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for pushing these changes, this looks great!

Now to be able to close #625 we need one more thing: the Fortio output formatter still throws exceptions in two [1] spots that I know of; instead of doing that it should also apply the absl::status_or pattern we introduced here.

[1]

throw NighthawkException("Nighthawk output was malformed, contains no 'global' results.");
and
throw NighthawkException("No results in output");

if (formatted_proto.ok()) {
std::cout << formatter->formatProto(output_collector.toProto());
} else {
ENVOY_LOG(error, "An error occured while formatting proto");
Copy link
Copy Markdown
Member

@oschaaf oschaaf Mar 16, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we force result to false here? if this code line get hit, that probably means we should indicate failure.

alternatively, we could consider propagating absl::status_or even further up (so we return that instead of the boolean we use return right now) if you'd be up for it, but feel free to not do this, I don't intend to bloat this PR.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What shall be the status code for [1]?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider setting result to false and fall trough, to have the if(!result) {... clause handle the logging/error handling. Minimizing the control logic helps keeping coverage up. If we want to specialise the message that gets logged we can declare/set that separately.

@oschaaf oschaaf added the waiting-for-changes A PR waiting for comments to be resolved and changes to be applied. label Mar 16, 2021
@Razdeep Razdeep changed the title migrating all getJsonStringFromMessageOrDie to getJsonStringFromMessageOrError Better error handling using absl::StatusOr Mar 16, 2021
Razdeep added 3 commits March 16, 2021 22:09
Signed-off-by: Rajdeep Roy Chowdhury <rrajdeeproychowdhury@gmail.com>
Signed-off-by: Rajdeep Roy Chowdhury <rrajdeeproychowdhury@gmail.com>
Signed-off-by: Rajdeep Roy Chowdhury <rrajdeeproychowdhury@gmail.com>
Copy link
Copy Markdown
Contributor

@dubious90 dubious90 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the contribution! Have a couple comments on returning strings.

Razdeep added 6 commits March 16, 2021 23:20
Signed-off-by: Rajdeep Roy Chowdhury <rrajdeeproychowdhury@gmail.com>
Signed-off-by: Rajdeep Roy Chowdhury <rrajdeeproychowdhury@gmail.com>
Signed-off-by: Rajdeep Roy Chowdhury <rrajdeeproychowdhury@gmail.com>
Signed-off-by: Rajdeep Roy Chowdhury <rrajdeeproychowdhury@gmail.com>
Signed-off-by: Rajdeep Roy Chowdhury <rrajdeeproychowdhury@gmail.com>
Signed-off-by: Rajdeep Roy Chowdhury <rrajdeeproychowdhury@gmail.com>
@Razdeep Razdeep changed the title Better error handling using absl::StatusOr WIP: Better error handling using absl::StatusOr Mar 17, 2021
Razdeep added 12 commits March 17, 2021 17:23
Signed-off-by: Rajdeep Roy Chowdhury <rrajdeeproychowdhury@gmail.com>
Signed-off-by: Rajdeep Roy Chowdhury <rrajdeeproychowdhury@gmail.com>
Signed-off-by: Rajdeep Roy Chowdhury <rrajdeeproychowdhury@gmail.com>
Signed-off-by: Rajdeep Roy Chowdhury <rrajdeeproychowdhury@gmail.com>
Signed-off-by: Rajdeep Roy Chowdhury <rrajdeeproychowdhury@gmail.com>
Signed-off-by: Rajdeep Roy Chowdhury <rrajdeeproychowdhury@gmail.com>
Signed-off-by: Rajdeep Roy Chowdhury <rrajdeeproychowdhury@gmail.com>
Signed-off-by: Rajdeep Roy Chowdhury <rrajdeeproychowdhury@gmail.com>
…esult

Signed-off-by: Rajdeep Roy Chowdhury <rrajdeeproychowdhury@gmail.com>
Signed-off-by: Rajdeep Roy Chowdhury <rrajdeeproychowdhury@gmail.com>
Some parts of the code is impossible/very hard to cover

Signed-off-by: Rajdeep Roy Chowdhury <rrajdeeproychowdhury@gmail.com>
@Razdeep Razdeep changed the title WIP: Better error handling using absl::StatusOr Better error handling using absl::StatusOr Mar 20, 2021
@Razdeep
Copy link
Copy Markdown
Contributor Author

Razdeep commented Mar 20, 2021

I cannot find a way to intensionally fail JsonFormatter::formatProto, YamlFormatter::formatProto, DottedFormatter::formatProto. Thus the test-coverage is reduced.

@oschaaf
Copy link
Copy Markdown
Member

oschaaf commented Mar 20, 2021

/retest

@repokitteh-read-only
Copy link
Copy Markdown

🔨 rebuilding ci/circleci: test_gcc (failed build)

🐱

Caused by: a #646 (comment) was created by @oschaaf.

see: more, trace.

Copy link
Copy Markdown
Member

@oschaaf oschaaf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great stuff, thanks so much for doing this. Some last comments.

ci/do_ci.sh Outdated
function do_unit_test_coverage() {
export TEST_TARGETS="//test/... -//test:python_test"
export COVERAGE_THRESHOLD=94.2
export COVERAGE_THRESHOLD=93.0
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like a rather large decrease, looking at the coverage report we could set this set this at 93.9 with the current state.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

having said that, looking at https://26453-180498819-gh.circle-artifacts.com/0/coverage/source/client/output_formatter_impl.cc.gcov.html, I think it should be possible to hit some of the failure cases for the Fortio output formatter, by creating a test that provides an input that doesn't include a "global" result. As far as I am concerned this can be moved to a new issue though.

/**
* @return std::string serialized representation of output. The specific format depends
* on the derived class, for example human-readable or json.
* @return absl::StatusOr<std::string> serialized representation of output, if not error.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider s/, if not/or an/ to make line flow a little better.

if (formatted_proto.ok()) {
std::cout << formatter->formatProto(output_collector.toProto());
} else {
ENVOY_LOG(error, "An error occured while formatting proto");
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider setting result to false and fall trough, to have the if(!result) {... clause handle the logging/error handling. Minimizing the control logic helps keeping coverage up. If we want to specialise the message that gets logged we can declare/set that separately.


#include "common/version_info.h"

#include "absl/status/status.h"
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need this include?

auto status_proto = Envoy::MessageUtil::getJsonStringFromMessage(output, true, true);
if (status_proto.ok()) {
return status_proto.value();
} else {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider the ternary operator here, less code, less impact on coverage for the lines we can't cover. And personally I like that it makes it slightly more obvious that the method inescapable returns a value at a glance.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if the else part of this code doesn't ever execute, how is the idea of simply returning status_proto.value()?
We can get rid of the if-else branching.

Copy link
Copy Markdown
Member

@oschaaf oschaaf Mar 21, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, yes. In fact, I think we can just return status_proto? E.g. condense the whole method body to a one-liner, return Envoy::MessageUtil::getJsonStringFromMessage(output, true, true);

const auto& nh_global_result = getGlobalResult(output);
auto nh_global_result_status = getGlobalResult(output);
if (!nh_global_result_status.ok()) {
return absl::Status(nh_global_result_status.status().code(),
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could be covered in a test as well, I think

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even if we create a test with input that doesn't contain a global result, it gets handled at

if (!actual_duration_status.ok()) {

therefore, the lines @oschaaf mentioned remain uncovered.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes me wonder if it's even possible to hit this code.. if that's not the case it means this is dead code, and we ought to remove it. IMHO that would make sense and is related to why I figured that throwing an exception in this case didn't make sense. It looks like the Fortio UI would indicate an error on the output we'd produce, which is sufficient:
https://github.com/fortio/fortio/blob/ee098c572115f4149a67d2a812cb3d1ca79e30e1/ui/static/js/fortio_chart.js#L105

We should probably make sure that this really works though.

Razdeep added 4 commits March 20, 2021 17:47
Signed-off-by: Rajdeep Roy Chowdhury <rrajdeeproychowdhury@gmail.com>
Signed-off-by: Rajdeep Roy Chowdhury <rrajdeeproychowdhury@gmail.com>
Signed-off-by: Rajdeep Roy Chowdhury <rrajdeeproychowdhury@gmail.com>
Signed-off-by: Rajdeep Roy Chowdhury <rrajdeeproychowdhury@gmail.com>
oschaaf
oschaaf previously approved these changes Mar 21, 2021
Copy link
Copy Markdown
Member

@oschaaf oschaaf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, very nice. One last nit: the git commit signoff is missing in the last commit.. as for the changes in here they look good to me; I am deferring to @mum4k & @dubious90 for final approval. Thanks a lot for persisting!

This reverts commit a8d5fde.

Signed-off-by: Rajdeep Roy Chowdhury <rrajdeeproychowdhury@gmail.com>
@Razdeep Razdeep force-pushed the protobuf_safe_parsing branch from 4634cac to 6ab1a36 Compare March 22, 2021 03:13
@oschaaf oschaaf added waiting-for-review A PR waiting for a review. and removed waiting-for-changes A PR waiting for comments to be resolved and changes to be applied. labels Mar 22, 2021
absl::StatusOr<std::string> formatted_proto = formatter->formatProto(output_collector.toProto());
if (!formatted_proto.ok()) {
ENVOY_LOG(error, "An error occured while formatting proto");
return false;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

returning here prevents us from calling process->shutdown below, and I'm worried that's a problem. Could we instead set result to false? This would have the same effect except it would call shutdown and log that an error occurred.

return Envoy::MessageUtil::getYamlStringFromMessage(output, true, true);
absl::StatusOr<std::string>
YamlOutputFormatterImpl::formatProto(const nighthawk::client::Output& output) const {
std::string yaml_string = Envoy::MessageUtil::getYamlStringFromMessage(output, true, true);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we avoid this added local copy and just return getYamlStringFromMessage as we were doing before?

}

const nighthawk::client::Result&
absl::StatusOr<const nighthawk::client::Result>
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was it intentional for this to stop being a reference that is returned?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was intentional. Putting a reference inside absl::StatusOr<> was causing a compilation error.
@oschaaf told that it isn't that performance-critical, so can be switch to non-reference.

auto actual_duration = getAverageExecutionDuration(output);
auto actual_duration_status = getAverageExecutionDuration(output);
if (!actual_duration_status.ok()) {
return absl::Status(absl::StatusCode::kInternal,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that with this current code you're actually dropping the status result from getAverageExecutionDuration on the floor and returning a more generic message. Could you instead just return it if it's an error? (Related to my comment above about improving the error message there, so that it gives all needed context.).

const nighthawk::client::Output& output) const {
if (!output.results_size()) {
throw NighthawkException("No results in output");
return absl::Status(absl::StatusCode::kNotFound, "No results in output");
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This message doesn't really give the context about why it's looking. Could we make it something like "No results in output while getting average execution duration" or "getAverageExecutionDuration found no results in output"?

An important drawback of statuses as compared to errors that is important to keep in mind is that they don't have stacktraces.

// Get the result that represents all workers (global)
const auto& nh_global_result = getGlobalResult(output);
auto nh_global_result_status = getGlobalResult(output);
const auto& nh_global_result = nh_global_result_status.value();
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know that you and Otto had a conversation here about whether or not to include a line here returning the status if it's not ok() and deciding not to. I think I disagree with the conclusion. Could we please return the status if it's not ok here?

Reasoning:

  1. I don't think it's a good idea to code with perfect understanding of a function's error state, because functions aren't static - they change over time. While it may not be possible for getGlobalResult to actually return an error yet, it might be later, and now this code wouldn't operate as expected.

  2. Removing lines just because covering them isn't possible is a distorted incentive. Coverage should never discourage us from doing the right thing.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment drew me into looking into this once more; what I think might be appropriate is for getGlobalResult() is to return an absl:::optional<const nighthawk::client::Result> instead of a status: I can't anticipate failures modes where an exception wouldn't be appropriate, and not having a global result in the output is business as usual in terms of transforming to Fortio output.

OutputFormatterFactoryImpl factory;
OutputFormatterPtr formatter = factory.create(translated_format);
std::cout << formatter->formatProto(output);
auto format_status = formatter->formatProto(output);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wasn't calling it out in other pre-existing locations, but we should try to avoid using auto unless the type is fully obvious from context (which it isn't here). Can you please declare the full type?

@dubious90 dubious90 self-assigned this Mar 22, 2021
@dubious90 dubious90 added waiting-for-changes A PR waiting for comments to be resolved and changes to be applied. and removed waiting-for-review A PR waiting for a review. labels Mar 22, 2021
Razdeep added 5 commits March 23, 2021 14:23
Other changes include:
 - More descriptive error messages
 - auto types changed to explicit types
 - got rid of unnecessary local variable in YamlOutputFormatterImpl::formatProto

Signed-off-by: Rajdeep Roy Chowdhury <rrajdeeproychowdhury@gmail.com>
Signed-off-by: Rajdeep Roy Chowdhury <rrajdeeproychowdhury@gmail.com>
Signed-off-by: Rajdeep Roy Chowdhury <rrajdeeproychowdhury@gmail.com>
Signed-off-by: Rajdeep Roy Chowdhury <rrajdeeproychowdhury@gmail.com>
@@ -89,7 +89,7 @@ bool Main::run() {
ENVOY_LOG(error, "An error occured while formatting proto");
result = false;
}
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I think we need:

if (!formatted_proto.ok()) {
  // .... <snip> looks ok to me as-is.
} else {
  // only when formatted_proto.ok() == true
  std::cout << *formatted_proto;
}

I'm sorry if my earlier comment here was off or caused confusion.

Signed-off-by: Rajdeep Roy Chowdhury <rrajdeeproychowdhury@gmail.com>
@Razdeep
Copy link
Copy Markdown
Contributor Author

Razdeep commented Mar 27, 2021

@dubious90 Please do another round of review

@dubious90 dubious90 added waiting-for-review A PR waiting for a review. and removed waiting-for-changes A PR waiting for comments to be resolved and changes to be applied. labels Mar 27, 2021
@dubious90 dubious90 merged commit 27ac26d into envoyproxy:main Mar 27, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

waiting-for-review A PR waiting for a review.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Solidify output formatter implementations

3 participants