Better error handling using absl::StatusOr#646
Conversation
…Error Signed-off-by: Rajdeep Roy Chowdhury <rrajdeeproychowdhury@gmail.com>
oschaaf
left a comment
There was a problem hiding this comment.
Thanks for looking into this; a small issue is that CI flags a formatting issue, that can be resolved via running ci/do_ci.sh fix_format.
But I'm also requesting some changes because while this builds I think it effectively doesn't change much, as we don't check the status_or to see if everything is ok before consuming it. Let me try to point out some examples that show the pattern of how to best leverage absl::status_or:
Generally, I think we need to propagate the return value of getJsonStringFromMessageOrError where we consume it, except at near or at the highest level of the program, where it can be handled. So probably the return value of formatProto() needs to change.
That looks like something along the lines of this:
And then, as an example, here's a place where we consume formatProto() at the highest level:
There, instead of optimistically assuming all is well and streaming out the result, we should check for message_or_error.ok(), and handle the case where there was things ar not ok. Because if there's an error we should exit with a non-zero status code, log what the status was, and not attempt to stream the status_or to the stdout.
Transitioning from std::string to absl::StatusOr for better error handling Signed-off-by: Rajdeep Roy Chowdhury <rrajdeeproychowdhury@gmail.com>
oschaaf
left a comment
There was a problem hiding this comment.
Thanks for pushing these changes, this looks great!
Now to be able to close #625 we need one more thing: the Fortio output formatter still throws exceptions in two [1] spots that I know of; instead of doing that it should also apply the absl::status_or pattern we introduced here.
[1]
and| if (formatted_proto.ok()) { | ||
| std::cout << formatter->formatProto(output_collector.toProto()); | ||
| } else { | ||
| ENVOY_LOG(error, "An error occured while formatting proto"); |
There was a problem hiding this comment.
should we force result to false here? if this code line get hit, that probably means we should indicate failure.
alternatively, we could consider propagating absl::status_or even further up (so we return that instead of the boolean we use return right now) if you'd be up for it, but feel free to not do this, I don't intend to bloat this PR.
There was a problem hiding this comment.
What shall be the status code for [1]?
There was a problem hiding this comment.
Consider setting result to false and fall trough, to have the if(!result) {... clause handle the logging/error handling. Minimizing the control logic helps keeping coverage up. If we want to specialise the message that gets logged we can declare/set that separately.
Signed-off-by: Rajdeep Roy Chowdhury <rrajdeeproychowdhury@gmail.com>
Signed-off-by: Rajdeep Roy Chowdhury <rrajdeeproychowdhury@gmail.com>
Signed-off-by: Rajdeep Roy Chowdhury <rrajdeeproychowdhury@gmail.com>
dubious90
left a comment
There was a problem hiding this comment.
Thanks for the contribution! Have a couple comments on returning strings.
Signed-off-by: Rajdeep Roy Chowdhury <rrajdeeproychowdhury@gmail.com>
Signed-off-by: Rajdeep Roy Chowdhury <rrajdeeproychowdhury@gmail.com>
Signed-off-by: Rajdeep Roy Chowdhury <rrajdeeproychowdhury@gmail.com>
Signed-off-by: Rajdeep Roy Chowdhury <rrajdeeproychowdhury@gmail.com>
Signed-off-by: Rajdeep Roy Chowdhury <rrajdeeproychowdhury@gmail.com>
Signed-off-by: Rajdeep Roy Chowdhury <rrajdeeproychowdhury@gmail.com>
Signed-off-by: Rajdeep Roy Chowdhury <rrajdeeproychowdhury@gmail.com>
Signed-off-by: Rajdeep Roy Chowdhury <rrajdeeproychowdhury@gmail.com>
Signed-off-by: Rajdeep Roy Chowdhury <rrajdeeproychowdhury@gmail.com>
Signed-off-by: Rajdeep Roy Chowdhury <rrajdeeproychowdhury@gmail.com>
Signed-off-by: Rajdeep Roy Chowdhury <rrajdeeproychowdhury@gmail.com>
Signed-off-by: Rajdeep Roy Chowdhury <rrajdeeproychowdhury@gmail.com>
Signed-off-by: Rajdeep Roy Chowdhury <rrajdeeproychowdhury@gmail.com>
Signed-off-by: Rajdeep Roy Chowdhury <rrajdeeproychowdhury@gmail.com>
…esult Signed-off-by: Rajdeep Roy Chowdhury <rrajdeeproychowdhury@gmail.com>
Signed-off-by: Rajdeep Roy Chowdhury <rrajdeeproychowdhury@gmail.com>
Some parts of the code is impossible/very hard to cover Signed-off-by: Rajdeep Roy Chowdhury <rrajdeeproychowdhury@gmail.com>
|
I cannot find a way to intensionally fail |
|
/retest |
|
🔨 rebuilding |
oschaaf
left a comment
There was a problem hiding this comment.
This is great stuff, thanks so much for doing this. Some last comments.
ci/do_ci.sh
Outdated
| function do_unit_test_coverage() { | ||
| export TEST_TARGETS="//test/... -//test:python_test" | ||
| export COVERAGE_THRESHOLD=94.2 | ||
| export COVERAGE_THRESHOLD=93.0 |
There was a problem hiding this comment.
This seems like a rather large decrease, looking at the coverage report we could set this set this at 93.9 with the current state.
There was a problem hiding this comment.
having said that, looking at https://26453-180498819-gh.circle-artifacts.com/0/coverage/source/client/output_formatter_impl.cc.gcov.html, I think it should be possible to hit some of the failure cases for the Fortio output formatter, by creating a test that provides an input that doesn't include a "global" result. As far as I am concerned this can be moved to a new issue though.
| /** | ||
| * @return std::string serialized representation of output. The specific format depends | ||
| * on the derived class, for example human-readable or json. | ||
| * @return absl::StatusOr<std::string> serialized representation of output, if not error. |
There was a problem hiding this comment.
Consider s/, if not/or an/ to make line flow a little better.
| if (formatted_proto.ok()) { | ||
| std::cout << formatter->formatProto(output_collector.toProto()); | ||
| } else { | ||
| ENVOY_LOG(error, "An error occured while formatting proto"); |
There was a problem hiding this comment.
Consider setting result to false and fall trough, to have the if(!result) {... clause handle the logging/error handling. Minimizing the control logic helps keeping coverage up. If we want to specialise the message that gets logged we can declare/set that separately.
|
|
||
| #include "common/version_info.h" | ||
|
|
||
| #include "absl/status/status.h" |
| auto status_proto = Envoy::MessageUtil::getJsonStringFromMessage(output, true, true); | ||
| if (status_proto.ok()) { | ||
| return status_proto.value(); | ||
| } else { |
There was a problem hiding this comment.
Consider the ternary operator here, less code, less impact on coverage for the lines we can't cover. And personally I like that it makes it slightly more obvious that the method inescapable returns a value at a glance.
There was a problem hiding this comment.
if the else part of this code doesn't ever execute, how is the idea of simply returning status_proto.value()?
We can get rid of the if-else branching.
There was a problem hiding this comment.
Hmm, yes. In fact, I think we can just return status_proto? E.g. condense the whole method body to a one-liner, return Envoy::MessageUtil::getJsonStringFromMessage(output, true, true);
| const auto& nh_global_result = getGlobalResult(output); | ||
| auto nh_global_result_status = getGlobalResult(output); | ||
| if (!nh_global_result_status.ok()) { | ||
| return absl::Status(nh_global_result_status.status().code(), |
There was a problem hiding this comment.
This could be covered in a test as well, I think
There was a problem hiding this comment.
Even if we create a test with input that doesn't contain a global result, it gets handled at
therefore, the lines @oschaaf mentioned remain uncovered.
There was a problem hiding this comment.
That makes me wonder if it's even possible to hit this code.. if that's not the case it means this is dead code, and we ought to remove it. IMHO that would make sense and is related to why I figured that throwing an exception in this case didn't make sense. It looks like the Fortio UI would indicate an error on the output we'd produce, which is sufficient:
https://github.com/fortio/fortio/blob/ee098c572115f4149a67d2a812cb3d1ca79e30e1/ui/static/js/fortio_chart.js#L105
We should probably make sure that this really works though.
Signed-off-by: Rajdeep Roy Chowdhury <rrajdeeproychowdhury@gmail.com>
Signed-off-by: Rajdeep Roy Chowdhury <rrajdeeproychowdhury@gmail.com>
Signed-off-by: Rajdeep Roy Chowdhury <rrajdeeproychowdhury@gmail.com>
Signed-off-by: Rajdeep Roy Chowdhury <rrajdeeproychowdhury@gmail.com>
oschaaf
left a comment
There was a problem hiding this comment.
LGTM, very nice. One last nit: the git commit signoff is missing in the last commit.. as for the changes in here they look good to me; I am deferring to @mum4k & @dubious90 for final approval. Thanks a lot for persisting!
This reverts commit a8d5fde. Signed-off-by: Rajdeep Roy Chowdhury <rrajdeeproychowdhury@gmail.com>
4634cac to
6ab1a36
Compare
source/client/client.cc
Outdated
| absl::StatusOr<std::string> formatted_proto = formatter->formatProto(output_collector.toProto()); | ||
| if (!formatted_proto.ok()) { | ||
| ENVOY_LOG(error, "An error occured while formatting proto"); | ||
| return false; |
There was a problem hiding this comment.
returning here prevents us from calling process->shutdown below, and I'm worried that's a problem. Could we instead set result to false? This would have the same effect except it would call shutdown and log that an error occurred.
| return Envoy::MessageUtil::getYamlStringFromMessage(output, true, true); | ||
| absl::StatusOr<std::string> | ||
| YamlOutputFormatterImpl::formatProto(const nighthawk::client::Output& output) const { | ||
| std::string yaml_string = Envoy::MessageUtil::getYamlStringFromMessage(output, true, true); |
There was a problem hiding this comment.
Can we avoid this added local copy and just return getYamlStringFromMessage as we were doing before?
| } | ||
|
|
||
| const nighthawk::client::Result& | ||
| absl::StatusOr<const nighthawk::client::Result> |
There was a problem hiding this comment.
Was it intentional for this to stop being a reference that is returned?
There was a problem hiding this comment.
This was intentional. Putting a reference inside absl::StatusOr<> was causing a compilation error.
@oschaaf told that it isn't that performance-critical, so can be switch to non-reference.
| auto actual_duration = getAverageExecutionDuration(output); | ||
| auto actual_duration_status = getAverageExecutionDuration(output); | ||
| if (!actual_duration_status.ok()) { | ||
| return absl::Status(absl::StatusCode::kInternal, |
There was a problem hiding this comment.
Note that with this current code you're actually dropping the status result from getAverageExecutionDuration on the floor and returning a more generic message. Could you instead just return it if it's an error? (Related to my comment above about improving the error message there, so that it gives all needed context.).
| const nighthawk::client::Output& output) const { | ||
| if (!output.results_size()) { | ||
| throw NighthawkException("No results in output"); | ||
| return absl::Status(absl::StatusCode::kNotFound, "No results in output"); |
There was a problem hiding this comment.
This message doesn't really give the context about why it's looking. Could we make it something like "No results in output while getting average execution duration" or "getAverageExecutionDuration found no results in output"?
An important drawback of statuses as compared to errors that is important to keep in mind is that they don't have stacktraces.
| // Get the result that represents all workers (global) | ||
| const auto& nh_global_result = getGlobalResult(output); | ||
| auto nh_global_result_status = getGlobalResult(output); | ||
| const auto& nh_global_result = nh_global_result_status.value(); |
There was a problem hiding this comment.
I know that you and Otto had a conversation here about whether or not to include a line here returning the status if it's not ok() and deciding not to. I think I disagree with the conclusion. Could we please return the status if it's not ok here?
Reasoning:
-
I don't think it's a good idea to code with perfect understanding of a function's error state, because functions aren't static - they change over time. While it may not be possible for getGlobalResult to actually return an error yet, it might be later, and now this code wouldn't operate as expected.
-
Removing lines just because covering them isn't possible is a distorted incentive. Coverage should never discourage us from doing the right thing.
There was a problem hiding this comment.
This comment drew me into looking into this once more; what I think might be appropriate is for getGlobalResult() is to return an absl:::optional<const nighthawk::client::Result> instead of a status: I can't anticipate failures modes where an exception wouldn't be appropriate, and not having a global result in the output is business as usual in terms of transforming to Fortio output.
| OutputFormatterFactoryImpl factory; | ||
| OutputFormatterPtr formatter = factory.create(translated_format); | ||
| std::cout << formatter->formatProto(output); | ||
| auto format_status = formatter->formatProto(output); |
There was a problem hiding this comment.
I wasn't calling it out in other pre-existing locations, but we should try to avoid using auto unless the type is fully obvious from context (which it isn't here). Can you please declare the full type?
Other changes include: - More descriptive error messages - auto types changed to explicit types - got rid of unnecessary local variable in YamlOutputFormatterImpl::formatProto Signed-off-by: Rajdeep Roy Chowdhury <rrajdeeproychowdhury@gmail.com>
Signed-off-by: Rajdeep Roy Chowdhury <rrajdeeproychowdhury@gmail.com>
Signed-off-by: Rajdeep Roy Chowdhury <rrajdeeproychowdhury@gmail.com>
Signed-off-by: Rajdeep Roy Chowdhury <rrajdeeproychowdhury@gmail.com>
| @@ -89,7 +89,7 @@ bool Main::run() { | |||
| ENVOY_LOG(error, "An error occured while formatting proto"); | |||
| result = false; | |||
| } | |||
There was a problem hiding this comment.
Oh I think we need:
if (!formatted_proto.ok()) {
// .... <snip> looks ok to me as-is.
} else {
// only when formatted_proto.ok() == true
std::cout << *formatted_proto;
}I'm sorry if my earlier comment here was off or caused confusion.
Signed-off-by: Rajdeep Roy Chowdhury <rrajdeeproychowdhury@gmail.com>
|
@dubious90 Please do another round of review |
Fixes #625
Replacing getJsonStringFromMessageOrDie() by getJsonStringFromMessageOrError().
Signed-off-by: Rajdeep Roy Chowdhury rrajdeeproychowdhury@gmail.com