Skip to content

Coverage testing and docker flow fixes#358

Merged
mum4k merged 24 commits intoenvoyproxy:masterfrom
oschaaf:coverage
Jun 19, 2020
Merged

Coverage testing and docker flow fixes#358
mum4k merged 24 commits intoenvoyproxy:masterfrom
oschaaf:coverage

Conversation

@oschaaf
Copy link
Copy Markdown
Member

@oschaaf oschaaf commented Jun 12, 2020

Following some digging into stability of our coverage testing, creating this PR for testing CI
with some fixes / enhancements, as well as covering a few more lines in our
CLI tooling exception paths. There seems to be something racy going on where sometimes the coverage originating from our python integration tests is not counted.

Eventually I learned that things might stabilise when we switch the toolchain to clang 10. We follow Envoy's lead on that, so for now stopping further efforts on this.

Description of the changes in this PR:

  • Increase timeouts of ASAN/TSAN runs
  • Make envoy_build_sha.sh, which apparently broke in one of the Envoy updates
  • Cover some of the outer exception handling paths for Nighthawk's CLIs
  • Integration test for nighthawk_output_transform
  • Add coverage for code lines where it seemed trivial to do so (e.g. NullStatistic)
  • Fix run_nighthawk_bazel_coverage.sh to work on my machine
  • Stricter bash options in some scripts + changes to make that work

Prerequisites


Leaving some notes/learnings here:

  • there seems to be something racy going on where sometimes the coverage originating from
    our python integration tests is not counted. This is relatively infrequent in CI.
    • On my dev machine this structurally happens. This might be a toolchain/version issue.
    • On my dev machine, using the docker flow, I have difficulties: the docker container exits when running the full test suite, without leaving a trace why. bash traps in the container don't fire, no useful logging from docker. Eventually I was able to get past bazel fetching its deps, and then build just the python integration tests. As long as a single test is involved, things seem to work / the container gets to finish. It's remarkable that CI doesn't have this problem. Coverage structurally looks as expected in this scenario.
  • The link step is super memory hungry. Separating that step might be a ci perf opportunity. Coverage is the slowest CI job we have.

oschaaf added 21 commits June 11, 2020 21:04
Signed-off-by: Otto van der Schaaf <oschaaf@we-amp.com>
Signed-off-by: Otto van der Schaaf <oschaaf@we-amp.com>
Signed-off-by: Otto van der Schaaf <oschaaf@we-amp.com>
Signed-off-by: Otto van der Schaaf <oschaaf@we-amp.com>
Signed-off-by: Otto van der Schaaf <oschaaf@we-amp.com>
Signed-off-by: Otto van der Schaaf <oschaaf@we-amp.com>
Signed-off-by: Otto van der Schaaf <oschaaf@we-amp.com>
Signed-off-by: Otto van der Schaaf <oschaaf@we-amp.com>
Signed-off-by: Otto van der Schaaf <oschaaf@we-amp.com>
Signed-off-by: Otto van der Schaaf <oschaaf@we-amp.com>
Signed-off-by: Otto van der Schaaf <oschaaf@we-amp.com>
Signed-off-by: Otto van der Schaaf <oschaaf@we-amp.com>
Signed-off-by: Otto van der Schaaf <oschaaf@we-amp.com>
Signed-off-by: Otto van der Schaaf <oschaaf@we-amp.com>
Signed-off-by: Otto van der Schaaf <oschaaf@we-amp.com>
Signed-off-by: Otto van der Schaaf <oschaaf@we-amp.com>
Experiment: see if adding back the tests brings back counting
coverage for some of the lines in the .h files that went missing.

Signed-off-by: Otto van der Schaaf <oschaaf@we-amp.com>
Signed-off-by: Otto van der Schaaf <oschaaf@we-amp.com>
Signed-off-by: Otto van der Schaaf <oschaaf@we-amp.com>
Signed-off-by: Otto van der Schaaf <oschaaf@we-amp.com>
@oschaaf
Copy link
Copy Markdown
Member Author

oschaaf commented Jun 15, 2020

Now that #357 is in, readying this one up for review. This should further stabilize CI.

@oschaaf oschaaf marked this pull request as ready for review June 15, 2020 18:02
@oschaaf oschaaf added P1 waiting-for-review A PR waiting for a review. labels Jun 15, 2020
@oschaaf oschaaf mentioned this pull request Jun 15, 2020
@mum4k mum4k self-requested a review June 17, 2020 20:40
@mum4k mum4k self-assigned this Jun 17, 2020
Copy link
Copy Markdown
Collaborator

@mum4k mum4k left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you very much for improving the stability, this has been hitting us on multiple PRs. As an overall comment - can we try to enumerate the fixes we are making here? It makes it a bit hard to review the PR and relate the various changes to intentions.

assertEqual(counters["requestsource.internal.upstream_rq_200"], 4)


def run_service_with_args(args):
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add documentation for this helper? E.g.

def run_service_with_args(args):
  """Executes the Nighthawk service with the provided arguments.

  Args:
    args: A string, the command line arguments for the service, e.g. "--foo --bar".

  Returns:
    A tuple in the form (exit_code, output), where exit_code is the code the Nighthawk
    service terminated with and the output is its standard output.
  """

Comment about adding documentation applies to all new non-test public functions. We could consider making some of them private instead.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I made added this comment, slightly modified, to run_service_with_args. And then made this function and others that reuse it private.


def test_grpc_service_help():
(exit_code, output) = run_service_with_args("--help")
assert (exit_code == 0)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since these are tests, we should likely use self.assertTrue instead of assert. I think assert just terminates the execution while self.assertFoo correctly records the test failure using the test framework.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, we're using pytest, I think this ought te be ok. quoting from the docs:

Due to pytest’s detailed assertion introspection, only plain assert statements are used. 

Posting the output of a sample failure (introduced on purpose):

    def test_output_transform_help():
      (exit_code, output) = _run_output_transform_with_args("--help")
>     assert (exit_code == 1)
E     assert 0 == 1
E       -0
E       +1

exit_code  = 0
output     = ('WARNING: Perftools heap leak checker is active -- Performance may suffer\n'
 '\n'
 'USAGE: \n'
.... <snip> ...

[1] https://docs.pytest.org/en/stable/

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having said that, we do have some helpers in utilities.py, like assertEqual. Switched to use that, for consistency with the rest of the test code.

#!/usr/bin/env python3
import pytest

from test.integration.utility import *
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we import specific modules or packages explicitly? It helps when debugging.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Amended this in some places. But we may want to track this as tech debt in an issue, because this occurs in other places as well. I filed #371 for that (as tech-debt)

@mum4k mum4k removed the waiting-for-review A PR waiting for a review. label Jun 17, 2020
@mum4k mum4k added the waiting-for-changes A PR waiting for comments to be resolved and changes to be applied. label Jun 17, 2020
@oschaaf
Copy link
Copy Markdown
Member Author

oschaaf commented Jun 18, 2020

Thank you very much for improving the stability, this has been hitting us on multiple PRs. As an overall comment - can we try to enumerate the fixes we are making here? It makes it a bit hard to review the PR and relate the various changes to intentions.

Ah yes, sure: I updated the PR description to list what I did here. Let me know if that looks better.

I will address the other comments later today.

Signed-off-by: Otto van der Schaaf <oschaaf@we-amp.com>
@oschaaf
Copy link
Copy Markdown
Member Author

oschaaf commented Jun 18, 2020

@mum4k thanks, I left some remarks, and also pushed f0596d4 which hopefully addresses your comments.

@oschaaf oschaaf added waiting-for-review A PR waiting for a review. and removed waiting-for-changes A PR waiting for comments to be resolved and changes to be applied. labels Jun 18, 2020

from test.integration.integration_test_fixtures import http_test_server_fixture
from test.integration.utility import *
from test.integration.utility import (isSanitizerRun, assertEqual, assertIn, assertGreaterEqual,
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like this now imports functions from the utility module. What we really should be importing is the utility module itself.

from test.integration import utility

And then calls will look like.
utility.assertEqual

Ideally utility would have a better more descriptive name, but we can address that separately. The Python style we are following internally suggests to only import packages or modules, that way it is easy to say where some function is defined. If the reader just sees "assertEqual", they have to do some digging.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes sense, in the benchmark PR I noticed pep8 had trouble with the * imports too.
Changed what was relevant to this PR in 66f6585
Filed #371 earlier to track doing a pass over
the all the other code, to change the rest.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for filing the issue.

@mum4k mum4k added waiting-for-changes A PR waiting for comments to be resolved and changes to be applied. and removed waiting-for-review A PR waiting for a review. labels Jun 18, 2020
oschaaf added 2 commits June 18, 2020 23:57
Signed-off-by: Otto van der Schaaf <oschaaf@we-amp.com>
Signed-off-by: Otto van der Schaaf <oschaaf@we-amp.com>
@oschaaf oschaaf added waiting-for-review A PR waiting for a review. and removed waiting-for-changes A PR waiting for comments to be resolved and changes to be applied. labels Jun 18, 2020
@mum4k mum4k merged commit 41d340a into envoyproxy:master Jun 19, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

P1 waiting-for-review A PR waiting for a review.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants