Skip to content

tests: refactor breakdown metrics tests to not override Agent internals#2316

Merged
trentm merged 6 commits intomasterfrom
trentm/test-refactor-breakdown-test
Sep 7, 2021
Merged

tests: refactor breakdown metrics tests to not override Agent internals#2316
trentm merged 6 commits intomasterfrom
trentm/test-refactor-breakdown-test

Conversation

@trentm
Copy link
Copy Markdown
Member

@trentm trentm commented Sep 3, 2021

This refactors breakdown.test.js to not need a resetAgent(...) that knows so much about Agent internals. Now that we have a usable agent.destroy(), use that and create a new Agent for each test case.

This also moves away from the test/_mock_http_client.js testing style of waiting for an expected number of sent events to decide when to end a test case. This style is brittle when there is a test failure: either hanging waiting for an event that isn't coming, or crashing with throw new Error('too many writes').

@trentm trentm self-assigned this Sep 3, 2021
@github-actions github-actions bot added the agent-nodejs Make available for APM Agents project planning. label Sep 3, 2021
@ghost
Copy link
Copy Markdown

ghost commented Sep 3, 2021

💚 Build Succeeded

the below badges are clickable and redirect to their specific view in the CI or DOCS
Pipeline View Test View Changes Artifacts preview preview

Expand to view the summary

Build stats

  • Start Time: 2021-09-04T00:51:59.448+0000

  • Duration: 25 min 45 sec

  • Commit: d37a2ed

Test stats 🧪

Test Results
Failed 0
Passed 20
Skipped 0
Total 20

Trends 🧪

Image of Build Times

Image of Tests

… diff enough in node v10 that it loses that race reliably
Saw this in CI:
    # does not include transaction breakdown when disabled
    /home/runner/work/apm-agent-nodejs/apm-agent-nodejs/lib/metrics/reporter.js:56
	    this._agent._transport.sendMetricSet(metric)
				  ^
    TypeError: Cannot read property 'sendMetricSet' of null
	at afterAll (/home/runner/work/apm-agent-nodejs/apm-agent-nodejs/lib/metrics/reporter.js:56:31)
	at /home/runner/work/apm-agent-nodejs/apm-agent-nodejs/node_modules/after-all-results/index.js:20:25
	at _combinedTickCallback (internal/process/next_tick.js:131:7)

for (const metric of seen.values()) {
this._agent._transport.sendMetricSet(metric)
if (this._agent._transport) {
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewer note: This was added to not crash in the metrics reporting interval if, in the interim, agent.destroy() has been called -- which sets this._agent._transport to null. Similar is already done for sendSpan here:

if (agent._transport) agent._transport.sendSpan(payload)

Copy link
Copy Markdown
Contributor

@astorm astorm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On objection to this -- it does what it says on the tin and it's hard to argue with better isolated tests. Approving.

One thing I did notice -- in this new branch the breakdown.test.js tests take significantly longer to run. On my local dev machine the test/metrics/breakdown.test.js tests took 14.345 total seconds where the old suites took 0.369 seconds. This was using npx to invoke the tests via tape

With the new branch

$ time npx tape test/metrics/breakdown.test.js
// ...
ok 101 does not have un-ended spans

1..101
# tests 101
# pass  101

# ok
$ npx tape test/metrics/breakdown.test.js  0.42s user 0.08s system 3% cpu 14.345 total

Old/current master

$ time npx tape test/metrics/breakdown.test.js 
// ...
ok 101 does not have un-ended spans

1..101
# tests 101
# pass  101

# ok

$ npx tape test/metrics/breakdown.test.js  0.33s user 0.06s system 104% cpu 0.369 total

Not a deal breaker (it's stills seconds, not minutes), just an observation.

@trentm
Copy link
Copy Markdown
Member Author

trentm commented Sep 7, 2021

One thing I did notice -- in this new branch the breakdown.test.js tests take significantly longer to run.

Yes, that is unfortunate. The earlier test model of "do test assertions and end as soon as I get the expected number of items on the transport" is faster, because the items we are waiting for (the encoded span and the breakdown metrics) are sent (via _transport.sendSpan() and _transport.sendMetricSet()) very soon later -- but asynchronously.

The test model I switched to here is:

  1. Do the thing (create some transactions and spans)
  2. Use whatever API is available on the public API to wait until we are sure the agent will have sent all items (transactions, spans, metric sets).
  3. Only then, do test assertions.

This model is, IMHO, less error prone, as mentioned in the PR description. However, that step 2 is in this case way slower because the only "API available on the public API" to know that we've gotten the breakdown metrics for all ended transactions and spans is "wait for the next metricsInterval". I think eventually we'll want our agent.flush() or similar API to improve on this (#2294).

So yes, this adds ~14s to our test run of ~15 minutes (in CI: https://apm-ci.elastic.co/job/apm-agent-nodejs/job/apm-agent-nodejs-mbp/job/master/), which I think should be okay. Eventually if/when we can move to tap and parallel running of test files, then this will not be a slowdown of the overall test run time at all.

@trentm trentm merged commit 5c3d1b1 into master Sep 7, 2021
@trentm trentm deleted the trentm/test-refactor-breakdown-test branch September 7, 2021 23:05
dgieselaar pushed a commit to dgieselaar/apm-agent-nodejs that referenced this pull request Sep 10, 2021
…ls (elastic#2316)

This refactors breakdown.test.js to not need a resetAgent(...) that knows
so much about Agent internals. Now that we have a usable agent.destroy(),
use that and create a new Agent for each test case.

Unfortunately the change results in this test file running much slower.
See PR discussion for justification.

Also, protect against against a possible crash:
    TypeError: Cannot read property 'sendMetricSet' of null
in metrics/reporter.js if `agent.destroy()` has been called.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

agent-nodejs Make available for APM Agents project planning.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants