Skip to content

Add transaction and span outcome gherkin feature file#402

Merged
estolfo merged 10 commits intoelastic:masterfrom
estolfo:transaction-span-outcome-gherkin
Feb 15, 2021
Merged

Add transaction and span outcome gherkin feature file#402
estolfo merged 10 commits intoelastic:masterfrom
estolfo:transaction-span-outcome-gherkin

Conversation

@estolfo
Copy link
Copy Markdown
Contributor

@estolfo estolfo commented Jan 14, 2021

I thought some ambiguities in the spec prose could be cleared up by some Gherkin specs.
I've implemented them in the ruby agent in this pull request.


Few notable changes:

  • gRPC status UNKNOWN should be treated as a failure because it's an error or unknown type, but still an error
  • some gRPC status codes should have a different meaning on server or client side, in a similar way as the HTTP codes do
  • the default value for outcome should be set to failure if an error has been reported, success otherwise, this of course should apply with lower priority than the protocol-specific mappings. This allows to cover most other types of spans/transactions without any extra effort: db spans, messaging spans & transactions, ...

@estolfo estolfo requested a review from SylvainJuge January 14, 2021 15:03
@ghost
Copy link
Copy Markdown

ghost commented Jan 14, 2021

💚 Build Succeeded

the below badges are clickable and redirect to their specific view in the CI or DOCS
Pipeline View Test View Changes Artifacts preview

Expand to view the summary

Build stats

  • Build Cause: Pull request #402 updated

  • Start Time: 2021-02-15T07:56:01.697+0000

  • Duration: 5 min 50 sec

  • Commit: 2471084

Trends 🧪

Image of Build Times

Steps errors 3

Expand to view the steps failures

Prepare gherkin changes for apm-agent-java
  • Took 0 min 3 sec . View more details on here
  • Description: .ci/scripts/prepare-spec-changes.sh "apm-agent-java" "gherkin" "apm-agent-core/src/test/resources/specs"
Prepare gherkin changes for apm-agent-python
  • Took 0 min 2 sec . View more details on here
  • Description: .ci/scripts/prepare-spec-changes.sh "apm-agent-python" "gherkin" "tests/bdd/features"
Prepare gherkin changes for apm-agent-ruby
  • Took 0 min 1 sec . View more details on here
  • Description: .ci/scripts/prepare-spec-changes.sh "apm-agent-ruby" "gherkin" "features"

@SylvainJuge
Copy link
Copy Markdown
Member

Thanks @estolfo for adding this shared spec.

I feel that we might get better adoption for shared specs by keeping them as short and as concise as possible.
Do you mind if we simplify a bit the structure, for example by using scenario outlines and examples ?

@estolfo
Copy link
Copy Markdown
Contributor Author

estolfo commented Jan 18, 2021

@SylvainJuge Sure, simpler is always better 😃
Can you give me an example of what you mean by an outline and example?

@SylvainJuge
Copy link
Copy Markdown
Member

SylvainJuge commented Feb 1, 2021

Hi @estolfo , I finally managed to make progress on this.

What I meant by using Scenario outline is doing something like this to test the HTTP codes mapping:

  @http
  Scenario Outline: HTTP transaction and span outcome
    Given an HTTP transaction with <status> response code
    Then transaction outcome is "<server>"
    Given an HTTP span with <status> response code
    Then span outcome is "<client>"
    Examples:
      | status | client  | server  |
      | 100    | success | success |
      | 200    | success | success |
      | 300    | success | success |
      | 400    | failure | success |
      | 404    | failure | success |
      | 500    | failure | failure |
      | -1     | failure | failure |
      # last row with negative status represents the case where the status is not available
      # for example when an exception/error is thrown without status (IO error, redirect loop, ...)

Here is the full file I'm using to test the Java agent implementation.
There are a few gaps in the spec that were not covered that I thought would be relevant to add:

  • db spans (was already covered by the dotnet agent)
  • gRPC status codes mapping with something close to what we have with HTTP client/server (will likely require some sync with other agents).

Also, the usage of tags like @http and @grpc allow to filter which parts to include or exclude, which makes it possible to only test for gRPC related features when testing gRPC.

@estolfo estolfo requested review from a team as code owners February 3, 2021 15:18
@SylvainJuge
Copy link
Copy Markdown
Member

One notable thing that might require a bit of discussion is that we think that having outcome = failure whenever an error is reported (and success otherwise) for a given span/transaction seems a fair default when there is no protocol-specific mapping.

For example, that would cover db spans and messaging spans & transactions without much effort.

Copy link
Copy Markdown
Contributor

@eyalkoren eyalkoren left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the general guideline for transactions:

For other protocols, we can default to the following behavior:
- `failure` when an error is reported
- `success` otherwise

I wonder why this cannot be applied generically to spans as well? I understand the higher importance of setting an outcome to exit spans (which I think should be listed as explicit list of span types, so that all agents are aligned), but from the spec I am not sure what happens with all other spans - maybe they are irrelevant because there is no use for their outcome? If this is the case, the spec should say that. Ideally, whenever I end a span, I should look in the spec and know what I need to do. For example, one of these two options:

  1. If we decide that span outcome is only relevant for exit spans, list the span types that are considered exit spans. Any exit span should never have an unknown outcome and any non-exit span should always have unknown outcome.
  2. If we decide that span outcome is relevant for non-exit span, can we decide that no built-in span may have an unknown outcome? Meaning - it either rely on some specific criteria (like HTTP code or DB response) or based on reported error for this span.

In addition, I added some questions about the gRPC outcomes based on the spec. These are not suggestions for change, only points I thought worth a second look.

Comment on lines +21 to +31
Scenario: span with error
Given an agent
And an active span
And span terminates with an error
Then span outcome is 'failure'

Scenario: span without error
Given an agent
And an active span
And span terminates without error
Then span outcome is 'success'
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me for any type of span, but the written spec allows unknown (or even null), with a note about the importance of exit spans. Does this depend on the span type?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that's a good question here:

  • the exit spans are the ones that call a third party in the map, for those we MUST have an outcome in 99% of cases
  • we can provide a list of span types that are exit spans (db, messaging, ...)

However, adding the type here would mean to either duplicate those statements for each type or use a scenario outline (with examples) to test for those, and to me applying this rule also works with other types of spans:

  • we might have non-exit spans that may trigger errors (like the ones captured through annotations in Java): for those reporting exceptions and mark the span as an error would also make sense, even if not an exit span.
  • we have some spans (like inferred spans) which will never have a known outcome, and for those it might make sense to always have unknown.

Here we only test for the cases for which the outcome is known from the presence/absence of error, but it does not mean that we can't have unknown outcomes.

@SylvainJuge
Copy link
Copy Markdown
Member

Also, there is one minor thing that we might change in future specs, the Given: an agent statement is currently used to provide a hook to initialize the agent in every test scenario.

In Java, unlike in Ruby, we don't test the whole agent as doing this would require proper packaging and would move those tests in the integration test category, which run later in the build pipeline and are thus a high-latency feedback loop. Thus, to make those specs more relevant we only test the relevant parts of the agent like a regular unit-test.

As a result, we have different instructions to setup the agent for each set of features we want to test (currently API token/secret auth & outcome), and we can't have a common implementation for Given: an agent, we use an before hook instead with a no-op implementation for Given: an agent spec statement.

@SylvainJuge SylvainJuge requested a review from eyalkoren February 8, 2021 15:01
@SylvainJuge
Copy link
Copy Markdown
Member

Updated with changes after initial review by @eyalkoren :

  • improved gRPC mapping for gRPC transactions
  • simpler definition of span outcome similar to transactions.

Copy link
Copy Markdown
Contributor

@eyalkoren eyalkoren left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the clarifications on the spans outcome spec ❤️
LGTM.

SylvainJuge and others added 2 commits February 9, 2021 09:04
Co-authored-by: eyalkoren <41850454+eyalkoren@users.noreply.github.com>
Co-authored-by: Emily S <emily.s@elastic.co>
@SylvainJuge
Copy link
Copy Markdown
Member

@axw Given your knowledge of gRPC, do you have any opinion on the mapping proposal here ? I remember the Go agent current implementation maps gRPC UNKNOWN status to unknown outcome (which does not fit this change proposal).

@axw
Copy link
Copy Markdown
Member

axw commented Feb 10, 2021

@axw Given your knowledge of gRPC, do you have any opinion on the mapping proposal here ? I remember the Go agent current implementation maps gRPC UNKNOWN status to unknown outcome (which does not fit this change proposal).

I think that was a mistake, and should be treated as a bug in the Go agent. The mapping here makes sense to me.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants