Skip to content

intake: Handle transaction.dropped_spans_stats#6200

Merged
marclop merged 7 commits intomasterfrom
f/add-transaction.dropped_spans_stats
Sep 23, 2021
Merged

intake: Handle transaction.dropped_spans_stats#6200
marclop merged 7 commits intomasterfrom
f/add-transaction.dropped_spans_stats

Conversation

@marclop
Copy link
Copy Markdown
Contributor

@marclop marclop commented Sep 21, 2021

Motivation/summary

Adds a new transaction.dropped_spans_stats optional field to the API,
which accepts an array of spans which were due to transaction_max_spans
or exit_span_min_duration being exceeded. Additionally, the metrics
aggregator for the destination_service metricset has been updated to
process transaction where transaction.dropped_spans_stats > 0.

{
  "transaction": {
    "dropped_spans_stats": [
      {
        "type": "external",
        "subtype": "http",
        "destination_service_resource": "example.com:443",
        "outcome": "failure",
        "duration": {
            "count": 28,
            "sum.us": 123456
        }
      },
      {
        "type": "db",
        "subtype": "mysql",
        "destination_service_resource": "mysql",
        "outcome": "success",
        "duration": {
            "count": 81,
            "sum.us": 9876543
        }
      }
    ]
  }
}

Checklist

For functional changes, consider:

  • Is it observable through the addition of either logging or metrics?
  • Is its use being published in telemetry to enable product improvement?
  • Have system tests been added to avoid regression?

How to test these changes

TODO

Related issues

Closes #5850

Adds a new `transaction.dropped_spans_stats` optional field to the API,
which accepts an array of spans which were due to transaction_max_spans
or exit_span_min_duration being exceeded. Additionally, the metrics
aggregator for the `destination_service` metricset has been updated to
process transaction where `transaction.dropped_spans_stats > 0`.

```
{
  "transaction": {
    "dropped_spans_stats": [
      {
        "type": "external",
        "subtype": "http",
        "destination_service_resource": "example.com:443",
        "outcome": "failure",
        "count": 28,
        "duration.sum.us": 123456
      },
      {
        "type": "db",
        "subtype": "mysql",
        "destination_service_resource": "mysql",
        "outcome": "success",
        "count": 81,
        "duration.sum.us": 9876543
      }
    ]
  }
}
```

Signed-off-by: Marc Lopez Rubio <marc5.12@outlook.com>
@marclop marclop self-assigned this Sep 21, 2021
@mergify
Copy link
Copy Markdown
Contributor

mergify bot commented Sep 21, 2021

This pull request does not have a backport label. Could you fix it @marclop? 🙏
To fixup this pull request, you need to add the backport labels for the needed
branches, such as:

  • backport-7.x is the label to automatically backport to the 7.x branch.
  • backport-7./d is the label to automatically backport to the 7./d branch. /d is the digit

NOTE: backport-skip has been added to this pull request.

@mergify mergify bot added the backport-skip Skip notification from the automated backport with mergify label Sep 21, 2021
@marclop marclop added backport-7.x feature v7.16.0 and removed backport-skip Skip notification from the automated backport with mergify labels Sep 21, 2021
@ghost
Copy link
Copy Markdown

ghost commented Sep 21, 2021

💚 Build Succeeded

the below badges are clickable and redirect to their specific view in the CI or DOCS
Pipeline View Test View Changes Artifacts preview preview

Expand to view the summary

Build stats

  • Start Time: 2021-09-23T09:58:02.058+0000

  • Duration: 39 min 46 sec

  • Commit: 220e086

Test stats 🧪

Test Results
Failed 0
Passed 6137
Skipped 14
Total 6151

🤖 GitHub comments

To re-run your PR in the CI, just comment with:

  • /test : Re-trigger the build.

  • /hey-apm : Run the hey-apm benchmark.

  • /package : Generate and publish the docker images.

Adds a new system test `TestTransactionDroppedSpansStats` which ingests
the `testdata/intake-v2/transactions.ndjson` and checks that at least 3
documents are present in the backing Elasticsearch cluster.

Also fixes `testdata/intake-v2/transactions.ndjson` to have a unique
`transaction.id` and `trace.id` since it was colliding with the ids of
another event present in the test data.

Last, adjusts the number of transactions that the pipeline pytests are
waiting for from `5` to `6`.

Signed-off-by: Marc Lopez Rubio <marc5.12@outlook.com>
Comment thread systemtest/approvals.go
Comment thread systemtest/huge_traces_test.go
Comment thread systemtest/main_test.go
Signed-off-by: Marc Lopez Rubio <marc5.12@outlook.com>
@marclop marclop marked this pull request as ready for review September 22, 2021 06:06
@marclop marclop requested a review from a team September 22, 2021 06:06
Comment thread model/transaction.go Outdated
Comment on lines +165 to +166
Count *int
DurationSumUs *int // microseconds
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be nice if we could use the recently introduced model.AggregatedDuration hwere. I've asked about changing the fields to accommodate that: #5850 (comment)

Copy link
Copy Markdown
Member

@axw axw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great! Let's update the model to use model.AggregatedSum now that Felix has agreed

Comment thread model/modeldecoder/v2/model.go Outdated
Comment thread systemtest/approvals.go
Comment thread systemtest/huge_traces_test.go Outdated
Comment thread systemtest/main_test.go
Signed-off-by: Marc Lopez Rubio <marc5.12@outlook.com>
Copy link
Copy Markdown
Member

@axw axw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice! LGTM, just a couple of suggestions.

Comment thread x-pack/apm-server/aggregation/spanmetrics/aggregator_test.go Outdated
Comment thread x-pack/apm-server/aggregation/spanmetrics/aggregator_test.go Outdated
Comment thread beater/tracing_test.go Outdated
Signed-off-by: Marc Lopez Rubio <marc5.12@outlook.com>
Signed-off-by: Marc Lopez Rubio <marc5.12@outlook.com>
Comment thread changelogs/head.asciidoc
- HTTP server errors (e.g. TLS handshake errors) are now logged {pull}6141[6141]
- Span documents now duplicate extended HTTP fields, which were previously only under `span.http.*`, under `http.*` {pull}6147[6147]
- We now record the direct network peer for incoming requests as `source.ip` and `source.port`; origin IP is recorded in `client.ip` {pull}6152[6152]
- We now collect span destination metrics for transactions with too many spans (for example due to transaction_max_spans or exit_span_min_duration) when collected and sent by APM agents {pull}6200[6200]
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added 2 entries, but I can keep maybe this one only, I think they're both relevant, but maybe it's too much for this change.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personally I don't find much value in the "Intake API Changes" section, but I don't have a problem with it either. What you have added is fine.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Intake API changes are mainly ment to provide value to agent developers as reference which apm-server version supports which intake fields.

@marclop marclop merged commit 1213006 into master Sep 23, 2021
@marclop marclop deleted the f/add-transaction.dropped_spans_stats branch September 23, 2021 12:36
mergify bot pushed a commit that referenced this pull request Sep 23, 2021
Adds a new `transaction.dropped_spans_stats` optional field to the API,
which accepts an array of spans which were due to transaction_max_spans
or exit_span_min_duration being exceeded. Additionally, the metrics
aggregator for the `destination_service` metricset has been updated to
process transaction where `transaction.dropped_spans_stats > 0`.

```
{
  "transaction": {
    "dropped_spans_stats": [
      {
        "type": "external",
        "subtype": "http",
        "destination_service_resource": "example.com:443",
        "outcome": "failure",
        "duration": {
            "count": 28,
            "sum.us": 123456
        }
      },
      {
        "type": "db",
        "subtype": "mysql",
        "destination_service_resource": "mysql",
        "outcome": "success",
        "duration": {
            "count": 81,
            "sum.us": 9876543
        }
      }
    ]
  }
}
```

Signed-off-by: Marc Lopez Rubio <marc5.12@outlook.com>
(cherry picked from commit 1213006)

# Conflicts:
#	changelogs/head.asciidoc
marclop pushed a commit that referenced this pull request Sep 23, 2021
… (#6237)

Adds a new `transaction.dropped_spans_stats` optional field to the API,
which accepts an array of spans which were due to transaction_max_spans
or exit_span_min_duration being exceeded. Additionally, the metrics
aggregator for the `destination_service` metricset has been updated to
process transaction where `transaction.dropped_spans_stats > 0`.

```
{
  "transaction": {
    "dropped_spans_stats": [
      {
        "type": "external",
        "subtype": "http",
        "destination_service_resource": "example.com:443",
        "outcome": "failure",
        "duration": {
            "count": 28,
            "sum.us": 123456
        }
      },
      {
        "type": "db",
        "subtype": "mysql",
        "destination_service_resource": "mysql",
        "outcome": "success",
        "duration": {
            "count": 81,
            "sum.us": 9876543
        }
      }
    ]
  }
}
```

Signed-off-by: Marc Lopez Rubio <marc5.12@outlook.com>
(cherry picked from commit 1213006)
beniwohli added a commit to beniwohli/apm-agent-python that referenced this pull request Sep 23, 2021
beniwohli added a commit to elastic/apm-agent-python that referenced this pull request Oct 15, 2021
* move old-style to new-style span type conversion to capture_span

this allows us to use these values for dropped-span metrics without
having to store them on DroppedSpan, as they are available on
the capture_span object.

* move start/duration calculations to BaseSpan

for dropped-span metrics, we need to collect these values
for DroppedSpan as well, so we might as well remove the
duplication

* implement collection of dropped span statistics

* Update elasticapm/traces.py

Co-authored-by: Colton Myers <colton.myers@gmail.com>

* don't calculate and send dropped span statistics if server doesn't support it

* use correct format as defined in elastic/apm-server#6200

* remove type/subtype from dropped span statistics calculation

Co-authored-by: Colton Myers <colton.myers@gmail.com>
@marclop marclop removed their assignment Oct 25, 2021
@mergify
Copy link
Copy Markdown
Contributor

mergify bot commented Oct 25, 2021

This pull request does not have a backport label. Could you fix it @marclop? 🙏
To fixup this pull request, you need to add the backport labels for the needed
branches, such as:

  • backport-7.x is the label to automatically backport to the 7.x branch.
  • backport-7./d is the label to automatically backport to the 7./d branch. /d is the digit

NOTE: backport-skip has been added to this pull request.

@mergify mergify bot added the backport-skip Skip notification from the automated backport with mergify label Oct 25, 2021
@axw axw self-assigned this Oct 25, 2021
@axw
Copy link
Copy Markdown
Member

axw commented Oct 26, 2021

Verified with 7.16.0 BC1. I used the (unreleased) Go agent to generate some dropped spans:

package main

import (
        "time"

        "go.elastic.co/apm"
)

func main() {
        tracer := apm.DefaultTracer
        tracer.SetMaxSpans(1)
        tx := tracer.StartTransaction("name", "type")
        for i := 0; i < 10; i++ {
                span := tx.StartSpanOptions("name", "type", apm.SpanOptions{ExitSpan: true})
                span.Duration = 10 * time.Microsecond
                span.End()
        }
        tx.End()
        tracer.Flush(nil)
}

This results in 1 transaction and 1 span document. I confirmed that there are no new fields added to the transaction document relating to dropped spans. Finally, I checked that the dropped spans are included in the service destination metric document:

GET /metrics-apm*-*/_search
{
  "_source": false, 
  "fields": ["span.*"], 
  "query": {
    "term": {
      "metricset.name": {
        "value": "service_destination"
      }
    }
  }
}
{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 0.6931471,
    "hits" : [
      {
        "_index" : ".ds-metrics-apm.internal-default-2021.10.26-000001",
        "_type" : "_doc",
        "_id" : "5okru3wBjIGj2PVdXw7R",
        "_score" : 0.6931471,
        "fields" : {
          "span.destination.service.response_time.sum.us" : [
            100
          ],
          "span.destination.service.response_time.count" : [
            10
          ],
          "span.destination.service.resource" : [
            "type"
          ]
        }
      }
    ]
  }
}

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

backport-skip Skip notification from the automated backport with mergify feature test-plan test-plan-ok v7.16.0

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[spans dropped stats] Extend intake to support stats for dropped spans

3 participants