intake: Handle transaction.dropped_spans_stats #6200
Adds a new optional `transaction.dropped_spans_stats` field to the API,
which accepts an array of statistics for spans that were dropped due to
transaction_max_spans or exit_span_min_duration being exceeded. Additionally,
the metrics aggregator for the `destination_service` metricset has been
updated to process transactions where `transaction.dropped_spans_stats` is non-empty.
```
{
  "transaction": {
    "dropped_spans_stats": [
      {
        "type": "external",
        "subtype": "http",
        "destination_service_resource": "example.com:443",
        "outcome": "failure",
        "count": 28,
        "duration.sum.us": 123456
      },
      {
        "type": "db",
        "subtype": "mysql",
        "destination_service_resource": "mysql",
        "outcome": "success",
        "count": 81,
        "duration.sum.us": 9876543
      }
    ]
  }
}
```
Signed-off-by: Marc Lopez Rubio <marc5.12@outlook.com>
This pull request does not have a backport label. Could you fix it @marclop? 🙏
Adds a new system test `TestTransactionDroppedSpansStats` which ingests `testdata/intake-v2/transactions.ndjson` and checks that at least 3 documents are present in the backing Elasticsearch cluster.

Also fixes `testdata/intake-v2/transactions.ndjson` to have a unique `transaction.id` and `trace.id`, since it was colliding with the ids of another event present in the test data.

Lastly, adjusts the number of transactions that the pipeline pytests wait for from `5` to `6`.

Signed-off-by: Marc Lopez Rubio <marc5.12@outlook.com>
    Count         *int
    DurationSumUs *int // microseconds
It would be nice if we could use the recently introduced model.AggregatedDuration here. I've asked about changing the fields to accommodate that: #5850 (comment)
axw left a comment
Looks great! Let's update the model to use model.AggregatedSum now that Felix has agreed
axw left a comment
Very nice! LGTM, just a couple of suggestions.
- HTTP server errors (e.g. TLS handshake errors) are now logged {pull}6141[6141]
- Span documents now duplicate extended HTTP fields, which were previously only under `span.http.*`, under `http.*` {pull}6147[6147]
- We now record the direct network peer for incoming requests as `source.ip` and `source.port`; origin IP is recorded in `client.ip` {pull}6152[6152]
- We now collect span destination metrics for transactions with too many spans (for example due to transaction_max_spans or exit_span_min_duration) when collected and sent by APM agents {pull}6200[6200]
I added 2 entries, but I could keep only this one. I think they're both relevant, but maybe it's too much for this change.
Personally I don't find much value in the "Intake API Changes" section, but I don't have a problem with it either. What you have added is fine.
The Intake API changes are mainly meant to provide value to agent developers, as a reference for which apm-server version supports which intake fields.
Adds a new optional `transaction.dropped_spans_stats` field to the API,
which accepts an array of statistics for spans that were dropped due to
transaction_max_spans or exit_span_min_duration being exceeded. Additionally,
the metrics aggregator for the `destination_service` metricset has been
updated to process transactions where `transaction.dropped_spans_stats` is non-empty.
```
{
  "transaction": {
    "dropped_spans_stats": [
      {
        "type": "external",
        "subtype": "http",
        "destination_service_resource": "example.com:443",
        "outcome": "failure",
        "duration": {
          "count": 28,
          "sum.us": 123456
        }
      },
      {
        "type": "db",
        "subtype": "mysql",
        "destination_service_resource": "mysql",
        "outcome": "success",
        "duration": {
          "count": 81,
          "sum.us": 9876543
        }
      }
    ]
  }
}
```
Signed-off-by: Marc Lopez Rubio <marc5.12@outlook.com>
(cherry picked from commit 1213006)
# Conflicts:
# changelogs/head.asciidoc
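As a rough illustration of the final payload shape (with the nested `duration` object), the sketch below decodes a `transaction.dropped_spans_stats` array and sums the counts and durations per destination resource and outcome. It is not the apm-server implementation: the Go type names, the `aggregate` helper, and the choice of grouping key are assumptions made for this example; only the JSON keys are taken from the commit message.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// droppedSpanStats mirrors one element of transaction.dropped_spans_stats.
// The Go names are hypothetical; only the JSON keys come from the payload.
type droppedSpanStats struct {
	Type                       string `json:"type"`
	Subtype                    string `json:"subtype"`
	DestinationServiceResource string `json:"destination_service_resource"`
	Outcome                    string `json:"outcome"`
	Duration                   struct {
		Count int64 `json:"count"`
		SumUs int64 `json:"sum.us"` // microseconds
	} `json:"duration"`
}

// event wraps the part of the intake document this sketch cares about.
type event struct {
	Transaction struct {
		DroppedSpansStats []droppedSpanStats `json:"dropped_spans_stats"`
	} `json:"transaction"`
}

// metricKey and metricValue are assumed grouping and accumulator types.
type metricKey struct{ Resource, Outcome string }
type metricValue struct{ Count, SumUs int64 }

// aggregate decodes the payload and sums count and duration per
// (destination resource, outcome) pair.
func aggregate(payload []byte) (map[metricKey]metricValue, error) {
	var ev event
	if err := json.Unmarshal(payload, &ev); err != nil {
		return nil, err
	}
	out := make(map[metricKey]metricValue)
	for _, s := range ev.Transaction.DroppedSpansStats {
		k := metricKey{Resource: s.DestinationServiceResource, Outcome: s.Outcome}
		v := out[k]
		v.Count += s.Duration.Count
		v.SumUs += s.Duration.SumUs
		out[k] = v
	}
	return out, nil
}

func main() {
	payload := []byte(`{"transaction":{"dropped_spans_stats":[
		{"type":"external","subtype":"http","destination_service_resource":"example.com:443",
		 "outcome":"failure","duration":{"count":28,"sum.us":123456}},
		{"type":"db","subtype":"mysql","destination_service_resource":"mysql",
		 "outcome":"success","duration":{"count":81,"sum.us":9876543}}]}}`)
	metrics, err := aggregate(payload)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", metrics[metricKey{Resource: "mysql", Outcome: "success"}])
}
```

An absent or empty array yields an empty map, which matches the field being optional.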
… (#6237)
(cherry picked from commit 1213006)
* move old-style to new-style span type conversion to capture_span

  this allows us to use these values for dropped-span metrics without having to store them on DroppedSpan, as they are available on the capture_span object.

* move start/duration calculations to BaseSpan

  for dropped-span metrics, we need to collect these values for DroppedSpan as well, so we might as well remove the duplication

* implement collection of dropped span statistics
* Update elasticapm/traces.py

  Co-authored-by: Colton Myers <colton.myers@gmail.com>

* don't calculate and send dropped span statistics if server doesn't support it
* use correct format as defined in elastic/apm-server#6200
* remove type/subtype from dropped span statistics calculation

Co-authored-by: Colton Myers <colton.myers@gmail.com>
Verified with 7.16.0 BC1. I used the (unreleased) Go agent to generate some dropped spans:

    package main

    import (
    	"time"

    	"go.elastic.co/apm"
    )

    func main() {
    	tracer := apm.DefaultTracer
    	tracer.SetMaxSpans(1)
    	tx := tracer.StartTransaction("name", "type")
    	for i := 0; i < 10; i++ {
    		span := tx.StartSpanOptions("name", "type", apm.SpanOptions{ExitSpan: true})
    		span.Duration = 10 * time.Microsecond
    		span.End()
    	}
    	tx.End()
    	tracer.Flush(nil)
    }

This results in 1 transaction and 1 span document. I confirmed that there are no new fields added to the transaction document relating to dropped spans. Finally, I checked that the dropped spans are included in the service destination metric document:

    {
      "took" : 0,
      "timed_out" : false,
      "_shards" : {
        "total" : 1,
        "successful" : 1,
        "skipped" : 0,
        "failed" : 0
      },
      "hits" : {
        "total" : {
          "value" : 1,
          "relation" : "eq"
        },
        "max_score" : 0.6931471,
        "hits" : [
          {
            "_index" : ".ds-metrics-apm.internal-default-2021.10.26-000001",
            "_type" : "_doc",
            "_id" : "5okru3wBjIGj2PVdXw7R",
            "_score" : 0.6931471,
            "fields" : {
              "span.destination.service.response_time.sum.us" : [
                100
              ],
              "span.destination.service.response_time.count" : [
                10
              ],
              "span.destination.service.resource" : [
                "type"
              ]
            }
          }
        ]
      }
    }
Motivation/summary
Adds a new optional `transaction.dropped_spans_stats` field to the API, which accepts an array of statistics for spans that were dropped due to transaction_max_spans or exit_span_min_duration being exceeded. Additionally, the metrics aggregator for the `destination_service` metricset has been updated to process transactions where `transaction.dropped_spans_stats` is non-empty.
Checklist
For functional changes, consider:
How to test these changes
TODO
Related issues
Closes #5850