Collect aggregates from expvar metrices#8174
Conversation
|
This pull request does not have a backport label. Could you fix it @lahsivjar? 🙏
NOTE: |
e4ebb56 to
4753a44
Compare
4753a44 to
a0873dc
Compare
🌐 Coverage report
|
axw
left a comment
There was a problem hiding this comment.
@lahsivjar @marclop what do you think about making these metrics opt-in?
I fear the output will become unwieldy if we add so many metrics. Do you anticipate using these regularly, or only (interactively) after diagnosing a drop in throughput?
|
@axw I think that would be better and we should consider categorizing the metrics in essential and optional categories. This may improve the output and the mental strain of parsing all these number thrown at you. I should've added some of these ideas to the the original issue when I created it and start the discussion then. There are some metrics that aren't really that useful, particularly when the number is 0, one that comes to mind is Since we mimick the output of Of our custom metrics, we don't really need all of them by default, if we're just doing some quick benchmarks, it may be better to limit the number of other metrics that are printed out. I believe that it would be more effective to print less metrics by default and omit those that are 0. the output of these could be as below (the final number of metrics to be printed is yet TBD) In case that there are some error responses, we could report those, but they needn't be included in all the results: We can introduce a new flag that when set, includes all the metrics that we're recording to be displayed I think that the usefulness of the broken down
|
|
Thanks for the proposal @marclop, that all sounds good to me. |
|
@marclop What is your recommendation on the extra metrics for Also, regarding |
There was a problem hiding this comment.
Overall approach looks good. I left a few comments on the overall watcher and map structure.
What is your recommendation on the extra metrics for rss, goroutines, indexers, gc and heap? I see in the table that you have only kept gc cycles and max rss. Is that all we want to keep?
No, I would like to keep all the metrics proposed previously, I just added them to the table to exemplify the addition of the new metrics in that table without writing the entire list.
Also, regarding max_rss, I am not sure if that will be enough to derive a lot of data from. IIUC, max_rss will also include the memory not released to OS and it might be possible that max_rss remains same throughout the benchmark process especially if the same apm-server process is used in multiple benchmarks.
From what I've been able to observe, the beats RSS is updated fairly regularly. In the current approach, the collector is created on each benchmark, and will be confined to the lifetime of that benchmark, meaning that the max RSS reading will only be for that benchmark (even if a benchmark that ran before did push the RSS considerably, the max reading will may be affected by it).
To provide an updated table, I think we should record the following:
Current
| metric name | Required | Optional |
|---|---|---|
| Benchmark Name | x | - |
| Iterations | x | - |
| ns/op | x | - |
| B/op | x | - |
| allocs/op | x | - |
| events/sec | x | - |
| metrics/sec | - | x |
| spans/sec | - | x |
| errors/sec | - | x |
| txs/sec | - | x |
New
| metric name | Required | Optional |
|---|---|---|
| gc_cycles | x | - |
| max_rss | x | - |
| max_goroutines | - | x |
| max_heap_alloc | - | x |
| max_heap_objects | - | x |
| mean_available_indexers | - | x |
Like my previous comment mentions, I think we should add a new flag (-detailed or a similar name that makes sense) to control this behavior.
|
This pull request is now in conflicts. Could you fix it @lahsivjar? 🙏 |
Co-authored-by: Marc Lopez Rubio <marc5.12@outlook.com>
|
/test |
Motivation/summary
Collect aggregated metrics from the result of expvar. Currently collecting max, min and average.
Checklist
- [ ] Update CHANGELOG.asciidoc- [ ] Update package changelog.yml (only if changes toapmpackagehave been made)- [ ] Documentation has been updatedHow to test these changes
Run a local or cloud APM-Server
Run
apmbenchtest with-detailedflagObserve the following fields in the output:
gc_cycles,max_rss,max_goroutines,max_heap_alloc,max_heap_objectsandmean_available_indexersRelated issues
Closes #7817