
Benchmark 2.0 production ready #7540

@simitt

Description

In #7216 @marclop worked on a POC for a new benchmarking tool for APM data ingestion. The POC leverages Opbeans services to create APM events, which the new Intake Receiver component records into NDJSON files. The apmbench loader reads from the NDJSON files and the runner sends the data to an APM Server, while also collecting statistics that are indexed into a dedicated Elasticsearch cluster. Find a more detailed description in #7216 (comment).

After successfully building the POC, all the components should be tied together in a maintainable way.

Goals

  • [state: on-track] New benchmark tooling is wired up to run automated benchmarks on a fixed set of testdata on a regular schedule
  • [state: moved] Can be integrated into the team's workflow when building new features or refactoring (ad-hoc and automated benchmarks)
  • [state: moved] Can easily be used by agent developers to create a new testdata set for a new agent version, for automated and ad-hoc benchmarks.
  • [state: moved] Can be used by engineers and developers outside the apm-server team for throughput analysis with a targeted throughput per unit.
  • [state: on-track] Update the processing & performance guide with new numbers

Tasks

Intake Receiver (Generation)

None for this milestone

APM Bench (Execution)

apmbench will read the captured events from intake-receiver and load them into memory so they can be replayed against the APM Server. We need to modify the existing binary so that we can measure changes to the APM Server over time.

ECE Test/ESS (Environment)

The new benchmarking framework is mainly going to use ESS testing regions to run the Elastic Stack with APM Server for the duration of the benchmarks; the deployment will be destroyed after the benchmark suite has finished.

This creates the opportunity to benchmark the APM Server with a very specific set of parameters and greatly facilitates ad-hoc benchmarking as well, since any APM Server developer can easily create the necessary pieces for the benchmarks, run them, and tear them down after they've finished. Additionally, we can leverage the Metricbeat metrics that ESS collects by default to proactively debug or monitor the benchmarks when required.

The goal is to store the necessary scripts and automation that create the infrastructure in the apm-server repo and run the benchmarks through a Jenkins job that is scheduled to run daily.

Analysis

The analysis part comprises loading the data into a remote, long-lived Elasticsearch cluster and building the dashboards that cover the needed use cases.

Automation

The automation part should only require setting up the right accounts and credentials for the automation jobs to create or access the required infrastructure.

Documentation

Ad Hoc
