Description
The new benchmarking framework will take a slightly different route from our current approach: it will leverage either ESS (preferred) or an on-demand ECE environment (ecetest) to run nightly benchmarks that allow us to track APM Server performance over time.
The lowest-effort option is to use a region in ESS where we can benchmark the APM Server's throughput, collect the results, and index them with `gobench` to a long-lived remote Elasticsearch cluster. If there are unforeseen limitations when using ESS to run the benchmarks, we can look into spinning up an on-demand ECE environment instead, but that is much costlier in terms of time, resources, and money.
Considerations
- Keep the benchmarks run by hey-apm for a while, and only decommission them after we're happy with the new approach.
- Run `apmbench` on a machine that is appropriate for the work and as close as possible to the workload (same CSP region).
- The hardware profile must be able to handle a high level of concurrency and provide decent network performance, since we'll be running `apmbench` with a medium to high number of agents.
Approach
Docker image tag
Since the Elastic Stack and APM Server will be running in ESS, the software must be packaged as Docker images. Building these images is out of scope for this issue, but to reduce the risk of running the benchmarks against an upstream version that doesn't completely work, we should have some guarantees in place and a vetting process for the "latest" version.
Since we already have a workflow that updates the Docker image for each of the APM Server's active branches, we could rely on the Docker image tags used in our `docker-compose.yml` file and specify the current image's tag as the Docker image to use in `<elasticsearch|kibana|apm>.config.docker_image` when creating the ESS deployment. See the Terraform provider acceptance test that uses `docker_image`.
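As a rough illustration, the deployment definition could look something like the sketch below, using the `elastic/ec` Terraform provider's `ec_deployment` resource. The region, deployment template, stack version, and image references are placeholders, not vetted values; only the `config.docker_image` override mechanism comes from the proposal above.

```hcl
# Sketch only: region, template, version, and image tags are assumptions.
resource "ec_deployment" "benchmark" {
  name                   = "apm-server-nightly-benchmark"
  region                 = "gcp-us-west2"            # placeholder region
  version                = "8.6.0"                   # placeholder stack version
  deployment_template_id = "gcp-compute-optimized"   # placeholder template

  elasticsearch {
    config {
      # Pin the exact image tag taken from docker-compose.yml
      docker_image = "docker.elastic.co/cloud-release/elasticsearch-cloud-ess:8.6.0"
    }
  }

  kibana {
    config {
      docker_image = "docker.elastic.co/cloud-release/kibana-cloud:8.6.0"
    }
  }

  apm {
    config {
      docker_image = "docker.elastic.co/cloud-release/elastic-agent-cloud:8.6.0"
    }
  }
}
```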
Deployment lifecycle
The most cost-effective and efficient approach is to create a new deployment in ESS and a new VM in the same region with the desired hardware profile for the `apmbench` runner, and upload the credentials that `apmbench` needs to connect to the deployment. After the benchmarks have run and the results have been uploaded to a persistent deployment where we'd store them, the benchmark deployment and the `apmbench` VM should be torn down to cut costs.
The Terraform configuration for the benchmark deployment could live in the APM Server repo, where it could also be used by APM Server developers who wish to benchmark their changes. A limitation, however, is that a cloud Docker image would need to be built and uploaded to allow that testing to take place.
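The runner VM could be declared in the same Terraform configuration so that one `terraform apply` / `terraform destroy` cycle manages the whole ephemeral setup. The sketch below assumes GCP purely as an example; the CSP, machine type, zone, and image are all assumptions to be replaced with whatever matches the chosen ESS region and hardware profile:

```hcl
# Sketch only: CSP, machine type, zone, and image are assumptions.
resource "google_compute_instance" "apmbench_runner" {
  name         = "apmbench-runner"
  machine_type = "c2-standard-16" # compute-optimized, for high agent concurrency
  zone         = "us-west2-a"     # keep in the same region as the ESS deployment

  boot_disk {
    initialize_params {
      image = "debian-cloud/debian-11"
    }
  }

  network_interface {
    network = "default"
    access_config {} # ephemeral public IP for provisioning apmbench + credentials
  }
}
```

Destroying the stack after the nightly run (`terraform destroy`) is what keeps the cost bounded, since neither the deployment nor the VM outlives the benchmark.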
Automation work