feat: support running k8s autodiscover suite for Beats PRs and local repositories by mdelapenya · Pull Request #1115 · elastic/e2e-testing

mdelapenya · 2021-04-30T18:44:37Z

What does this PR do?

It adds the same logic used in other suites to detect whether we are working with a build triggered by Beats (a merge commit or a PR), or whether a developer is using BEATS_LOCAL_PATH to run the tests against locally built artifacts.

With that in mind, if we are 1) using CI snapshots or 2) testing local artifacts, we will 1) download the TAR file representing the Docker image for a given commit, or 2) use the local filesystem to get a URI to the local file.
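As a rough illustration of the decision described above, here is a minimal sketch in Go. The `artifactSource` type and the `resolveArtifactSource` helper are hypothetical names, not the suite's actual code, and giving the local path precedence over CI snapshots is an assumption.

```go
package main

import (
	"fmt"
	"os"
)

// artifactSource is a hypothetical type describing where the Docker image
// TAR for a Beat comes from.
type artifactSource struct {
	Kind     string // "local", "ci-snapshot" or "release"
	Location string // local path or commit SHA; empty for "release"
}

// resolveArtifactSource is a hypothetical helper. It takes a getenv function
// so it can be exercised without touching the real environment.
// Assumption: BEATS_LOCAL_PATH wins over CI snapshots when both are set.
func resolveArtifactSource(getenv func(string) string) artifactSource {
	if path := getenv("BEATS_LOCAL_PATH"); path != "" {
		return artifactSource{Kind: "local", Location: path}
	}
	if getenv("BEATS_USE_CI_SNAPSHOTS") == "true" {
		if sha := getenv("GITHUB_CHECK_SHA1"); sha != "" {
			return artifactSource{Kind: "ci-snapshot", Location: sha}
		}
	}
	// Neither a local path nor a CI commit: fall back to a published version.
	return artifactSource{Kind: "release"}
}

func main() {
	src := resolveArtifactSource(os.Getenv)
	fmt.Printf("kind=%s location=%s\n", src.Kind, src.Location)
}
```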

Once we have the TAR.GZ file, we load it into the current Docker host, tag the image, and then load it into kind using the kind load docker-image command. This is needed because the k8s node provided by kind does not know anything about the images loaded into the Docker host. Thankfully, kind supports loading images in this manner.
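The load, tag and kind-load sequence could look roughly like the sketch below. The helper names, file name, image names and cluster name are all hypothetical; only the `docker load`, `docker tag` and `kind load docker-image` commands themselves are taken as given. The `main` function only prints the commands, so nothing is executed unless `run` is called.

```go
package main

import (
	"fmt"
	"os/exec"
	"strings"
)

// loadCommands is a hypothetical helper that builds the three CLI calls:
// load the TAR into Docker, re-tag the image, and copy it into the kind node.
func loadCommands(tarPath, loadedImage, targetImage, cluster string) [][]string {
	return [][]string{
		{"docker", "load", "--input", tarPath},
		{"docker", "tag", loadedImage, targetImage},
		{"kind", "load", "docker-image", targetImage, "--name", cluster},
	}
}

// run executes the commands in order and stops at the first failure.
func run(cmds [][]string) error {
	for _, c := range cmds {
		out, err := exec.Command(c[0], c[1:]...).CombinedOutput()
		if err != nil {
			return fmt.Errorf("%s failed: %w: %s", strings.Join(c, " "), err, out)
		}
	}
	return nil
}

func main() {
	// All values below are placeholders for illustration only.
	cmds := loadCommands(
		"filebeat-linux-amd64.docker.tar.gz",
		"example.invalid/beats/filebeat:loaded",
		"example.invalid/beats/filebeat:under-test",
		"kind",
	)
	for _, c := range cmds {
		fmt.Println(strings.Join(c, " ")) // dry run: print instead of executing
	}
}
```

Keeping command construction separate from execution makes the sequence easy to check without a Docker host or a kind cluster.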

Finally, we avoid downloading the binary from the bucket for every scenario by caching the Beat version for each Beat. This way later scenarios skip the download, tag and load process, reducing both the build time and the number of bytes downloaded from GCP Storage.
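The caching idea can be sketched as a small per-Beat map guarded by a mutex. The `versionCache` type is hypothetical and the `fetch` callback stands in for the whole download, tag and load step; the version string is a placeholder.

```go
package main

import (
	"fmt"
	"sync"
)

// versionCache is a hypothetical cache that remembers the resolved version
// for each Beat, so the expensive fetch runs once per Beat.
type versionCache struct {
	mu       sync.Mutex
	versions map[string]string
}

func newVersionCache() *versionCache {
	return &versionCache{versions: map[string]string{}}
}

// get returns the cached version for beat, calling fetch only on a miss.
// Failed fetches are not cached, so a later scenario can retry.
func (c *versionCache) get(beat string, fetch func() (string, error)) (string, error) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if v, ok := c.versions[beat]; ok {
		return v, nil
	}
	v, err := fetch()
	if err != nil {
		return "", err
	}
	c.versions[beat] = v
	return v, nil
}

func main() {
	cache := newVersionCache()
	calls := 0
	fetch := func() (string, error) {
		calls++ // stands in for download + tag + load
		return "placeholder-version", nil
	}
	for i := 0; i < 3; i++ { // three scenarios using the same Beat
		if _, err := cache.get("filebeat", fetch); err != nil {
			panic(err)
		}
	}
	fmt.Println("fetch calls:", calls)
}
```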

Why is it important?

This PR will allow Beats PRs and merges to run this test suite. Besides that, Beats developers will be able to run the e2e tests for k8s autodiscover from their local machines against their local binaries (Docker images produced by Beats' build system).

Checklist

My code follows the style guidelines of this project
I have commented my code, particularly in hard-to-understand areas
I have made corresponding changes to the documentation
I have made corresponding changes to the default configuration files
I have added tests that prove my fix is effective or that my feature works
I have run the Unit tests for the CLI, and they are passing locally
I have run the End-2-End tests for the suite I'm working on, and they are passing locally
I have noticed new Go dependencies (run make notice in the proper directory)

How to test this PR locally

Running the tests for the f1fea95e8d44b5b6c45a7d25026e6276f7248456 commit:

SUITE="kubernetes-autodiscover" TAGS="filebeat" TIMEOUT_FACTOR=3 BEATS_USE_CI_SNAPSHOTS=true GITHUB_CHECK_SHA1=f1fea95e8d44b5b6c45a7d25026e6276f7248456 LOG_LEVEL=TRACE ELASTIC_APM_ACTIVE=false DEVELOPER_MODE=true make -C e2e functional-test

Running the tests for a version:

SUITE="kubernetes-autodiscover" TAGS="filebeat" TIMEOUT_FACTOR=3 LOG_LEVEL=TRACE ELASTIC_APM_ACTIVE=false DEVELOPER_MODE=true make -C e2e functional-test

Related issues

Follow-ups

@adam-stokes we should simplify the usage of env vars. I can imagine removing BEATS_USE_CI_SNAPSHOTS: if GITHUB_CHECK_SHA1 is set, we are using CI snapshots :)

elasticmachine · 2021-04-30T18:49:41Z

💚 Build Succeeded


Build stats

Build Cause: Pull request #1115 updated
Start Time: 2021-05-03T12:19:56.733+0000
Duration: 21 min 35 sec
Commit: 7285dda

Test stats 🧪

Test	Results
Failed	0
Passed	152
Skipped	0
Total	152


💚 Flaky test report

Tests succeeded.


mdelapenya · 2021-04-30T18:54:44Z

e2e/_suites/kubernetes-autodiscover/autodiscover_test.go

-const defaultEventsWaitTimeout = 120 * time.Second
-const defaultDeployWaitTimeout = 120 * time.Second
+
+var defaultEventsWaitTimeout = 60 * time.Second


Using test framework default timeout factor of 3 (5 on CI)
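A minimal sketch of how a timeout factor might be applied, assuming the factor simply multiplies a base timeout. `envInt` is a hypothetical stand-in for the `shell.GetEnvInteger` call used in this PR, and `scaledTimeout` is likewise an illustrative name.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"time"
)

// envInt is a hypothetical stand-in for shell.GetEnvInteger: it returns the
// integer value of the variable, or the fallback when unset or not a number.
func envInt(name string, fallback int) int {
	v, err := strconv.Atoi(os.Getenv(name))
	if err != nil {
		return fallback
	}
	return v
}

// scaledTimeout multiplies a base timeout by the factor.
// Assumption: this is how the framework applies TIMEOUT_FACTOR.
func scaledTimeout(base time.Duration, factor int) time.Duration {
	return base * time.Duration(factor)
}

func main() {
	factor := envInt("TIMEOUT_FACTOR", 3) // 3 is the default mentioned above
	fmt.Println(scaledTimeout(60*time.Second, factor))
}
```

Under that assumption, a 60 second base with the default factor of 3 waits up to 3 minutes, and up to 5 minutes with the CI factor of 5.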


jsoriano

LGTM, thanks!

jsoriano · 2021-05-03T15:54:45Z

e2e/_suites/kubernetes-autodiscover/autodiscover_test.go

+	// initialise timeout factor
+	common.TimeoutFactor = shell.GetEnvInteger("TIMEOUT_FACTOR", common.TimeoutFactor)


Nit. Do this initialization in init() in the common package?

Mmm, but then it's not guaranteed that the init function will be called in the proper order, right? IIRC Go is not deterministic about the order of init functions.

If we still want to do so, I'd move this file's init code to the beforeSuite phase.

> Mmm, but then it's not guaranteed that the init function will be called in the proper order, right? IIRC Go is not deterministic about the order of init functions.

There are some ordering guarantees: for example, all the inits in an imported package are executed before continuing, and if a file has multiple inits, they are executed in order of definition.
In this case, if init() is defined in common, any package importing common will have this init() executed before running its own code.
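The within-package part of these guarantees can be seen in a single file: package-level variables are initialised first, then the init functions run in the order they appear. This is a standalone sketch, not code from the repository, and it does not cover the cross-package case, which needs a second package to demonstrate.

```go
package main

import "fmt"

// order records each initialisation step as it happens.
var order []string

// Package-level variables are initialised before any init function runs.
var factor = initialFactor()

func initialFactor() int {
	order = append(order, "var factor")
	return 3
}

// Multiple init functions in one file run in the order they are defined.
func init() {
	order = append(order, "first init")
}

func init() {
	order = append(order, "second init")
}

func main() {
	fmt.Println(order, factor) // prints "[var factor first init second init] 3"
}
```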

…repositories (elastic#1115)

* chore: add license
* chore: initialise configurations before test suite
* chore: use timeout_factor from env
* fix: tell kind to skip pulling beats images
* chore: add a method to load images into kind
* feat: support running k8s autodiscover for Beats PRs or local filesystem
* chore: add license header
* chore: expose logger and use it, simplifying initialisation
* fix: only run APM services for local APM environment
* Revert "chore: expose logger and use it, simplifying initialisation"
  This reverts commit a89325c.
* chore: log scenario name
* fix: always cache beat version for podName
* chore: reduce log level

Co-authored-by: Adam Stokes <51892+adam-stokes@users.noreply.github.com>





mdelapenya added 6 commits April 30, 2021 20:32

chore: add license

4b3cbcc

chore: initialise configurations before test suite

60fc387

chore: use timeout_factor from env

ba6a90b

fix: tell kind to skip pulling beats images

fb6b9c0

chore: add a method to load images into kind

a5b25f9

feat: support running k8s autodiscover for Beats PRs or local filesystem

f32a1fb

mdelapenya self-assigned this Apr 30, 2021

mdelapenya requested review from a team, ChrsMark and jsoriano April 30, 2021 18:44

chore: add license header

30b5d8d

mdelapenya commented Apr 30, 2021


chore: expose logger and use it, simplifying initialisation

a89325c

adam-stokes approved these changes Apr 30, 2021


mdelapenya and others added 4 commits May 1, 2021 08:45

fix: only run APM services for local APM environment

4b49b0b

Revert "chore: expose logger and use it, simplifying initialisation"

ca791be

This reverts commit a89325c.

Merge branch 'master' into k8s-autodiscover-beats

f9b97f8

chore: log scenario name

c2f9b1d

cachedout approved these changes May 3, 2021


fix: always cache beat version for podName

4e7c273

mdelapenya marked this pull request as ready for review May 3, 2021 09:58

mdelapenya commented May 3, 2021


mdelapenya and others added 2 commits May 3, 2021 12:30

chore: reduce log level

3861cd7

Merge branch 'master' into k8s-autodiscover-beats

7285dda

adam-stokes merged commit f258414 into elastic:master May 3, 2021

jsoriano reviewed May 3, 2021


mdelapenya deleted the k8s-autodiscover-beats branch May 4, 2021 08:47

mdelapenya mentioned this pull request May 4, 2021

chore: initialise timeout factor next to the declaration #1118

Merged

8 tasks

mdelapenya mentioned this pull request May 4, 2021

chore: backports for 7.x #1126

Merged

8 tasks

mdelapenya mentioned this pull request May 17, 2021

chore: backports for 7.13.x branch #1178

Merged

9 tasks

adam-stokes merged 15 commits into elastic:master from mdelapenya:k8s-autodiscover-beats


5 participants
