This repository was archived by the owner on Sep 17, 2024. It is now read-only.

Separate ES workload from Agent spec and make it req for the scenarios#2203

Merged
ChrsMark merged 6 commits into elastic:main from ChrsMark:add_readyness_for_es
Mar 14, 2022

Conversation

@ChrsMark
Member

@ChrsMark ChrsMark commented Mar 3, 2022

What does this PR do?

Separate ES workload from Agent spec and make it a requirement for the scenarios so as to ensure that deadlines because of download time are not hidden inside the scenario execution.

Signed-off-by: ChrsMark <chrismarkou92@gmail.com>
@ChrsMark ChrsMark requested a review from mdelapenya March 3, 2022 13:18
@ChrsMark ChrsMark self-assigned this Mar 3, 2022
@mergify
Contributor

mergify bot commented Mar 3, 2022

This pull request does not have a backport label. Could you fix it @ChrsMark? 🙏
To fix up this pull request, you need to add the backport labels for the needed
branches, such as:

  • backport-v\d.\d.\d is the label to automatically backport to the 7.\d branch, where \d is a digit.
    NOTE: backport-skip has been added to this pull request.

@mergify mergify bot added the backport-skip Skip notification from the automated backport with mergify label Mar 3, 2022
Signed-off-by: ChrsMark <chrismarkou92@gmail.com>
@elasticmachine
Contributor

elasticmachine commented Mar 3, 2022

💚 Build Succeeded

The badges below are clickable and redirect to their specific view in the CI or the docs.

Build stats

  • Start Time: 2022-03-14T11:51:00.503+0000

  • Duration: 44 min 54 sec

Test stats 🧪

Test Results: 0 failed, 276 passed, 0 skipped, 276 total

💚 Flaky test report

Tests succeeded.

🤖 GitHub comments

To re-run your PR in the CI, just comment with:

  • /test : Re-trigger the build.

@mdelapenya
Contributor

The errors are still there. If you think you need to wait before executing certain code, the tests allow you to wait for a certain condition, using a retry-with-backoff strategy or, more crudely, a Sleep.

@ChrsMark
Member Author

ChrsMark commented Mar 4, 2022

@mdelapenya I didn't have time to deal with it today, but the thing is that, running the suite locally, it doesn't fail, especially with the change applied in this PR. I believe the issue in general is that we don't properly wait until ES has started properly. Since the init container does not really seem to help here, do we have any specific alternative, leveraging the framework, to ensure that ES is up and running before starting the scenario? Something like:

Scenario: Logs collection from running pod
  Given "elastic-agent" is running with "logs generic"
    And "elasticsearch" is running
   When "a pod" is deployed
   Then "elastic-agent" collects events with "kubernetes.pod.name:a-pod"

@mdelapenya
Contributor

do we have any specific alternative leveraging the framework in order to ensure that ES is up and running before starting the scenario? Sth like:

Scenario: Logs collection from running pod
  Given "elastic-agent" is running with "logs generic"
    And "elasticsearch" is running
   When "a pod" is deployed
   Then "elastic-agent" collects events with "kubernetes.pod.name:a-pod"

Let me think about it 🤔 I'll fetch this branch and try locally

@mdelapenya
Contributor

I think you are right, we could decouple the elasticsearch k8s resources (configMap, deployment and service) and start them before the elastic agent is started.

I have two ideas:

  1. rephrase your scenario like this (elasticsearch first):
Scenario: Logs collection from running pod
  Given "elasticsearch" is running
    And "elastic-agent" is running with "logs generic"
   When "a pod" is deployed
   Then "elastic-agent" collects events with "kubernetes.pod.name:a-pod"
  2. set elasticsearch as a required resource, and install it in the ctx.BeforeSuite method, right after initialising the kind cluster. I like this approach the most, as it reflects a real scenario where the elasticsearch instance exists before any agent connects to it. OTOH, this corner case raises a good question for the agent team: what happens to the agent if elasticsearch is not there on time?

@ChrsMark
Member Author

ChrsMark commented Mar 4, 2022

Thanks for looking into this!

I think I had already tried option 1, but with no success. I will try to give it another shot.
Regarding option 2: will this mean that ES is installed once at the startup of the suite? If so, the various scenarios will share the same ES, which might mean that the data is also shared across runs. That is something we don't want, since we might find fields present from previous scenarios while the current scenario is actually failing. Is this concern valid?

@mdelapenya
Contributor

We may need to rewrite the scenario and leverage the different life cycle hooks:

  • move the elasticsearch installation to beforeSuite, so that it's started once
  • make sure the elasticsearch indices of interest are clean in the afterScenario, so that we are not adding dirty data affecting other tests
  • make sure the data is namespaced. Is it possible to rename the indices where the data is stored so that they do not collide in future, possibly parallel, tests?

Signed-off-by: ChrsMark <chrismarkou92@gmail.com>
@ChrsMark
Member Author

After checking all the options I will go with the basic one for now, which seems simpler. Changing the suite implementation seems a bit of an overkill to me, since we are going to re-use the same utils we already have, and they work well with the current implementation of the scenario. I still believe we can hit deadlines if the ES image takes too long to download, but I'm not sure how we can handle that better.

@ChrsMark ChrsMark changed the title Add init container in Agent Pod to verify ES readiness Split ES workload from Agent spec and make it req for the scenarios Mar 10, 2022
@ChrsMark ChrsMark changed the title Split ES workload from Agent spec and make it req for the scenarios Seperate ES workload from Agent spec and make it req for the scenarios Mar 10, 2022
@ChrsMark ChrsMark changed the title Seperate ES workload from Agent spec and make it req for the scenarios Separate ES workload from Agent spec and make it req for the scenarios Mar 10, 2022
Signed-off-by: ChrsMark <chrismarkou92@gmail.com>
Comment on lines +7 to +8
Given "elasticsearch" is running
And "elastic-agent" is running with "logs generic"
Contributor

We could have a single Given clause with:

Given "elastic-agent" is running with "logs generic"

and define a Background clause at the beginning of the feature file:

Background: Setting up elasticsearch
  Given "elasticsearch" is running

which will be run before every scenario. Therefore you can simplify each scenario by removing the initial "Given elasticsearch" step.
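Putting the suggestion together, the feature file could look like this (a sketch only; the Feature name is illustrative, and the step wording follows the snippets above):

```gherkin
Feature: Kubernetes autodiscover

  Background: Setting up elasticsearch
    Given "elasticsearch" is running

  Scenario: Logs collection from running pod
    Given "elastic-agent" is running with "logs generic"
    When "a pod" is deployed
    Then "elastic-agent" collects events with "kubernetes.pod.name:a-pod"
```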

Signed-off-by: ChrsMark <chrismarkou92@gmail.com>
Signed-off-by: ChrsMark <chrismarkou92@gmail.com>
@mdelapenya
Contributor

Great teamwork resolving the issue! Thanks for your time here.

@ChrsMark ChrsMark merged commit 1e49c28 into elastic:main Mar 14, 2022
@ChrsMark ChrsMark added the backport-v8.2.0 Automated backport with mergify label Mar 14, 2022
@ChrsMark ChrsMark added the backport-v8.1.0 Automated backport with mergify label Mar 14, 2022
@mergify mergify bot removed the backport-skip Skip notification from the automated backport with mergify label Mar 14, 2022
mergify bot pushed a commit that referenced this pull request Mar 14, 2022
@mdelapenya mdelapenya added backport-v8.0.0 Automated backport with mergify backport-v7.17.0 Automated backport with mergify labels Mar 14, 2022
@mdelapenya
Contributor

mdelapenya commented Mar 14, 2022

@Mergifyio refresh

mergify bot pushed a commit that referenced this pull request Mar 14, 2022
mergify bot pushed a commit that referenced this pull request Mar 14, 2022
mdelapenya added a commit that referenced this pull request Mar 15, 2022
…it req for the scenarios (#2232)

* Separate ES workload from Agent spec and make it req for the scenarios (#2203)

(cherry picked from commit 1e49c28)

* chore: fix end-of-file

Co-authored-by: Chris Mark <chrismarkou92@gmail.com>
Co-authored-by: Manuel de la Peña <mdelapenya@gmail.com>
ChrsMark added a commit that referenced this pull request Mar 15, 2022
…t req for the scenarios (#2230)

* Separate ES workload from Agent spec and make it req for the scenarios (#2203)

(cherry picked from commit 1e49c28)

* chore: fix end-of-file

* Add jq installation

Co-authored-by: Chris Mark <chrismarkou92@gmail.com>
Co-authored-by: Manuel de la Peña <mdelapenya@gmail.com>
mdelapenya added a commit that referenced this pull request Mar 15, 2022
…t req for the scenarios (#2231)

* Separate ES workload from Agent spec and make it req for the scenarios (#2203)

(cherry picked from commit 1e49c28)

* chore: fix end-of-file

Co-authored-by: Chris Mark <chrismarkou92@gmail.com>
Co-authored-by: Manuel de la Peña <mdelapenya@gmail.com>
mdelapenya added a commit that referenced this pull request Mar 17, 2022
* main: (268 commits)
  bump stack version 8.2.0-ff67d7b8 (#2242)
  ci: periodic builds for the last two minor versions (#2241)
  bump stack version 8.2.0-9bac538c (#2240)
  fix: add end-of-file line in templates (#2237)
  chore: add Christos to SSH users
  bump stack version 8.2.0-fee3b8d2 (#2234)
  Separate ES workload from Agent spec and make it req for the scenarios (#2203)
  bump stack version 8.2.0-d02c907a (#2228)
  bump stack version 8.2.0-63265ec9 (#2225)
  fix: wrong link (#2220)
  chore: define provider for each test suite (#2213)
  ci: provision e2-small workers (#2212)
  bump stack version 8.2.0-a12f7069 (#2209)
  chore: support for Oracle Linux 8 (#2208)
  bump stack version 8.2.0-190af159 (#2206)
  bump stack version 8.2.0-fdde08ec (#2200)
  bump stack version 8.2.0-43df679f (#2193)
  chore: use trace level in logs (#2130)
  bump stack version 8.2.0-bdf2ad74 (#2188)
  feat: support downloading project artifacts for the new bucket layout (#2172)
  ...
mdelapenya added a commit to mdelapenya/e2e-testing that referenced this pull request Mar 30, 2022
* main: (72 commits)
  fix: always quote variables in shell scripts (elastic#2284)
  chore: bring back system integration scenarios (elastic#2233)
  chore: do not notify build green for fleet nightly builds (elastic#2277)
  chore: include suite, platform and tags in test reports name (elastic#2273)
  bump stack version 8.2.0-dcff22d7 (elastic#2270)
  bump stack version 8.2.0-4509f321 (elastic#2265)
  chore: bump Godog to v0.12.4 (elastic#2245)
  bump stack version 8.2.0-5cc993c1 (elastic#2254)
  bump stack version 8.2.0-82b803a1 (elastic#2250)
  fix: never process elasticsearch CI artifacts (elastic#2246)
  bump stack version 8.2.0-ff67d7b8 (elastic#2242)
  ci: periodic builds for the last two minor versions (elastic#2241)
  bump stack version 8.2.0-9bac538c (elastic#2240)
  fix: add end-of-file line in templates (elastic#2237)
  chore: add Christos to SSH users
  bump stack version 8.2.0-fee3b8d2 (elastic#2234)
  Separate ES workload from Agent spec and make it req for the scenarios (elastic#2203)
  bump stack version 8.2.0-d02c907a (elastic#2228)
  bump stack version 8.2.0-63265ec9 (elastic#2225)
  fix: wrong link (elastic#2220)
  ...