This repository was archived by the owner on Sep 17, 2024. It is now read-only.

[7.x](backport #1153) feat: check how many processes of a process are running in the host #1154

Merged
mdelapenya merged 4 commits into 7.x from mergify/bp/7.x/pr-1153 on May 14, 2021

@mergify mergify bot commented May 10, 2021

This is an automatic backport of pull request #1153 done by Mergify.


Mergify commands and options

More conditions and actions can be found in the documentation.

You can also trigger Mergify actions by commenting on this pull request:

  • @Mergifyio refresh will re-evaluate the rules
  • @Mergifyio rebase will rebase this PR on its base branch
  • @Mergifyio update will merge the base branch into this PR
  • @Mergifyio backport <destination> will backport this PR on <destination> branch

Additionally, on Mergify dashboard you can:

  • look at your merge queues
  • generate the Mergify configuration with the config editor.

Finally, you can contact us on https://mergify.io/

…1153)

* fix: use docker's stdcopy to separate stdout from stderr

This will allow removing the initial bytes when reading outputs from command
execution in a container

* feat: support checking the number of occurrences of a process in a container

It uses pgrep to get all pids for a process, and then iterates through them
to get the runnable status for each pid. If the process must be started
in the host, then it will check that the pid is in the S status (to skip
zombie processes)

* fix: check for only one filebeat instance

* fix: check for empty response when listing agent's workdir

(cherry picked from commit 78a0d49)
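
The "initial bytes" mentioned in the first commit come from the multiplexed stream Docker returns for non-TTY command output: every chunk is prefixed with an 8-byte header (stream type, three zero bytes, big-endian payload length). In Go, stdcopy.StdCopy(dstout, dsterr, src) from github.com/docker/docker/pkg/stdcopy performs this demultiplexing. The following is only a dependency-free sketch of the same idea, not the code from this PR; the frame and demux helpers are hypothetical names introduced for illustration.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

const headerLen = 8

// frame builds one multiplexed chunk (hypothetical helper, for the demo only).
// Header layout: [stream type, 0, 0, 0, big-endian uint32 payload size].
func frame(stream byte, payload string) []byte {
	header := make([]byte, headerLen)
	header[0] = stream
	binary.BigEndian.PutUint32(header[4:], uint32(len(payload)))
	return append(header, payload...)
}

// demux splits an already-read multiplexed buffer into stdout and stderr,
// dropping the 8-byte headers. Stream type 1 is stdout, 2 is stderr.
func demux(raw []byte) (string, string, error) {
	var stdout, stderr []byte
	for len(raw) > 0 {
		if len(raw) < headerLen {
			return "", "", fmt.Errorf("truncated header: %d bytes left", len(raw))
		}
		stream := raw[0]
		size := binary.BigEndian.Uint32(raw[4:headerLen])
		raw = raw[headerLen:]
		if uint64(size) > uint64(len(raw)) {
			return "", "", fmt.Errorf("truncated payload: want %d bytes, have %d", size, len(raw))
		}
		payload := raw[:size]
		raw = raw[size:]
		switch stream {
		case 1:
			stdout = append(stdout, payload...)
		case 2:
			stderr = append(stderr, payload...)
		default:
			return "", "", fmt.Errorf("unexpected stream type %d", stream)
		}
	}
	return string(stdout), string(stderr), nil
}

func main() {
	// Simulated output of a command that wrote to both streams.
	raw := append(frame(1, "1234\n"), frame(2, "warn\n")...)
	stdout, stderr, err := demux(raw)
	if err != nil {
		panic(err)
	}
	fmt.Printf("stdout=%q stderr=%q\n", stdout, stderr)
}
```

Without this step, the header bytes end up at the start of the captured output, which breaks any later parsing of PIDs or process states.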

elasticmachine commented May 10, 2021

💚 Build Succeeded

Build stats

  • Build Cause: Pull request #1154 updated

  • Start Time: 2021-05-14T18:31:14.570+0000

  • Duration: 19 min 29 sec

  • Commit: 89ca286

Test stats 🧪

Test Results
Failed 0
Passed 159
Skipped 0
Total 159

💚 Flaky test report

Tests succeeded.

@mdelapenya

I will double-check why the tests find only 1 filebeat instance in 7.x. Do not merge until this is resolved.

When the "elastic-agent" process is in the "started" state on the host
Then the "filebeat" process is in the "started" state on the host
And the "metricbeat" process is in the "started" state on the host
Then there are "2" instances of the "filebeat" process in the "started" state

@nchaulet this scenario is passing in master (this PR is a backport from #1153) but failing for both 7.x and 7.13 (see #1155)

It's weird that the 4 scenarios that check for 2 filebeat instances are failing in both maintenance branches. The logs say that only one filebeat process is in the running state. We run "ps -q $PID -o state --no-headers" for each filebeat PID, waiting until it is in the S state, which is the one we observed in the containers.

From 'man ps':

			// D    uninterruptible sleep (usually IO)
			// R    running or runnable (on run queue)
			// S    interruptible sleep (waiting for an event to complete)
			// T    stopped by job control signal
			// t    stopped by debugger during the tracing
			// W    paging (not valid since the 2.6.xx kernel)
			// X    dead (should never be seen)
			// Z    defunct ("zombie") process, terminated but not reaped by its parent
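
As a rough sketch of the check described above (pgrep to list the PIDs, then one ps call per PID to read its state), assuming the commands run locally through os/exec rather than inside a container as the tests do. The names parsePids, countInState, and psState are hypothetical and not taken from the PR.

```go
package main

import (
	"fmt"
	"os/exec"
	"strings"
)

// stateFunc returns the one-character ps state (for example "S" or "Z") for a PID.
type stateFunc func(pid string) (string, error)

// parsePids splits pgrep output (one PID per line) into a slice of PIDs.
func parsePids(pgrepOutput string) []string {
	return strings.Fields(pgrepOutput)
}

// countInState counts the PIDs whose state equals the desired code.
// PIDs whose lookup fails are skipped: the process may have exited
// between the pgrep call and the ps call.
func countInState(pids []string, desired string, lookup stateFunc) int {
	count := 0
	for _, pid := range pids {
		state, err := lookup(pid)
		if err != nil {
			continue
		}
		if strings.TrimSpace(state) == desired {
			count++
		}
	}
	return count
}

// psState runs the command quoted in the comment above for a single PID.
func psState(pid string) (string, error) {
	out, err := exec.Command("ps", "-q", pid, "-o", "state", "--no-headers").Output()
	return string(out), err
}

func main() {
	name := "filebeat"
	out, err := exec.Command("pgrep", name).Output()
	if err != nil {
		// pgrep exits with a non-zero status when nothing matches.
		fmt.Printf("pgrep found no %s processes or failed: %v\n", name, err)
		return
	}
	n := countInState(parsePids(string(out)), "S", psState)
	fmt.Printf("%d %s instance(s) in the S state\n", n, name)
}
```

Passing the state lookup as a function keeps the counting logic testable with canned states, so a zombie (Z) entry can be shown to be excluded without starting real processes.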

Screenshot 2021-05-12 at 00 07 35

@EricDavisX I'm totally confused by this scenario in the maintenance branches. Do you see why it is failing here?

Sorry I wasn't as responsive the last 2 days; I was heads down triaging other critical fixes. I am back from the edge now. I think this is resolved though, yes?

Yep, we reduced the number of processes to 1 (FB and MB), but we did not change how the policy is created/assigned. How could the tests be notified about this kind of change (apart from being broken)?

@mdelapenya mdelapenya left a comment

As @ph suggested:

It's because it doesn't need to start another filebeat or metricbeat in this context; the policy only has the fleet-server integration.
If a system integration were added to the policy, you would get 2 FB/2 MB.

We are updating the expected number of filebeat instances from 2 to 1.

@mdelapenya mdelapenya merged commit 8b56088 into 7.x May 14, 2021
@mergify mergify bot deleted the mergify/bp/7.x/pr-1153 branch May 14, 2021 19:06