Skip to content

Added skip audit/unenroll flag to uninstall command#6206

Merged
michalpristas merged 21 commits intoelastic:mainfrom
Rohit-code14:main
Dec 31, 2024
Merged

Added skip audit/unenroll flag to uninstall command#6206
michalpristas merged 21 commits intoelastic:mainfrom
Rohit-code14:main

Conversation

@Rohit-code14
Copy link
Copy Markdown
Contributor

@Rohit-code14 Rohit-code14 commented Dec 4, 2024

  • Enhancement

What does this PR do?

Added a flag to skip fleet audit/unenroll while uninstalling elastic-agent.

Why is it important?

While uninstalling elastic-agent it tries to notify fleet server about the uninstall. But in somecases users might know that the fleet server is unreachable and this notification logs multiple failures continuously. Therefore having an option to skip this would enhance the end user experience.

Checklist

  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • [ ] I have made corresponding change to the default configuration files
  • I have added tests that prove my fix is effective or that my feature works
  • I have added an entry in ./changelog/fragments using the changelog tool
  • [ ] I have added an integration test or an E2E test

Disruptive User Impact

No disruptive user impact

How to test this PR locally

  • Setup ES and Kibana.
  • Proceed with Fleetserver installation.
  • Build the agent locally by running SNAPSHOT=true PLATFORMS=linux/arm64 PACKAGES=tar.gz mage -v package
  • Extract the agent tar.
  • Go to Kibana -> Fleet page -> Add Agent gives installation command. Use the same to install elastic-agent.
  • After successful installation, Check in fleet page for the installed agent.
  • Kill the fleet servers process.
  • Try uninstalling elastic-agent using sudo elastic-agent uninstall
  • Now as the fleet server is unreachable, the notify calls will fail. You will get the error logs like below
    [== ] notify Fleet: network error: fail to notify audit/unenroll on fleet-server: all hosts failed: requester 0/1 to host https://192.168.1.10:8220/ errored: Post "https://192.168.1.10:8220/api/fleet/agents/a4888371-ef7b-4cc6-8df7-cc92b6c25990/audit/unenroll?": dial tcp 192.168.1.10:8220: connect:
  • Now restart the fleet server and install the elastic agent again.
  • After successful installation, kill the fleet server.
  • Now try uninstalling elastic-agent using sudo elastic-agent uninstall --skip-fleet-audit. The audit/unenroll call will be skipped and agent uninstall will happen without any errors.

Related issues

@mergify
Copy link
Copy Markdown
Contributor

mergify bot commented Dec 4, 2024

This pull request does not have a backport label. Could you fix it @Rohit-code14? 🙏
To fixup this pull request, you need to add the backport labels for the needed
branches, such as:

  • backport-./d./d is the label to automatically backport to the 8./d branch. /d is the digit

@mergify
Copy link
Copy Markdown
Contributor

mergify bot commented Dec 4, 2024

backport-v8.x has been added to help with the transition to the new branch 8.x.
If you don't need it please use backport-skip label and remove the backport-8.x label.

@mergify mergify bot added the backport-8.x Automated backport to the 8.x branch with mergify label Dec 4, 2024
@Rohit-code14 Rohit-code14 marked this pull request as ready for review December 4, 2024 03:59
@Rohit-code14 Rohit-code14 requested a review from a team as a code owner December 4, 2024 03:59
@pkoutsovasilis
Copy link
Copy Markdown
Contributor

/test

Copy link
Copy Markdown

@kaanyalti kaanyalti left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Rohit-code14 Could you please add tests relevant to what you are doing here? One suggestion is to inject notifyFleetAuditUninstall function. You can create a function type and create a mock with it using mockery which you can use in your test.

@Rohit-code14
Copy link
Copy Markdown
Contributor Author

@Rohit-code14 Could you please add tests relevant to what you are doing here? One suggestion is to inject notifyFleetAuditUninstall function. You can create a function type and create a mock with it using mockery which you can use in your test.

Hi @kaanyalti , Even if we inject notifyFleetAuditUninstall into Uninstall and try mocking the same to test it. To test it we need to directly invoke Uninstall function by passing the mock notify function during the test run. Which is not hard to do. I think we can refactor the pre-validations that are done to before invoking notifyFleetAuditUninstall function into a separate function and have a test case that mocks the new refactored function.
Something like this

func notifyFleetIfNeeded(ctx context.Context, log *logp.Logger, pt *progressbar.ProgressBar, cfg *configuration.Configuration, ai *info.AgentInfo, skipFleetAudit bool, notifyFleetAuditUninstall NotifyFleetAuditUninstall) {
// Once the root-cause is identified then this can be re-enabled on Windows.
if ai != nil && cfg != nil && !skipFleetAudit && runtime.GOOS != "windows" {
notifyFleetAuditUninstall(ctx, log, pt, cfg, ai) //nolint:errcheck // ignore the error as we can't act on it)
}
}

Please share your opinion on this

@kaanyalti
Copy link
Copy Markdown

@Rohit-code14 I am having a hard time visualizing your suggestion based on the example you provided. Can you either implement your suggestion or provide a more concrete example so it is clearer?

@Rohit-code14
Copy link
Copy Markdown
Contributor Author

@Rohit-code14 I am having a hard time visualizing your suggestion based on the example you provided. Can you either implement your suggestion or provide a more concrete example so it is clearer?

@kaanyalti I have added changes and testcase, kindly review the same.

Copy link
Copy Markdown

@kaanyalti kaanyalti left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tested this on mac and linux, works well. Changes look good to me.

@pkoutsovasilis
Copy link
Copy Markdown
Contributor

/test

@Rohit-code14
Copy link
Copy Markdown
Contributor Author

Hi @kaanyalti , I cannot see the reasons for these failures, Can you please help me with this?

@Rohit-code14
Copy link
Copy Markdown
Contributor Author

@Rohit-code14 Could you please resolve the merge conflicts?

Done!

@pkoutsovasilis
Copy link
Copy Markdown
Contributor

/test

@Rohit-code14
Copy link
Copy Markdown
Contributor Author

Quality Gate failed Quality Gate failed

Failed conditions 12.5% Coverage on New Code (required ≥ 40%)

See analysis details on SonarQube

Should anything be done from my side for this?

@kaanyalti
Copy link
Copy Markdown

@Rohit-code14 I'm going to look into why sonarqube is complaining, you may have to increase code coverage, although right now what you have implemented looks good to me.

@Rohit-code14
Copy link
Copy Markdown
Contributor Author

@Rohit-code14 I'm going to look into why sonarqube is complaining, you may have to increase code coverage, although right now what you have implemented looks good to me.

Okay let me know once you have checked the it.

Comment thread internal/pkg/agent/install/uninstall_test.go Outdated
@Rohit-code14
Copy link
Copy Markdown
Contributor Author

Rohit-code14 commented Dec 23, 2024

@kaanyalti @pkoutsovasilis

Please trigger tests for this.

@pkoutsovasilis
Copy link
Copy Markdown
Contributor

/test

@Rohit-code14
Copy link
Copy Markdown
Contributor Author

@pkoutsovasilis sorry to ping you again. One of the buildkite testcase failed. I have pulled changes from main. So please retrigger the tests once. And also the sonarqube coverage test failed once again how can we proceed with it?

@pkoutsovasilis
Copy link
Copy Markdown
Contributor

@pkoutsovasilis sorry to ping you again. One of the buildkite testcase failed. I have pulled changes from main. So please retrigger the tests once. And also the sonarqube coverage test failed once again how can we proceed with it?

No worries at all, and sorry for this getting stuck. We are currently discussing with the team to make the SonarQube step a non-required one, so hopefully, once that change is in place, your PR will be merged smoothly. I’ll retrigger the tests now. 🙂

@pkoutsovasilis
Copy link
Copy Markdown
Contributor

/test

@Rohit-code14
Copy link
Copy Markdown
Contributor Author

Rohit-code14 commented Dec 24, 2024

@pkoutsovasilis sorry to ping you again. One of the buildkite testcase failed. I have pulled changes from main. So please retrigger the tests once. And also the sonarqube coverage test failed once again how can we proceed with it?

No worries at all, and sorry for this getting stuck. We are currently discussing with the team to make the SonarQube step a non-required one, so hopefully, once that change is in place, your PR will be merged smoothly. I’ll retrigger the tests now. 🙂

Thanks @pkoutsovasilis . I am very much excited as it's my first PR. Looking forward on merging this.

Copy link
Copy Markdown
Contributor

@michalpristas michalpristas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for addressing changes in test code, looks much better now

@jlind23
Copy link
Copy Markdown
Contributor

jlind23 commented Dec 31, 2024

buildkite test it

@elastic-sonarqube
Copy link
Copy Markdown

Quality Gate failed Quality Gate failed

Failed conditions
12.5% Coverage on New Code (required ≥ 40%)

See analysis details on SonarQube

@michalpristas michalpristas merged commit 41e7499 into elastic:main Dec 31, 2024
@michalpristas
Copy link
Copy Markdown
Contributor

merged, many thanks for your contribution @Rohit-code14

mergify bot pushed a commit that referenced this pull request Dec 31, 2024
* Added skip audit/unenroll flag to uninstall command

* Added tests for skip-fleet-audit flag

* Added tests for skip-fleet-audit flag

* Added tests for skip-fleet-audit flag

* ran mage fmt

* Test all combinations for which fleet audit/unenroll will be skipped

---------

Co-authored-by: Michal Pristas <michal.pristas@gmail.com>
(cherry picked from commit 41e7499)
@Rohit-code14
Copy link
Copy Markdown
Contributor Author

merged, many thanks for your contribution @Rohit-code14

@kaanyalti @pkoutsovasilis @michalpristas
Thanks for helping me with this. Wishing you guy's Advance Happy New Year 🥳

jlind23 pushed a commit that referenced this pull request Dec 31, 2024
* Added skip audit/unenroll flag to uninstall command

* Added tests for skip-fleet-audit flag

* Added tests for skip-fleet-audit flag

* Added tests for skip-fleet-audit flag

* ran mage fmt

* Test all combinations for which fleet audit/unenroll will be skipped

---------

Co-authored-by: Michal Pristas <michal.pristas@gmail.com>
(cherry picked from commit 41e7499)

Co-authored-by: Rohit <59366992+Rohit-code14@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

backport-8.x Automated backport to the 8.x branch with mergify

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add skip audit/unenroll flag to uninstall command

6 participants