Call fleet-server audit/unenroll endpoint on uninstall by michel-laterman · Pull Request #5302 · elastic/elastic-agent

michel-laterman · 2024-08-14T22:43:16Z

What does this PR do?

Uninstalling a fleet-managed elastic-agent instance will now make a best-effort attempt to notify fleet-server of the agent's removal, so that the agent may not appear as offline in the UI.

Requires fleet-server PR elastic/fleet-server#3818 to be merged in order for integration tests to succeed.

Why is it important?

Uninstalling the agent leaves inactive (offline) entries in the UI that clutter up the list.
This is an attempt to treat these agents similarly to agents that are unenrolled.

Checklist

My code follows the style guidelines of this project
I have commented my code, particularly in hard-to-understand areas
~~I have made corresponding changes to the documentation~~
~~I have made corresponding change to the default configuration files~~
I have added tests that prove my fix is effective or that my feature works
I have added an entry in ./changelog/fragments using the changelog tool
I have added an integration test or an E2E test

Disruptive User Impact

No disruptive impact; the change is a best-effort API call.

How to test this PR locally

TODO

Related issues

Relates [Feature Request] Have Elastic Agent send a final message to its fleet server when making changes #484

mergify · 2024-08-14T22:43:58Z

This pull request does not have a backport label. Could you fix it @michel-laterman? 🙏
To fixup this pull request, you need to add the backport labels for the needed
branches, such as:

backport-v./d./d./d is the label to automatically backport to the 8./d branch. /d is the digit

NOTE: backport-skip has been added to this pull request.

mergify · 2024-08-27T13:24:47Z

This pull request is now in conflicts. Could you fix it? 🙏
To fix up this pull request, you can check it out locally. See documentation: https://help.github.com/articles/checking-out-pull-requests-locally/

git fetch upstream
git checkout -b uninstall-notify-fleet upstream/uninstall-notify-fleet
git merge upstream/main
git push upstream uninstall-notify-fleet

ycombinator · 2024-08-27T13:56:29Z

@michel-laterman now that elastic/fleet-server#3818 has been merged, are you able to make progress on this PR here?

elasticmachine · 2024-08-28T22:39:51Z

Pinging @elastic/elastic-agent-control-plane (Team:Elastic-Agent-Control-Plane)

pchila · 2024-08-30T12:33:24Z

+func (e *AuditUnenrollRequest) Validate() error {
+	if e.Timestamp.IsZero() {
+		return &ReqError{fmt.Errorf("request timestamp not set")}
+	}
+	switch e.Reason {
+	case ReasonUninstall:
+	default:
+		return &ReqError{fmt.Errorf("unsupported reason: %s", e.Reason)}
+	}
+	return nil
+}


Shouldn't this validation be performed on the server side? What happens if different versions of agent and fleet have different validations?

There is also request validation done server side; currently the agent's is more restrictive (only allowing one reason)

pchila · 2024-08-30T12:40:16Z

+			mux := http.NewServeMux()
+			path := fmt.Sprintf(auditUnenrollPath, agentInfo.AgentID())
+			mux.HandleFunc(path, authHandler(func(w http.ResponseWriter, r *http.Request) {
+				w.WriteHeader(http.StatusOK)


Why do we write a 200 OK header before validating the request body with the requires?
Maybe this handler can verify the request with asserts and return a 200 if everything is OK, and something else (maybe a 400) if something is not what we expect?

This is a dummy handler; I don't expect that we care about the response at the fleetapi level (as the uninstall function controls retries). I'll move the WriteHeader call to the last step.
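The reviewer's suggestion (check the request first, write the status last) could look roughly like the sketch below. It is not the test code from this PR: the JSON field names, the `/audit/unenroll` path, and the `probe` helper are hypothetical and exist only to keep the example self-contained.

```go
package main

import (
	"encoding/json"
	"net/http"
	"net/http/httptest"
	"strings"
)

// auditUnenrollBody mirrors the fields the test server expects.
// The JSON field names are assumptions for this sketch.
type auditUnenrollBody struct {
	Reason    string `json:"reason"`
	Timestamp string `json:"timestamp"`
}

// auditUnenrollHandler checks the request first and writes the success
// status only as the final step.
func auditUnenrollHandler(w http.ResponseWriter, r *http.Request) {
	var body auditUnenrollBody
	if err := json.NewDecoder(r.Body).Decode(&body); err != nil {
		http.Error(w, "invalid JSON body", http.StatusBadRequest)
		return
	}
	if body.Reason == "" || body.Timestamp == "" {
		http.Error(w, "missing reason or timestamp", http.StatusBadRequest)
		return
	}
	w.WriteHeader(http.StatusOK)
}

// probe sends payload to the handler in-process and returns the status code.
// The request path is a placeholder, not the real endpoint path.
func probe(payload string) int {
	req := httptest.NewRequest(http.MethodPost, "/audit/unenroll", strings.NewReader(payload))
	rec := httptest.NewRecorder()
	auditUnenrollHandler(rec, req)
	return rec.Code
}
```

With this ordering, a malformed request surfaces as a 400 in the test instead of being masked by an early 200.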

pchila · 2024-08-30T12:43:11Z

+	response, err := info.ESClient.Get(".fleet-agents", agentID, info.ESClient.Get.WithContext(ctx))
+	require.NoError(t, err)
+	defer response.Body.Close()
+	p, err := io.ReadAll(response.Body)
+	require.NoError(t, err)
+	require.Equalf(t, http.StatusOK, response.StatusCode, "ES status code expected 200, body: %s", p)
+	var res struct {
+		Source struct {
+			AuditUnenrolledReason string `json:"audit_unenrolled_reason"`
+		} `json:"_source"`
+	}


Isn't there a fleet endpoint to check the audit reason for the agent? I would prefer not to query a fleet index directly from agent...

In an earlier attempt I tried using info.KibanaClient.SendWithContext(ctx, http.MethodGet, "/api/fleet/agents/"+agentID, nil, nil, nil) to get the agent, however the fleet api's return information does not include the audit_unenrolled_reason attribute

I agree with @pchila that we should avoid directly querying ES indices from Agent. I've created elastic/kibana#194884 for GET /api/fleet/agents/:id to return audit_unenrolled_reason so we can update this test in the future to use that API when it's ready.

Also created #5694 to add a TODO comment here to make the switch when the Fleet API is ready.
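For readers unfamiliar with the shape of the document being checked, the sketch below decodes the `audit_unenrolled_reason` field from an Elasticsearch get-document response, as the integration test snippet does. The `agentDoc` type and `auditReason` function are hypothetical names, and any sample reason value passed to it is an assumption rather than something confirmed by this PR.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// agentDoc captures only the parts of an Elasticsearch get-document
// response that this check needs.
type agentDoc struct {
	Found  bool `json:"found"`
	Source struct {
		AuditUnenrolledReason string `json:"audit_unenrolled_reason"`
	} `json:"_source"`
}

// auditReason extracts audit_unenrolled_reason from the raw response body.
func auditReason(body []byte) (string, error) {
	var doc agentDoc
	if err := json.Unmarshal(body, &doc); err != nil {
		return "", fmt.Errorf("decode agent document: %w", err)
	}
	if !doc.Found {
		return "", fmt.Errorf("agent document not found")
	}
	return doc.Source.AuditUnenrolledReason, nil
}
```

Once the Fleet API returns this attribute, the same decoding step could be pointed at that response instead, which is the switch the follow-up TODO describes.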

blakerouse · 2024-08-30T15:14:11Z

+		case http.StatusOK:
+			pt.Describe("notify fleet-server success")
+			return
+		case http.StatusBadRequest, http.StatusUnauthorized, http.StatusConflict:


Why not retry on StatusConflict?

Added a comment on why each of these is not retryable.
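The status handling under discussion can be summarized as a small classification step. This sketch takes the success and non-retryable cases from the diff above; treating every other status as retryable is an assumption, and the `outcome` type and `classify` function are hypothetical names.

```go
package main

import "net/http"

// outcome describes what the caller should do after a response.
type outcome int

const (
	outcomeSuccess outcome = iota // notification accepted, stop
	outcomeGiveUp                 // retrying would not help, stop
	outcomeRetry                  // try again after a backoff
)

// classify maps an HTTP status code to the next action.
func classify(status int) outcome {
	switch status {
	case http.StatusOK:
		return outcomeSuccess
	case http.StatusBadRequest, http.StatusUnauthorized, http.StatusConflict:
		// Listed as non-retryable in the diff.
		return outcomeGiveUp
	default:
		// Assumption: anything else is treated as retryable.
		return outcomeRetry
	}
}
```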

mergify · 2024-09-10T21:10:02Z

backport-v8.x has been added to help with the transition to the new branch 8.x.

mergify · 2024-09-11T11:50:33Z

backport-v8.x has been added to help with the transition to the new branch 8.x.

cmacknz · 2024-10-02T20:13:16Z

+	}
+
+	if cfg != nil && !configuration.IsStandalone(cfg.Fleet) {
+		ai, err = info.NewAgentInfo(ctx, false)


The context here is set as ctx := context.Background(). The context should probably be an input to Uninstall and tied to the lifetime of the uninstall command.

If you are 30s into a retry loop for the uninstall endpoint does CTRL-C correctly cause it to terminate?

👍, now passing cmd.Context() to Uninstall.

elastic-sonarqube · 2024-10-03T15:51:36Z

Quality Gate passed

Issues
1 New issue
1 Fixed issue
0 Accepted issues

Measures
0 Security Hotspots
57.1% Coverage on New Code
0.0% Duplication on New Code

See analysis details on SonarQube

blakerouse

Looks like all the review comments have been resolved, and you have added cmd.Context() in the places required to allow Ctrl-C to work.

Looks good!

* Call fleet-server audit/unenroll endpoint on uninstall Uninstalling a fleet-managed elastic-agent instance will now do a best-effort attempt to notify fleet-server of the agent removal so the agent may not appear as offline. --------- Co-authored-by: Paolo Chilà <paolo.chila@elastic.co> Co-authored-by: Blake Rouse <blake.rouse@elastic.co> (cherry picked from commit 07c2a92)

* Call fleet-server audit/unenroll endpoint on uninstall Uninstalling a fleet-managed elastic-agent instance will now do a best-effort attempt to notify fleet-server of the agent removal so the agent may not appear as offline. --------- Co-authored-by: Paolo Chilà <paolo.chila@elastic.co> Co-authored-by: Blake Rouse <blake.rouse@elastic.co> (cherry picked from commit 07c2a92) Co-authored-by: Michel Laterman <82832767+michel-laterman@users.noreply.github.com>

michel-laterman added the enhancement and Team:Elastic-Agent-Control-Plane labels Aug 14, 2024

mergify bot assigned michel-laterman Aug 14, 2024

mergify bot added the backport-skip label Aug 14, 2024

Call fleet-server audit/unenroll endpoint on uninstall

bb4d789

Uninstalling a fleet-managed elastic-agent instance will now do a best-effort attempt to notify fleet-server of the agent removal so the agent may not appear as offline.

michel-laterman force-pushed the uninstall-notify-fleet branch from 34c721e to bb4d789 Compare August 14, 2024 22:44

michel-laterman and others added 4 commits August 19, 2024 09:34

Merge branch 'main' into uninstall-notify-fleet

4f10386

Fix goimports issue

af24deb

Merge branch 'main' into uninstall-notify-fleet

1a03876

fix integration test

5d5384f

michel-laterman and others added 6 commits August 27, 2024 09:34

fix typo

5b57d56

Merge branch 'main' into uninstall-notify-fleet

629f81e

Change to using ES client instead of Kibana client

1636de3

Add context to retrieval

1dbbdab

print installed agent info

93cb36f

fix integration test by using fleet-server enrollment

666f585

michel-laterman marked this pull request as ready for review August 28, 2024 22:39

michel-laterman requested a review from a team as a code owner August 28, 2024 22:39

michel-laterman requested review from blakerouse and pchila August 28, 2024 22:39

pchila reviewed Aug 30, 2024

blakerouse reviewed Aug 30, 2024

michel-laterman and others added 4 commits September 3, 2024 10:00

Apply suggestions from code review

46a146e

Co-authored-by: Paolo Chilà <paolo.chila@elastic.co> Co-authored-by: Blake Rouse <blake.rouse@elastic.co>

Review feedback changes

0368806

fix linter

1a45f1f

Merge branch 'main' into uninstall-notify-fleet

1fc4a3f

mergify bot added the backport-v8.x label Sep 10, 2024

mergify bot added the backport-8.x label Sep 11, 2024

v1v removed the backport-v8.x label Sep 11, 2024

michel-laterman and others added 6 commits September 20, 2024 11:05

Merge branch 'main' into uninstall-notify-fleet

0a3e011

Merge branch 'main' into uninstall-notify-fleet

38f47f5

add license headers

10ecf81

Merge branch 'main' into uninstall-notify-fleet

76a9396

add unit test, fix 401 handling

facba36

Fix typo

c7978d1

cmacknz reviewed Oct 2, 2024

Comment thread internal/pkg/agent/install/uninstall.go

cmacknz reviewed Oct 2, 2024

michel-laterman and others added 2 commits October 2, 2024 14:33

jitter backoff, and cancelable uninstall

aba7976

Merge branch 'main' into uninstall-notify-fleet

c1c9385

blakerouse approved these changes Oct 3, 2024

michel-laterman merged commit 07c2a92 into elastic:main Oct 3, 2024

michel-laterman deleted the uninstall-notify-fleet branch October 3, 2024 17:02

mergify bot mentioned this pull request Oct 3, 2024

[8.x](backport #5302) Call fleet-server audit/unenroll endpoint on uninstall #5688

Merged

This was referenced Oct 3, 2024

Add comment to replace ES query with API call #5694

Merged

[Feature Request] Have Elastic Agent send a final message to its fleet server when making changes #484

Open

mergify bot mentioned this pull request Oct 4, 2024

[8.x](backport #5694) Add comment to replace ES query with API call #5700

Merged

michel-laterman mentioned this pull request Oct 10, 2024

Add skip audit/unenroll flag to uninstall command #5757

Closed

ycombinator mentioned this pull request Oct 24, 2024

[Fleet] Show reason for Agent / Endpoint uninstallation elastic/kibana#197731

Closed

cmacknz mentioned this pull request Nov 18, 2024

Agent throws exception when uninstalling on windows #5952

Closed

7 participants
