Improve logging around Fleet connection errors #1154

@joshdover


Whenever we fail to connect to Fleet, or the connection drops before we get a response, on any API endpoint we should:

  • Include more detail about the connection
    • How long was the connection open?
    • When there's a DNS failure, log the DNS transaction
  • Include more detail about how Agent is going to handle the failure
    • Is the agent going to retry, and if so, how long from now?

We should also consider making these logs less alarming to users, since these failures are expected to happen from time to time. I propose that we move the current logging from warn to info, with language explaining that the failure is intermittent, and only log at the warn level if we've encountered the same issue in 3 of the last 5 attempts. Today the agent skips the first instance and logs only if the failure happens twice in a row.
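The "3 of the last 5 attempts" rule could be sketched as a small sliding window. This is an illustration only: `failureWindow` and its methods are hypothetical names, and the sketch counts every failure the same rather than checking that it is the "same issue".

```go
package main

// failureWindow remembers the outcome of the most recent attempts.
type failureWindow struct {
	size      int    // attempts to remember (5 in the proposal)
	threshold int    // failures needed to escalate to warn (3 in the proposal)
	outcomes  []bool // true means the attempt failed; oldest first
}

func newFailureWindow(size, threshold int) *failureWindow {
	return &failureWindow{size: size, threshold: threshold}
}

// record stores one attempt and returns the level to log it at:
// "" for a success, "info" for an isolated failure, "warn" once the
// number of failures in the window reaches the threshold.
func (w *failureWindow) record(failed bool) string {
	w.outcomes = append(w.outcomes, failed)
	if len(w.outcomes) > w.size {
		w.outcomes = w.outcomes[1:] // drop the oldest attempt
	}
	if !failed {
		return ""
	}
	failures := 0
	for _, f := range w.outcomes {
		if f {
			failures++
		}
	}
	if failures >= w.threshold {
		return "warn"
	}
	return "info"
}
```

With `newFailureWindow(5, 3)`, the first two failures are logged at info and a third failure within the same five attempts is logged at warn. Once older failures slide out of the window, a new failure drops back to info.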
