Skip to content

Hubble: Emit v1.Event on packet drop#29565

Merged
rolinh merged 3 commits intocilium:mainfrom
robinelfrink:drop-event-emitter
Feb 22, 2024
Merged

Hubble: Emit v1.Event on packet drop#29565
rolinh merged 3 commits intocilium:mainfrom
robinelfrink:drop-event-emitter

Conversation

@robinelfrink
Copy link
Copy Markdown
Contributor

@robinelfrink robinelfrink commented Dec 2, 2023

This adds the option to hubble to emit v1.Events related to v1.Pods when a packet drop to or from that pod is detected.

Please ensure your pull request adheres to the following guidelines:

  • For first time contributors, read Submitting a pull request
  • All code is covered by unit and/or runtime tests where feasible.
  • All commits contain a well written commit description including a title,
    description and a Fixes: #XXX line if the commit addresses a particular
    GitHub issue.
  • If your commit description contains a Fixes: <commit-id> tag, then
    please add the commit author[s] as reviewer[s] to this issue.
  • All commits are signed off. See the section Developer’s Certificate of Origin
  • Provide a title or release-note blurb suitable for the release notes.
  • Are you a user of Cilium? Please add yourself to the Users doc
  • Thanks for contributing!

Fixes: #29399

Hubble now has an option to emit v1.Events related to pods on detection of packet drops.

@robinelfrink robinelfrink requested review from a team as code owners December 2, 2023 06:45
@maintainer-s-little-helper maintainer-s-little-helper bot added the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Dec 2, 2023
@github-actions github-actions bot added the kind/community-contribution This was a contribution made by a community member. label Dec 2, 2023
@robinelfrink
Copy link
Copy Markdown
Contributor Author

robinelfrink commented Dec 2, 2023

TODO:

  • Add tests
  • Add options to the Helm chart

@rolinh rolinh added release-note/minor This PR changes functionality that users may find relevant to operating Cilium. sig/hubble labels Dec 2, 2023
@maintainer-s-little-helper maintainer-s-little-helper bot removed the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Dec 2, 2023
@kaworu kaworu requested a review from rolinh December 4, 2023 10:26
@kaworu kaworu marked this pull request as draft December 4, 2023 10:26
@kaworu
Copy link
Copy Markdown
Member

kaworu commented Dec 4, 2023

Hello @robinelfrink 👋 and thank you for the PR!

I've marked it as Draft since you have some TODO left. Feel free to make it ready to review once you've completed them and rebased on top of main. Also pulling @rolinh to the review whom has context about #29399.

@aanm aanm added dont-merge/blocked Another PR must be merged before this one. dont-merge/wait-until-release Freeze window for current release is blocking non-bugfix PRs and removed dont-merge/blocked Another PR must be merged before this one. labels Dec 4, 2023
@maintainer-s-little-helper
Copy link
Copy Markdown

Commit b57d999 does not match "(?m)^Signed-off-by:".

Please follow instructions provided in https://docs.cilium.io/en/stable/contributing/development/contributing_guide/#developer-s-certificate-of-origin

@maintainer-s-little-helper maintainer-s-little-helper bot added the dont-merge/needs-sign-off The author needs to add signoff to their commits before merge. label Dec 5, 2023
@maintainer-s-little-helper
Copy link
Copy Markdown

Commit b57d999 does not match "(?m)^Signed-off-by:".

Please follow instructions provided in https://docs.cilium.io/en/stable/contributing/development/contributing_guide/#developer-s-certificate-of-origin

@maintainer-s-little-helper
Copy link
Copy Markdown

Commit d27e288 does not match "(?m)^Signed-off-by:".

Please follow instructions provided in https://docs.cilium.io/en/stable/contributing/development/contributing_guide/#developer-s-certificate-of-origin

@robinelfrink robinelfrink marked this pull request as ready for review December 5, 2023 20:39
@maintainer-s-little-helper maintainer-s-little-helper bot removed the dont-merge/needs-sign-off The author needs to add signoff to their commits before merge. label Dec 5, 2023
@robinelfrink robinelfrink requested a review from a team as a code owner December 6, 2023 10:18
@maintainer-s-little-helper
Copy link
Copy Markdown

Commits 4cb3eca, 999189a do not match "(?m)^Signed-off-by:".

Please follow instructions provided in https://docs.cilium.io/en/stable/contributing/development/contributing_guide/#developer-s-certificate-of-origin

@maintainer-s-little-helper maintainer-s-little-helper bot added the dont-merge/needs-sign-off The author needs to add signoff to their commits before merge. label Dec 10, 2023
@rolinh
Copy link
Copy Markdown
Member

rolinh commented Feb 1, 2024

CI failure seems legit, PR lgtm otherwise

┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃   FAIL  package: github.com/cilium/cilium/pkg/hubble/dropeventemitter   ┃
┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛
--- FAIL: TestEndpointToString (0.00s)
--- FAIL: TestEndpointToString/node (0.00s)
    dropeventemitter_test.go:47: 
        	Error Trace:	/home/travis/gopath/src/github.com/cilium/cilium/pkg/hubble/dropeventemitter/dropeventemitter_test.go:47
        	Error:      	Not equal: 
        	            	expected: "remote-node (1.2.3.4)"
        	            	actual  : "remote-node(1.2.3.4)"
                  	
        	            	Diff:
        	            	--- Expected
        	            	+++ Actual
        	            	@@ -1 +1 @@
        	            	-remote-node (1.2.3.4)
        	            	+remote-node(1.2.3.4)
        	Test:       	TestEndpointToString/node

Copy link
Copy Markdown
Contributor

@EricMountain EricMountain left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@robinelfrink
Copy link
Copy Markdown
Contributor Author

CI failure seems legit, PR lgtm otherwise

Ah, yes, fix is coming up.

Copy link
Copy Markdown
Member

@rolinh rolinh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks for your contribution Robin!

@rolinh
Copy link
Copy Markdown
Member

rolinh commented Feb 2, 2024

/test

Copy link
Copy Markdown
Member

@kaworu kaworu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @robinelfrink, couple of nitpicking about the Helm values but the patch LGTM now.

@robinelfrink
Copy link
Copy Markdown
Contributor Author

Rebased on a more recent main.

@rolinh
Copy link
Copy Markdown
Member

rolinh commented Feb 12, 2024

/test

This adds the option to hubble to emit v1.Events related to v1.Pods when
a packet drop to or from that pod is detected. By default a packet drop
with the same source/destination IP address results in an Event only
once per two minutes.

Fixes: #29399

Events will not currently show with `kubectl describe pod ...`; that
requires the event emitter to add the pod's `metadata.uid` which is
unknown at the time of emitting. Retrieving `metadata.uid` is considered
a too expensive call at this moment.

Signed-off-by: Robin Elfrink <robin@15augustus.nl>
Signed-off-by: Robin Elfrink <robin@15augustus.nl>
Signed-off-by: Robin Elfrink <robin.elfrink@eu.equinix.com>
@rolinh
Copy link
Copy Markdown
Member

rolinh commented Feb 14, 2024

/test

@robinelfrink
Copy link
Copy Markdown
Contributor Author

Hi @rolinh,

Is it safe to assume that the 3 failing checks are not related to my changes?

@rolinh
Copy link
Copy Markdown
Member

rolinh commented Feb 15, 2024

Hi @robinelfrink, it seems unlikely that these failures are related to your changes in this PR. I've re-triggered some tests to see if they reliably fail or are just flaky.

@ti-mo
Copy link
Copy Markdown
Contributor

ti-mo commented Feb 22, 2024

I've re-triggered ci-e2e again, hit a failure on an ingress test.

@rolinh
Copy link
Copy Markdown
Member

rolinh commented Feb 22, 2024

@danehans could you please give this PR a review?

EDIT: nevermind, someone else from sig-agent already gave a review so the PR is mergeable.

@rolinh rolinh added this pull request to the merge queue Feb 22, 2024
Merged via the queue into cilium:main with commit bcd3ec8 Feb 22, 2024
kaworu added a commit to kaworu/cilium that referenced this pull request Oct 3, 2024
This commit introduce a new Hubble cell. It is the first of a series of
refactoring patches gradually transitioning Hubble configuration and
startup code away from the Cilium daemon.

Currently, all the Hubble configuration and initialization is
implemented in the Cilium daemon. Hubble related flags (e.g.
--enable-hubble) are member of the DaemonConfig struct, and the Hubble
startup code is under daemon/cmd/hubble.go. Most notably the
launchHubble() func which handle all the Hubble component startup is
called from cmd.newDaemon().

Over time, launchHubble() has become increasingly complex, initializing
more and more Hubble subsystems while bailing out if any of them
encounter a configuration or startup error. This has lead to a situation
where a misconfiguration of one part of Hubble (e.g. metrics) could
effectively disable other Hubble components without crashing the Cilium
daemon.

The recent addition of many new Hubble feature / subsystems, for example
monitor event filtering[1], L7 data redaction[2], and k8s Events
integration on drop notifications[3] introducing new Hubble related
flags and logic both increase the potential of startup failure and the
surface of affected features.

Furthermore, most of the new feature increase Hubble's reliance on the
Cilium daemon for either flag setup/parsing and/or dependencies
injection, making it harder to decouple Hubble from the Daemon.

Fortunately, many part of Cilium already use the hive/cell framework for
configuration, startup, and dependency injection. Most notably, all
Hubble dependencies are now available as cell, which open the
possibility to convert Hubble components as cell themselves.

The goal of this patch series is to refactor Hubble into its own new
cell and decouple it from the Cilium daemon. Concretely, all the code
under daemon/cmd/hubble.go will be moved away into the new cell, along
with the flags and Hubble startup, until the point where it would be
possible to create a hive with the Hubble cell but without the Cilium
daemon.

Splitting the Hubble components into smaller cells is left for a later
time.

[1]: cilium#24828
[2]: cilium#23887
[3]: cilium#29565

Signed-off-by: Alexandre Perrin <alex@isovalent.com>
kaworu added a commit to kaworu/cilium that referenced this pull request Oct 4, 2024
This commit introduce a new Hubble cell. It is the first of a series of
refactoring patches gradually transitioning Hubble configuration and
startup code away from the Cilium daemon.

Currently, all the Hubble configuration and initialization is
implemented in the Cilium daemon. Hubble related flags (e.g.
--enable-hubble) are member of the DaemonConfig struct, and the Hubble
startup code is under daemon/cmd/hubble.go. Most notably the
launchHubble() func which handle all the Hubble component startup is
called from cmd.newDaemon().

Over time, launchHubble() has become increasingly complex, initializing
more and more Hubble subsystems while bailing out if any of them
encounter a configuration or startup error. This has lead to a situation
where a misconfiguration of one part of Hubble (e.g. metrics) could
effectively disable other Hubble components without crashing the Cilium
daemon.

The recent addition of many new Hubble feature / subsystems, for example
monitor event filtering[1], L7 data redaction[2], and k8s Events
integration on drop notifications[3] introducing new Hubble related
flags and logic both increase the potential of startup failure and the
surface of affected features.

Furthermore, most of the new feature increase Hubble's reliance on the
Cilium daemon for either flag setup/parsing and/or dependencies
injection, making it harder to decouple Hubble from the Daemon.

Fortunately, many part of Cilium already use the hive/cell framework for
configuration, startup, and dependency injection. Most notably, all
Hubble dependencies are now available as cell, which open the
possibility to convert Hubble components as cell themselves.

The goal of this patch series is to refactor Hubble into its own new
cell and decouple it from the Cilium daemon. Concretely, all the code
under daemon/cmd/hubble.go will be moved away into the new cell, along
with the flags and Hubble startup, until the point where it would be
possible to create a hive with the Hubble cell but without the Cilium
daemon.

Splitting the Hubble components into smaller cells is left for a later
time.

[1]: cilium#24828
[2]: cilium#23887
[3]: cilium#29565

Signed-off-by: Alexandre Perrin <alex@isovalent.com>
kaworu added a commit to kaworu/cilium that referenced this pull request Oct 4, 2024
This commit introduce a new Hubble cell. It is the first of a series of
refactoring patches gradually transitioning Hubble configuration and
startup code away from the Cilium daemon.

Currently, all the Hubble configuration and initialization is
implemented in the Cilium daemon. Hubble related flags (e.g.
--enable-hubble) are member of the DaemonConfig struct, and the Hubble
startup code is under daemon/cmd/hubble.go. Most notably the
launchHubble() func which handle all the Hubble component startup is
called from cmd.newDaemon().

Over time, launchHubble() has become increasingly complex, initializing
more and more Hubble subsystems while bailing out if any of them
encounter a configuration or startup error. This has lead to a situation
where a misconfiguration of one part of Hubble (e.g. metrics) could
effectively disable other Hubble components without crashing the Cilium
daemon.

The recent addition of many new Hubble feature / subsystems, for example
monitor event filtering[1], L7 data redaction[2], and k8s Events
integration on drop notifications[3] introducing new Hubble related
flags and logic both increase the potential of startup failure and the
surface of affected features.

Furthermore, most of the new feature increase Hubble's reliance on the
Cilium daemon for either flag setup/parsing and/or dependencies
injection, making it harder to decouple Hubble from the Daemon.

Fortunately, many part of Cilium already use the hive/cell framework for
configuration, startup, and dependency injection. Most notably, all
Hubble dependencies are now available as cell, which open the
possibility to convert Hubble components as cell themselves.

The goal of this patch series is to refactor Hubble into its own new
cell and decouple it from the Cilium daemon. Concretely, all the code
under daemon/cmd/hubble.go will be moved away into the new cell, along
with the flags and Hubble startup, until the point where it would be
possible to create a hive with the Hubble cell but without the Cilium
daemon.

Splitting the Hubble components into smaller cells is left for a later
time.

[1]: cilium#24828
[2]: cilium#23887
[3]: cilium#29565

Signed-off-by: Alexandre Perrin <alex@isovalent.com>
kaworu added a commit to kaworu/cilium that referenced this pull request Oct 4, 2024
This commit introduce a new Hubble cell. It is the first of a series of
refactoring patches gradually transitioning Hubble configuration and
startup code away from the Cilium daemon.

Currently, all the Hubble configuration and initialization is
implemented in the Cilium daemon. Hubble related flags (e.g.
--enable-hubble) are member of the DaemonConfig struct, and the Hubble
startup code is under daemon/cmd/hubble.go. Most notably the
launchHubble() func which handle all the Hubble component startup is
called from cmd.newDaemon().

Over time, launchHubble() has become increasingly complex, initializing
more and more Hubble subsystems while bailing out if any of them
encounter a configuration or startup error. This has lead to a situation
where a misconfiguration of one part of Hubble (e.g. metrics) could
effectively disable other Hubble components without crashing the Cilium
daemon.

The recent addition of many new Hubble feature / subsystems, for example
monitor event filtering[1], L7 data redaction[2], and k8s Events
integration on drop notifications[3] introducing new Hubble related
flags and logic both increase the potential of startup failure and the
surface of affected features.

Furthermore, most of the new feature increase Hubble's reliance on the
Cilium daemon for either flag setup/parsing and/or dependencies
injection, making it harder to decouple Hubble from the Daemon.

Fortunately, many part of Cilium already use the hive/cell framework for
configuration, startup, and dependency injection. Most notably, all
Hubble dependencies are now available as cell, which open the
possibility to convert Hubble components as cell themselves.

The goal of this patch series is to refactor Hubble into its own new
cell and decouple it from the Cilium daemon. Concretely, all the code
under daemon/cmd/hubble.go will be moved away into the new cell, along
with the flags and Hubble startup, until the point where it would be
possible to create a hive with the Hubble cell but without the Cilium
daemon.

Splitting the Hubble components into smaller cells is left for a later
time.

[1]: cilium#24828
[2]: cilium#23887
[3]: cilium#29565

Signed-off-by: Alexandre Perrin <alex@isovalent.com>
kaworu added a commit to kaworu/cilium that referenced this pull request Oct 4, 2024
This commit introduce a new Hubble cell. It is the first of a series of
refactoring patches gradually transitioning Hubble configuration and
startup code away from the Cilium daemon.

Currently, all the Hubble configuration and initialization is
implemented in the Cilium daemon. Hubble related flags (e.g.
--enable-hubble) are member of the DaemonConfig struct, and the Hubble
startup code is under daemon/cmd/hubble.go. Most notably the
launchHubble() func which handle all the Hubble component startup is
called from cmd.newDaemon().

Over time, launchHubble() has become increasingly complex, initializing
more and more Hubble subsystems while bailing out if any of them
encounter a configuration or startup error. This has lead to a situation
where a misconfiguration of one part of Hubble (e.g. metrics) could
effectively disable other Hubble components without crashing the Cilium
daemon.

The recent addition of many new Hubble feature / subsystems, for example
monitor event filtering[1], L7 data redaction[2], and k8s Events
integration on drop notifications[3] introducing new Hubble related
flags and logic both increase the potential of startup failure and the
surface of affected features.

Furthermore, most of the new feature increase Hubble's reliance on the
Cilium daemon for either flag setup/parsing and/or dependencies
injection, making it harder to decouple Hubble from the Daemon.

Fortunately, many part of Cilium already use the hive/cell framework for
configuration, startup, and dependency injection. Most notably, all
Hubble dependencies are now available as cell, which open the
possibility to convert Hubble components as cell themselves.

The goal of this patch series is to refactor Hubble into its own new
cell and decouple it from the Cilium daemon. Concretely, all the code
under daemon/cmd/hubble.go will be moved away into the new cell, along
with the flags and Hubble startup, until the point where it would be
possible to create a hive with the Hubble cell but without the Cilium
daemon.

Splitting the Hubble components into smaller cells is left for a later
time.

[1]: cilium#24828
[2]: cilium#23887
[3]: cilium#29565

Signed-off-by: Alexandre Perrin <alex@isovalent.com>
kaworu added a commit to kaworu/cilium that referenced this pull request Oct 7, 2024
This commit introduce a new Hubble cell. It is the first of a series of
refactoring patches gradually transitioning Hubble configuration and
startup code away from the Cilium daemon.

Currently, all the Hubble configuration and initialization is
implemented in the Cilium daemon. Hubble related flags (e.g.
--enable-hubble) are member of the DaemonConfig struct, and the Hubble
startup code is under daemon/cmd/hubble.go. Most notably the
launchHubble() func which handle all the Hubble component startup is
called from cmd.newDaemon().

Over time, launchHubble() has become increasingly complex, initializing
more and more Hubble subsystems while bailing out if any of them
encounter a configuration or startup error. This has lead to a situation
where a misconfiguration of one part of Hubble (e.g. metrics) could
effectively disable other Hubble components without crashing the Cilium
daemon.

The recent addition of many new Hubble feature / subsystems, for example
monitor event filtering[1], L7 data redaction[2], and k8s Events
integration on drop notifications[3] introducing new Hubble related
flags and logic both increase the potential of startup failure and the
surface of affected features.

Furthermore, most of the new feature increase Hubble's reliance on the
Cilium daemon for either flag setup/parsing and/or dependencies
injection, making it harder to decouple Hubble from the Daemon.

Fortunately, many part of Cilium already use the hive/cell framework for
configuration, startup, and dependency injection. Most notably, all
Hubble dependencies are now available as cell, which open the
possibility to convert Hubble components as cell themselves.

The goal of this patch series is to refactor Hubble into its own new
cell and decouple it from the Cilium daemon. Concretely, all the code
under daemon/cmd/hubble.go will be moved away into the new cell, along
with the flags and Hubble startup, until the point where it would be
possible to create a hive with the Hubble cell but without the Cilium
daemon.

Splitting the Hubble components into smaller cells is left for a later
time.

[1]: cilium#24828
[2]: cilium#23887
[3]: cilium#29565

Signed-off-by: Alexandre Perrin <alex@isovalent.com>
github-merge-queue bot pushed a commit that referenced this pull request Oct 9, 2024
This commit introduce a new Hubble cell. It is the first of a series of
refactoring patches gradually transitioning Hubble configuration and
startup code away from the Cilium daemon.

Currently, all the Hubble configuration and initialization is
implemented in the Cilium daemon. Hubble related flags (e.g.
--enable-hubble) are member of the DaemonConfig struct, and the Hubble
startup code is under daemon/cmd/hubble.go. Most notably the
launchHubble() func which handle all the Hubble component startup is
called from cmd.newDaemon().

Over time, launchHubble() has become increasingly complex, initializing
more and more Hubble subsystems while bailing out if any of them
encounter a configuration or startup error. This has lead to a situation
where a misconfiguration of one part of Hubble (e.g. metrics) could
effectively disable other Hubble components without crashing the Cilium
daemon.

The recent addition of many new Hubble feature / subsystems, for example
monitor event filtering[1], L7 data redaction[2], and k8s Events
integration on drop notifications[3] introducing new Hubble related
flags and logic both increase the potential of startup failure and the
surface of affected features.

Furthermore, most of the new feature increase Hubble's reliance on the
Cilium daemon for either flag setup/parsing and/or dependencies
injection, making it harder to decouple Hubble from the Daemon.

Fortunately, many part of Cilium already use the hive/cell framework for
configuration, startup, and dependency injection. Most notably, all
Hubble dependencies are now available as cell, which open the
possibility to convert Hubble components as cell themselves.

The goal of this patch series is to refactor Hubble into its own new
cell and decouple it from the Cilium daemon. Concretely, all the code
under daemon/cmd/hubble.go will be moved away into the new cell, along
with the flags and Hubble startup, until the point where it would be
possible to create a hive with the Hubble cell but without the Cilium
daemon.

Splitting the Hubble components into smaller cells is left for a later
time.

[1]: #24828
[2]: #23887
[3]: #29565

Signed-off-by: Alexandre Perrin <alex@isovalent.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

kind/community-contribution This was a contribution made by a community member. release-note/minor This PR changes functionality that users may find relevant to operating Cilium.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

CFP: Emit v1.Event on packet drop

10 participants