Skip to content

listener: Fix incorrect error status handling when failing to change netns#40933

Merged
tonya11en merged 5 commits intoenvoyproxy:mainfrom
tonya11en:fix-netns-result-check
Sep 23, 2025
Merged

listener: Fix incorrect error status handling when failing to change netns#40933
tonya11en merged 5 commits intoenvoyproxy:mainfrom
tonya11en:fix-netns-result-check

Conversation

@tonya11en
Copy link
Copy Markdown
Member

@tonya11en tonya11en commented Sep 2, 2025

When creating a listener socket in another network namespace, a nested absl::StatusOr<> is returned. The outer status shows the result of the attempt to switch network namespaces and the inner status is the result of the creation of a listener socket.

This patch fixes a bug where we checked the inner status when we should have been checking the outer status. This results in a segfault because the outer status is dereferenced without first checking if it was OK.

We now check the correct results and additional comments have been added to improve clarity of this code.

Risk Level: Low
Testing: New unit test and fuzz test case
Docs Changes: n/a
Release Notes: Added
Platform Specific Features: Linux only

When creating a listener socket in another network namespace, a nested
`absl::StatusOr<>` is returned. The outer status shows the result of the
attempt to switch network namespaces and the inner status is the result
of the creation of a listener socket.

This patch fixes a bug where we checked the inner status when we should
have been checking the outer status. This results in a segfault because
the outer status is dereferenced without first checking if it was OK.

We now check the correct results and additional comments have been added
to improve clarity of this code.

Signed-off-by: Tony Allen <txallen@google.com>
Signed-off-by: Tony Allen <txallen@google.com>
@tonya11en tonya11en requested a review from adisuissa September 2, 2025 19:32
@adisuissa adisuissa self-assigned this Sep 2, 2025
Copy link
Copy Markdown
Contributor

@adisuissa adisuissa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!
Overall LGTM, modulo additional test.

Signed-off-by: Tony Allen <txallen@google.com>
Signed-off-by: Tony Allen <txallen@google.com>
Signed-off-by: Tony Allen <txallen@google.com>
@tonya11en tonya11en requested a review from adisuissa September 2, 2025 22:05
Copy link
Copy Markdown
Contributor

@adisuissa adisuissa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks!
Assigning a senior-maintainer as this touches core code.
/assign-from @envoyproxy/senior-maintainers

}

TEST_F(ExecInNetnsTest, FailtoReturnToOriginalNetns) {
EXPECT_DEATH(
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Somewhat orthogonal to this PR: I'm not very familiar with the internal design, but I suggest considering alternatives to death in some of these cases. I assume this is not happening after the config ingestion, and maybe one can mark a listener as broken/unhealthy instead of killing the entire proxy.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This scenario is pretty bad, though. It means the actual OS thread is stuck in another network namespace and would need to be discarded. We would essentially lose the worker thread.

// that wraps the result of the function we pass in, which is another `absl::StatusOr`.
auto outer_result = Network::Utility::execInNetworkNamespace(fn, netns.value().c_str());

// We have a nested absl::StatusOr type. The "outer" result is the result of our attempt to jump
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess this could've been:

RETURN_IF_NOT_OK_REF(outer_result.status());
return outer_result.value();

but I like the additional comments that explain the outer/inner relationship.

@repokitteh-read-only
Copy link
Copy Markdown

@envoyproxy/senior-maintainers assignee is @zuercher

🐱

Caused by: a #40933 (review) was submitted by @adisuissa.

see: more, trace.

Copy link
Copy Markdown
Member

@zuercher zuercher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@tonya11en tonya11en merged commit 437946f into envoyproxy:main Sep 23, 2025
25 checks passed
@tonya11en tonya11en deleted the fix-netns-result-check branch September 23, 2025 16:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants