Skip to content

Fix the issue where loopback-bound sockets attempt to connect to a public IP#3531

Merged
stevenengler merged 14 commits intoshadow:mainfrom
magnified103:fix-local-bound
Mar 30, 2025
Merged

Fix the issue where loopback-bound sockets attempt to connect to a public IP#3531
stevenengler merged 14 commits intoshadow:mainfrom
magnified103:fix-local-bound

Conversation

@magnified103
Copy link
Copy Markdown
Contributor

@magnified103 magnified103 commented Mar 12, 2025

In the current sys_connect handler, sockets bound to localhost are not handled correctly.

// a connected tcp socket must be bound
let is_bound = unsafe { c::legacysocket_isBound(socket_ref.as_legacy_socket()) } == 1;
if !is_bound {
log::trace!("Implicitly binding listener socket");
// implicit bind: bind to an ephemeral port (use default interface unless the remote

Shadow does not check cases where it connects to an external address using a socket bound to 127.0.0.1, which later causes crashes when the worker fails to resolve the localhost address to a host.

// check if network reliability forces us to 'drop' the packet
let reliability: f64 = Worker::with(|w| w.shared.reliability(src_ip, dst_ip).unwrap())
.unwrap()

The same issue also occurs with UDP (and possibly in tcp.rs).

This PR brings a fix to this issue, although I think we'd want to mimic a routing table and perform these handling there.

@github-actions github-actions bot added Component: Testing Unit and integration tests and frameworks Component: Main Composing the core Shadow executable labels Mar 12, 2025
@magnified103
Copy link
Copy Markdown
Contributor Author

The current fix was based on two-month-old commits and passed all tests at that time. I'm still investigating why the tests are failing now.

@stevenengler
Copy link
Copy Markdown
Contributor

stevenengler commented Mar 12, 2025

The current fix was based on two-month-old commits and passed all tests at that time. I'm still investigating why the tests are failing now.

In the past month, we've migrated Shadow to the rust 2024 edition, the rustfmt 2024 edition, and are building with rust 1.85 in the CI now. There have also been some CI fixes due to a breaking change in rustup that may affect older commits that don't have the CI fixes.

You might need to rebase to get the fixes, and then run rustfmt and clippy to make sure they still pass.

@stevenengler
Copy link
Copy Markdown
Contributor

Also thanks for the PR! One additional thing we'll need is some tests in src/test/socket/ to ensure it follows Linux behaviour and so that we don't accidentally break it in the future. Probably in src/test/socket/send_recv/test_send_recv.rs. We have some documentation here at https://shadow.github.io/docs/guide/writing_tests.html#system-call-tests.

(Also see #3526.)

@magnified103 magnified103 changed the title Fix the issue where localhost-bounded sockets attempt to connect to an external address Fix the issue where loopback-bounded sockets attempt to connect to an public IP Mar 13, 2025
@magnified103 magnified103 changed the title Fix the issue where loopback-bounded sockets attempt to connect to an public IP Fix the issue where loopback-bound sockets attempt to connect to an public IP Mar 13, 2025
@magnified103 magnified103 marked this pull request as ready for review March 13, 2025 09:20
@magnified103
Copy link
Copy Markdown
Contributor Author

magnified103 commented Mar 13, 2025

This PR is ready for review.

I believe the new TCP implementation also has this issue. I attempted to fix it but couldn't write a test for the fix, as adding connect-new-tcp to the CMake config already causes existing tests to fail.

Additionally, I found a minor issue: calling connect() to a public IP on a socket bound to 127.0.0.1 returns EINVAL, but I believe it should return EISCONN. The EISCONN check is currently in the C code and needs to be moved out.

/* listening sockets can't connect */
if (tcp_isValidListener(tcp)) {
return -EISCONN;
}

Do you want me to include the fix for that issue within this PR?

@magnified103 magnified103 changed the title Fix the issue where loopback-bound sockets attempt to connect to an public IP Fix the issue where loopback-bound sockets attempt to connect to a public IP Mar 13, 2025
Copy link
Copy Markdown
Contributor

@stevenengler stevenengler left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the fixes and for the tests. This looks great! I just have one small change request, and then it's ready to merge.

@stevenengler
Copy link
Copy Markdown
Contributor

stevenengler commented Mar 23, 2025

I believe the new TCP implementation also has this issue. I attempted to fix it but couldn't write a test for the fix, as adding connect-new-tcp to the CMake config already causes existing tests to fail.

Don't worry about the new TCP implementation. It still needs lots of work, and once we eventually enable the tests, they should fail and we'll remember to fix this there too.

Additionally, I found a minor issue: calling connect() to a public IP on a socket bound to 127.0.0.1 returns EINVAL, but I believe it should return EISCONN. The EISCONN check is currently in the C code and needs to be moved out.

/* listening sockets can't connect */
if (tcp_isValidListener(tcp)) {
return -EISCONN;
}

Do you want me to include the fix for that issue within this PR?

If there's a bug here, feel free to fix this either in this MR, or a new MR.

But I'm not clear on what the issue is here. Doesn't the new test_loopback_bound_connect test you added check this case? The test checks that this returns EINVAL for both TCP and UDP, both in shadow and outside shadow?

This PR brings a fix to this issue, although I think we'd want to mimic a routing table and perform these handling there.

Agreed. Hopefully we get there some day :)

@stevenengler
Copy link
Copy Markdown
Contributor

I'm going to merge the PR as-is, and I'll do a follow up with the small changes. I hope you don't mind. I think this PR is good and I'd like to get this in to shadow.

@magnified103
Copy link
Copy Markdown
Contributor Author

I'm going to merge the PR as-is, and I'll do a follow up with the small changes. I hope you don't mind. I think this PR is good and I'd like to get this in to shadow.

Hi, sorry, I was a bit busy last week. I'll make the necessary changes today, and it'll be good.

@stevenengler
Copy link
Copy Markdown
Contributor

I'm going to merge the PR as-is, and I'll do a follow up with the small changes. I hope you don't mind. I think this PR is good and I'd like to get this in to shadow.

Hi, sorry, I was a bit busy last week. I'll make the necessary changes today, and it'll be good.

Okay no problem, I don't mean to rush you. I just figured the changes I mentioned would be easy for me to add. But if you want to do them then that's great too, whenever you get time.

@github-actions github-actions bot added the Component: Documentation In-repository documentation, under docs/ label Mar 30, 2025
@magnified103
Copy link
Copy Markdown
Contributor Author

I've added another fix and test for loopback-bound listening sockets: sys_connect should return EISCONN regardless of whether the connecting address is loopback or not.

Can you review the changes I just made?

Copy link
Copy Markdown
Contributor

@stevenengler stevenengler left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added another fix and test for loopback-bound listening sockets: sys_connect should return EISCONN regardless of whether the connecting address is loopback or not.

Can you review the changes I just made?

These look good, thanks!

@stevenengler stevenengler merged commit 24343d0 into shadow:main Mar 30, 2025
24 of 25 checks passed
stevenengler added a commit that referenced this pull request Mar 31, 2025
Previously a UDP socket that called `sendto()` would be implicitly bound
to an address on the same interface. So sending to 127.0.0.1 would cause
the socket to be bound to 127.0.0.1. Sending to 1.2.3.4 would cause the
socket to be bound to the public IP address of the host.

This PR fixes the behaviour so that regardless of the address used with
`sendto()`, the socket will always be bound to 0.0.0.0.

This, along with #3531, should fix #3526.
@magnified103 magnified103 deleted the fix-local-bound branch March 31, 2025 16:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Component: Documentation In-repository documentation, under docs/ Component: Main Composing the core Shadow executable Component: Testing Unit and integration tests and frameworks

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants