Fix consistent udp packet loss after the proxy read loop stopped #393

Merged
openshift-merge-bot[bot] merged 1 commit into containers:main from fatanugraha:main
Sep 18, 2024

@fatanugraha
Contributor

@fatanugraha fatanugraha commented Sep 3, 2024

Currently we never close the tcpip.Endpoint that we create when we get a *udp.ForwarderRequest. As a result, all packets sent from the same src ip:port after UDPProxy.Run returns are "dropped".

By closing the endpoint once UDPProxy.Run returns, we get a new forwarder request for the next packet from that address, so we can process new packets again.

Here's how to reproduce it:

  1. Reuse the same local address when sending UDP requests.
  2. Send one DNS request (succeeds).
  3. Wait until UDPProxy.Run returns (after 90s).
  4. Send one DNS request (fails).

Reproduction code:

package main

import (
	"context"
	"fmt"
	"net"
	"time"
)

func main() {
	r := &net.Resolver{
		PreferGo: true,
		Dial: func(ctx context.Context, network, address string) (net.Conn, error) {
			addr, err := net.ResolveUDPAddr("udp", "192.168.5.1:40001")
			if err != nil {
				panic(err)
			}

			d := net.Dialer{
				Timeout:   10 * time.Second,
				KeepAlive: -1,
				LocalAddr: addr,
			}

			conn, err := d.DialContext(ctx, network, "8.8.8.8:53")
			if err != nil {
				panic(err)
			}

			return conn, err
		},
	}

	lookup := func() {
		_, err := r.LookupIP(context.Background(), "ip4", "www.google.com")
		if err != nil {
			fmt.Println("err", err)
		} else {
			fmt.Println("ok")
		}
	}

	lookup()                     // ok
	time.Sleep(95 * time.Second) // wait for the UDPConnTimeout
	lookup()                     // this will fail
}

@fatanugraha
Contributor Author

/assign cfergeau

@fatanugraha
Contributor Author

/cc baude cfergeau

@openshift-ci openshift-ci bot requested review from baude and cfergeau September 9, 2024 06:02
@evidolob
Collaborator

@fatanugraha I was trying to test this PR. I ran the test that you provided and it works fine (I don't get any errors, just two ok). I tried that on macOS and Fedora 40.
So I was wondering, am I missing something?

@fatanugraha
Contributor Author

fatanugraha commented Sep 15, 2024

Hi @evidolob I've put more detailed reproduction steps here: https://github.com/fatanugraha/gvisor-tap-proxy-393

Do let me know if you have further questions 🙇

Attached are the debug logs from gvproxy. Notice that DNS queries from the same local address start failing after this log line is printed: DEBU[0122] Stopping udp proxy (read udp 8.8.8.8:53: i/o timeout)

gvproxy.log

capture.pcap.zip

Screenshot 2024-09-15 at 23 28 04

@evidolob evidolob self-requested a review September 17, 2024 11:38
@evidolob
Collaborator

@cfergeau I can verify that the problem described in this PR description exists, and that the PR indeed fixes it.

Signed-off-by: Fata Nugraha <fatanugraha@outlook.com>
@cfergeau
Collaborator

I force-pushed to the branch to fix a few typos in the comment.
/lgtm
/approve

@openshift-ci openshift-ci bot added the lgtm label Sep 18, 2024
@openshift-ci
Contributor

openshift-ci bot commented Sep 18, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: cfergeau, evidolob, fatanugraha


@openshift-merge-bot openshift-merge-bot bot merged commit ded1408 into containers:main Sep 18, 2024
@cfergeau
Collaborator

I wonder if this PR could help with #387 ? (dropping a note here as I can't test/look closely now)

@cfergeau
Collaborator

> I wonder if this PR could help with #387 ? (dropping a note here as I can't test/look closely now)

Yevhen tested this, and this does not help.
