Support socks5h proxy using unix sockets#8668
Conversation
bagder
left a comment
There was a problem hiding this comment.
For tests, I think test case 1435 can serve as good inspiration as it already uses --unix-socket
lib/url.c
Outdated
There was a problem hiding this comment.
Is there any reason to limit this to SOCKS5_HOSTNAME and not allow SOCKS4a too?
There was a problem hiding this comment.
I thought it may not work as dns resolution is done locally in SOCKS4a using the same function. But after quick test, looks like it should work. I will update for all socks proxies.
There was a problem hiding this comment.
Updated to support all socks proxies and not depend on the scheme prefix in proxy string. Please confirm if implementation/syntax is ok. I will then add a test.
lib/url.c
Outdated
There was a problem hiding this comment.
The host name might start with that scheme, the type could also be set with a separate option and then the proxy name starts with the host name. This cannot assume the first 10 bytes are known.
I also find it a little unfortunate to reserve the unix host name like this, as there might actually be a proxy in use that is named like that. Maybe we should instead make the --unix-socket option accept a socks... prefix?
There was a problem hiding this comment.
The host name might start with that scheme, the type could also be set with a separate option and then the proxy name starts with the host name. This cannot assume the first 10 bytes are known.
I will update to handle that case.
there might actually be a proxy in use that is named like that.
In that case, unix/path cannot be a valid URI for a socks proxy as it can only have host:port syntax. Since I am checking for unix/, I think it should not affect anyone having a server named unix with a socks proxy running in it.
Maybe we should instead make the --unix-socket option accept a socks... prefix
I wish to have the ability to set the proxy via environment variable like ALL_PROXY=socks5h://unix/run/tor/socks, so that programs that internally use curl does not have to change the command line.
|
How about a syntax like this:
ALL_PROXY=socks5h:///unix/run/tor/socks
i.e. an extra slash, making the host part of the URL empty. This is similar to
how a file: URL would refer to the same file.
|
Unfortunately, the URL parser accepts one, two or three slashes where there should be two so three slashes will be accepted as if they were two: Another option is to use a short but otherwise illegal host name. For example just two dots ( |
|
On Sat, Apr 02, 2022 at 02:55:00PM -0700, Daniel Stenberg wrote:
Unfortunately, the URL parser accepts one, two or three slashes where there
should be two so three slashes will be accepted as if they were two:
That sounds wrong—it goes against RFC 1738 and RFC 8089, anyway. Is this a
WHATWG thing?
|
Yes. 😢 The lax nature of WHATWG's URL spec makes such URLs occasionally get used in the wild, so this is our adjustment to accept such URLs better.The WHATWG spec actually allows any amount of slashes (even zero), and both backslashes and slashes (mixed) but curl only allows regular forward slashes and only one, two or three of them. |
I looked at the test setup. I think the following changes are required along with writing a testcase.
Q: Where should the temporary socket be created and how/when should it be cleaned up? With last update, If the hostname is Please correct if I am missing anything. |
3a6e94b to
9c68b8e
Compare
|
@bagder Please take a look again.
I don't understand the torture test failure but it does not fail consistently. |
|
The torture test fail of 1467 means that there is a memory/resource leak somewhere in the exit path. |
bagder
left a comment
There was a problem hiding this comment.
There is no added documentation?
Reviewed again. Could not find the leak.. Can you please share instructions to reproduce failure locally? I am on a ubuntu desktop. Added documentation for the command line flags. |
|
The clue is torture tests. It's a special way to run the tests, which verifies the exit paths heavily.
This then makes all allocations in the test fail, one by one and it verifies that it never leaks memory in any case. The 1467 failure you got looked like this: It shows that escape.c:144 allocated a line that was never freed. It was when fallible function number 93 was made to fail. If you wanted reproduce that exact fail number, you can do this: I usually first try with -t to run all loops torture to see that it reproduces, it might not be the exact same number in a local build. Also, use -n to switch off valgrind since running with valgrind makes every loop take several times longer, which can make this test really slow. Using valgrind in the last step when you re-run a single number is very useful because valgrind can help you pinpoint the entire leak call stack much better. |
|
if you still have problems rerunning the test, I can have a go at it soon. |
Thank you. Was able to reproduce and fix it. Waiting for CI run. |
|
Fixed torture test and added documentation. Please review. curl/check failure does not show any logs or errors except for message "Build Failed". Please help understand the error. |
bagder
left a comment
There was a problem hiding this comment.
This looks very good and I'm ready to merge. Just one thing left that I can think of:
You added docs for the command line tool but not for the libcurl option that actually passes the string to libcurl. CURLOPT_PROXY. It needs to be similarly extended so that users can learn how to specify a socks proxy over unix domain socket with it.
Usage:
curl -x "socks5h://localhost/run/tor/socks" "https://example.com"
Updated runtests.pl to run a socksd server listening on unix socket
Added tests test1467 test1468
Added documentation for proxy command line option and socks proxy
options
Thank you. Updated CURLOPT_PROXY documentation. |
|
Thanks! |
Usage:
Ref: https://curl.se/mail/archive-2021-03/0012.html
This is a proof of concept to make sure it is working. Please take over from here or guide next steps - tests, documentation etc.,