-
Notifications
You must be signed in to change notification settings - Fork 166
[XrdHttp] Apply keepalive when redirecting HTTP clients #2290
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
abh3
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks pretty reasonable to me.; thanks @jthiltges; @ccaffy do you concurr?
|
@abh3 Looks like keeping CentOS 7 CI alive will not be so easy. Should we just drop it? I can check before releasing that everything still works on CentOS 7 and build the packages manually. |
|
Looks like there is a compiler mismatch. However, isn't it the case if we restrict ourselves to C++14 it will still work in CentOS17? If that is the case there is a trivial fix for that (I can put it in if we go this way) and hold off C++17 until the end of the year. How's that? |
|
No, the problem is that nodejs needed in CI is now broken on CentOS 7, take a look at the log of the failed build, it seems there's a problem with the version of libc it expects. |
|
Hi @jthiltges , Thanks for this contribution, for me it looks good as-is, I'll just try few things before approving. With which kind of setup did you try this ? Thanks again! |
|
Hi @ccaffy,
This PR came from investigating the uproot5 issue 1233 above. It includes a short reproducer. Our server environment has a redirector and multiple data servers. The uproot client is retrieving files from the redirector via HTTPS. The client makes many range requests in short succession, reusing established connections where possible. Each request starts with the redirector.
That's exactly it. The HTTP requests are handled by aiohttp, which keeps a connection pool. Changing XRootD to reply to redirects with Reproducing the uproot/aiohttp issue is a bit tricky due to timing issues. I was only able to reliably reproduce it when connecting from a client in the same datacenter. I'm guessing longer distances/delays allow the TCP stack to catch up and realize the connection is closed before the HTTP client reuses it. The keep-alive behavior is thankfully easier to see with curl. The client opens the connection, but finds it dead: With the patch, curl is able to reuse the connection: Regards, |
|
Thanks a lot for the nice explanations! |
ccaffy
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alright, I could test HTTP/TPC PULL/PUSH and HTTP GET/PUT,HEAD requests, all good!
|
It seems that this patch does not correct handle the case where the client does not want (or support) keep-alive: the connection is left open. I'm switching the PR to draft while investigating. I put this patch into production on our local redirector. It caused one of our HTTP monitoring checks to fail. Nagios check_http (also used by checkmk) uses HTTP 1.0 for requests, which does not support keep-alive. If the server keeps the connection open, the check eventually times out. Ex. |
|
I believe I've found the problem. The xrootd/src/XrdHttp/XrdHttpReq.cc Line 2783 in 8dbcf21
Then the following In this PR, I'd followed the pattern used elsewhere. E.g. xrootd/src/XrdHttp/XrdHttpReq.cc Lines 1044 to 1045 in 8dbcf21
I'd be happy to update the PR to correct the problem. How best to solve it? Save bool tmp_keepalive = keepalive;
reset();
return tmp_keepalive; |
|
It looks like using a temp variable is the simplest and thus best solution here. Anyway, the compiler will elide the variable. |
|
Oh very nice catch! OK then please update this PR with that fix and add a comment explaining why you have this temporary variable! Thanks again for this work! |
5761d2e to
389c797
Compare
|
@jthiltges Please push at least one tag (e.g. with |
Thank you. I'm hoping it's sorted out. Closing this PR and reopening to hopefully kick off the CI again. |
amadio
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's fine as is, but I suggested a small improvement. Please consider it. Thank you!
Leave connection open after redirection if keepalive is enabled
The reset function prepares the object for another request, including clearing the keepalive variable. Save the keepalive value when it's needed as a return value.
Co-authored-by: Guilherme Amadio <amadio@cern.ch>
dcfb2bf to
8ca30d6
Compare
Leave connection open after redirection if keepalive is enabled
At present, XRootD immediately closes HTTP connections when replying with a redirection, while still indicating keep-alive support in the reply. This is causing occasional problems for some HTTP clients, as they try to immediately reuse the connection, before the connection closure has been processed.
This change resolves the issue in local testing, but a thoughtful review would be appreciated to ensure connection reuse won't add other issues.