Skip to content

Conversation

@bbockelm
Copy link
Collaborator

This is a first attempt at "fixing" #2299. Instead of invoking a filesystem stat, then opening the path, it changes the GET code to always try the 'open' first.

If the open fails with an kXR_isDirectory (i.e., EISDIR), then the code falls back to doing a directory listing.

At the minimum, this will halve the number of stat requests the underlying storage will see as stat is done twice per request if the requests are sent sequentially. At best, it'll eliminate nearly all stats under high concurrency given the OFS will collapse the open files into a single handle and some OSS's will cache the results of the first fstat.

This state machine for GET is notoriously tricky so I'd appreciate a review by @ccaffy or @smithdh. Tests done by hand:

  • Download full file.
  • Download file in single Range request.
  • Download file in multiple Range requests.
  • GET on a directory.

@bbockelm
Copy link
Collaborator Author

@amadio - I see the following unit test fails:

  79: Test command: /__w/xrootd/xrootd/build/tests/XrdEc/xrdec-unit-tests "--gtest_filter=XrdEcTests.IllegalVectorReadTest" "--gtest_also_run_disabled_tests"
  79: Working Directory: /__w/xrootd/xrootd/build/tests/XrdEc
  79: Test timeout computed to be: 600
  79: Running main() from gtest_main.cc
  79: Note: Google Test filter = XrdEcTests.IllegalVectorReadTest
  79: [==========] Running 1 test from 1 test case.
  79: [----------] Global test environment set-up.
  79: [----------] 1 test from XrdEcTests
  79: [ RUN      ] XrdEcTests.IllegalVectorReadTest
   40/108 Test  #79: XrdEc::XrdEcTests.IllegalVectorReadTest ...................................***Exception: SegFault  0.20 sec

Is that expected to succeed? I'm having a hard time figuring out how changes exclusively in XrdHttp could cause this :/

@smithdh
Copy link
Contributor

smithdh commented Jul 24, 2024

Hi. I can reproduce an occasional segfault during XrdEcTests.IllegalVectorReadTest, e.g. using devel branch without this PR. So indeed it seems the crash is probably not related to this http change. (I'd have to confirm with Guilherme if he knows about any existing outstanding issue with the test).

@amadio
Copy link
Member

amadio commented Jul 25, 2024

@bbockelm @smithdh I confirm that this test fails sporadically in the master branch. I've looked at it before, but didn't manage to identify the root cause yet. It's a lifetime management issue of Zip archive objects. Now I'm back from vacations and will take a look at this and other issues. Feel free to ignore for now.

@ffurano
Copy link
Contributor

ffurano commented Sep 3, 2024

Hi! I'll be happy to review the change, which seems to make sense to me. I would appreciate a rebase though. Brian, could you please do it ?

@ffurano
Copy link
Contributor

ffurano commented Sep 12, 2024

I managed to make the directory listing hang with a sufficiently large directory

GET / HTTP/1.1
User-Agent: libdavix/0.8.5 neon/0.0.29
Keep-Alive:
Connection: Keep-Alive
TE: trailers
Host: littlexrdhttp.cern.ch:1094

Server just says:
240912 10:03:44 79607 XrootdBridge: unknown.149234:27@pcitsdcfab.dyndns login as nobody
240912 10:03:44 79607 ofs_open: unknown.149234:27@pcitsdcfab.dyndns Unable to open /; is a directory
240912 10:04:38 79595 XrootdXeq: unknown.149234:27@pcitsdcfab.dyndns disc 0:00:54

@smithdh
Copy link
Contributor

smithdh commented Sep 13, 2024

H. (Adding a comment as I'd said something in person during the workshop, of a possible reason for Fabrizio's observation of hang) - I had in mind an existing bad behaviour during header parsing (my last comment on #2072 ). However reading Fabrizio's description of the problem, that issue may not be relevant.

@bbockelm
Copy link
Collaborator Author

@ffurano -- for recreating the failure you observe, what's the approximate size of directory for your directory listing?

@ffurano
Copy link
Contributor

ffurano commented Sep 13, 2024

mine is 50k entries, and interestingly enough it enters a dodgy CPU-eating mode. However I think I have found the fix, it will likely come today after I finish testing.

@bbockelm
Copy link
Collaborator Author

FWIW -- I created 100k files in a directory doing the following:

for num in {1..100000}; do echo "hello world $num" > /tmp/xrootd/public/listdir/hello.txt.listdir.$num; done

and then invoked

for num in {1..10}; do curl -v -H 'depth: 1' -X PROPFIND http://localhost:59906/public/listdir/ > /dev/null; done

I didn't see any obvious failures. I did notice that the PROPFIND response buffers the entire 50MB response string in memory until all 100k entries were processed, then sent it at once. So, if you're doing something like a directory with 4M files in it, you need at least 20GB of memory for the response. (That's something that seems worth fixing but outside the purview of this particular PR.)

@smithdh mentioned he has previously seen something like this so I'm summoning him on this ticket.

@ffurano
Copy link
Contributor

ffurano commented Sep 13, 2024

The issue appears with GET, not PROPFIND

@ffurano
Copy link
Contributor

ffurano commented Sep 13, 2024

The fix for this hang is in
bbockelm#8

In general I like the idea behind this PR very much, and I like also the refactoring bits that it comes with. I'll test a bit further

@bbockelm
Copy link
Collaborator Author

The latest force-push includes @ffurano's fix for directory listing.

Copy link
Contributor

@ffurano ffurano left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To me the thing works and could provide a measurable benefit in some use cases.

Since the mod is based on a correct kXR_isDir behaviour, it may be subjected to the correct implementation of other Xrootd plugins. I would suggest someone else to test this with an xrootd in caching mode, whose behaviour and use cases I don't entirely master.

@amadio
Copy link
Member

amadio commented Sep 17, 2024

@osschar Would you mind giving this a go when you have time?

@bbockelm
Copy link
Collaborator Author

bbockelm commented Oct 3, 2024

@osschar - if you want an endpoint to test, this patch should be on all the OSDF cache servers (I believe Diego has CMS files present if you want CMS files to test). Alternatively, I can give you an RPM if you'd like.

@bbockelm
Copy link
Collaborator Author

FWIW -- I've been exclusively testing and running this branch on all the OSDF caches (so, around 1PB a day of data delivery). I think it's in pretty good shape for that.

@amadio
Copy link
Member

amadio commented Oct 25, 2024

Yes, the consensus on our side is that this is ready too. I just need to have the time to review it before merging. I will likely merge in the coming week. Thanks!

Copy link
Member

@amadio amadio left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bbockelm Thanks for these changes, they look good. I'd like to request a few changes, however, in how the commits are structured. I think it makes sense to put first a commit adding sendFooterError and refactoring current code to use it where it makes sense, then add another commit refactoring the listing into the newly added PostProcessListing (including @ffurano's fix in the initial version, and with a Reviewed-by or Co-authored-by tag to acknowledge it). Then the commit refactoring the handling of GET requests to avoid calling stat. This will make it much easier to understand later. Thank you!

if (!prot->Bridge->Run((char *) &xrdreq, (char *) res.c_str(), l)) {
prot->SendSimpleResp(404, NULL, NULL, (char *) "Could not run request.", 0, false);
return -1;
XrdHttpReq::mapXrdErrorToHttpStatus();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can use just mapXrdErrorToHttpStatus(); here, as we are inside XrdHttpReq anyway.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed in the latest commit (plus other, similar locations).

{
// Ugly hack. Be careful with EOS! Test with vanilla XrdHTTP and EOS, separately
// A 404 on the preliminary stat() is fatal only
// in a manager. A non-manager will ignore the result and try anyway to open the file
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ccaffy I wonder if the changes introduced here will affect EOS, do you know? If that's possible, we should test this with EOS to make sure nothing there is relying on the previous behavior of stat + (open | dirlist).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would indeed appreciate a test from the EOS side as well!

@abh3
Copy link
Member

abh3 commented Nov 2, 2024

Can we get this pull request resolved? It is now blocking work on pull request 2348 as any changes in that one rely on the changes in this pull request (i.e. returning the correct "AGE" response). It's not clear what all the holdups are, one certainly is that @bbockelm needs to respond to some change requests the other is a concern from @amadio, that said can we get some traction on this?

@amadio amadio linked an issue Nov 6, 2024 that may be closed by this pull request
bbockelm and others added 3 commits November 10, 2024 18:14
The original GET implementation always started with a standalone `stat`
call - whose results were only used to determine whether to invoke
the directory listing code.

This changes the GET to try to open the path by default and, if it
is either a directory descriptor (or fails with EISDIR), then try
doing the directory listing.

By avoiding the standalone stat, we significantly reduce the number
of filesystem calls done under high concurrency (as the OFS code will
collapse all the open files into a single file handle).
Sending an error message is only possible if footer / trailers have
been requested.
Now that the footer code is in a common helper function, use that
instead of duplicating it inline.
@amadio
Copy link
Member

amadio commented Nov 11, 2024

I'm ok with merging the current version of the code. However, in its current state I'd squash everything into a single commit. If you prefer to keep separate commits for this, please refactor as I suggested above, or let me know and I will squash and merge into devel. Thanks.

@bbockelm
Copy link
Collaborator Author

I'm OK with squashing!

(and it'll probably help with the timeline -- I'm hosting a workshop this week so my next technical context switch is potentially a week or two out)

@amadio amadio merged commit dfb6959 into xrootd:devel Nov 14, 2024
amadio pushed a commit to amadio/xrootd that referenced this pull request Nov 14, 2024
Instead of invoking a filesystem 'stat', then opening the path,
change the GET code to always try the 'open' first.

If the open fails with an kXR_isDirectory (i.e., `EISDIR`), then 
the code falls back to doing a directory listing.

At a minimum, this will halve the number of 'stat' requests the
underlying storage will see as 'stat' is done twice per request if 
the requests are sent sequentially. At best, it'll eliminate nearly 
all stat calls under high concurrency given the OFS will collapse 
the open files into a single handle and some OSS's will cache the 
results of the first fstat.

Co-authored-by: Fabrizio Furano <fabrizio.furano@gmail.com>
amadio pushed a commit to amadio/xrootd that referenced this pull request Nov 15, 2024
Instead of invoking a filesystem 'stat', then opening the path,
change the GET code to always try the 'open' first.

If the open fails with an kXR_isDirectory (i.e., `EISDIR`), then 
the code falls back to doing a directory listing.

At a minimum, this will halve the number of 'stat' requests the
underlying storage will see as 'stat' is done twice per request if 
the requests are sent sequentially. At best, it'll eliminate nearly 
all stat calls under high concurrency given the OFS will collapse 
the open files into a single handle and some OSS's will cache the 
results of the first fstat.

Co-authored-by: Fabrizio Furano <fabrizio.furano@gmail.com>
amadio pushed a commit to amadio/xrootd that referenced this pull request Nov 20, 2024
Instead of invoking a filesystem 'stat', then opening the path,
change the GET code to always try the 'open' first.

If the open fails with an kXR_isDirectory (i.e., `EISDIR`), then 
the code falls back to doing a directory listing.

At a minimum, this will halve the number of 'stat' requests the
underlying storage will see as 'stat' is done twice per request if 
the requests are sent sequentially. At best, it'll eliminate nearly 
all stat calls under high concurrency given the OFS will collapse 
the open files into a single handle and some OSS's will cache the 
results of the first fstat.

Co-authored-by: Fabrizio Furano <fabrizio.furano@gmail.com>
amadio pushed a commit to amadio/xrootd that referenced this pull request Nov 20, 2024
Instead of invoking a filesystem 'stat', then opening the path,
change the GET code to always try the 'open' first.

If the open fails with an kXR_isDirectory (i.e., `EISDIR`), then 
the code falls back to doing a directory listing.

At a minimum, this will halve the number of 'stat' requests the
underlying storage will see as 'stat' is done twice per request if 
the requests are sent sequentially. At best, it'll eliminate nearly 
all stat calls under high concurrency given the OFS will collapse 
the open files into a single handle and some OSS's will cache the 
results of the first fstat.

Co-authored-by: Fabrizio Furano <fabrizio.furano@gmail.com>
amadio pushed a commit to amadio/xrootd that referenced this pull request Nov 20, 2024
Instead of invoking a filesystem 'stat', then opening the path,
change the GET code to always try the 'open' first.

If the open fails with an kXR_isDirectory (i.e., `EISDIR`), then 
the code falls back to doing a directory listing.

At a minimum, this will halve the number of 'stat' requests the
underlying storage will see as 'stat' is done twice per request if 
the requests are sent sequentially. At best, it'll eliminate nearly 
all stat calls under high concurrency given the OFS will collapse 
the open files into a single handle and some OSS's will cache the 
results of the first fstat.

Co-authored-by: Fabrizio Furano <fabrizio.furano@gmail.com>
@amadio amadio added this to the 5.7.2 milestone Nov 20, 2024
amadio pushed a commit to amadio/xrootd that referenced this pull request Nov 21, 2024
Instead of invoking a filesystem 'stat', then opening the path,
change the GET code to always try the 'open' first.

If the open fails with an kXR_isDirectory (i.e., `EISDIR`), then 
the code falls back to doing a directory listing.

At a minimum, this will halve the number of 'stat' requests the
underlying storage will see as 'stat' is done twice per request if 
the requests are sent sequentially. At best, it'll eliminate nearly 
all stat calls under high concurrency given the OFS will collapse 
the open files into a single handle and some OSS's will cache the 
results of the first fstat.

Co-authored-by: Fabrizio Furano <fabrizio.furano@gmail.com>
amadio pushed a commit that referenced this pull request Nov 21, 2024
Instead of invoking a filesystem 'stat', then opening the path,
change the GET code to always try the 'open' first.

If the open fails with an kXR_isDirectory (i.e., `EISDIR`), then 
the code falls back to doing a directory listing.

At a minimum, this will halve the number of 'stat' requests the
underlying storage will see as 'stat' is done twice per request if 
the requests are sent sequentially. At best, it'll eliminate nearly 
all stat calls under high concurrency given the OFS will collapse 
the open files into a single handle and some OSS's will cache the 
results of the first fstat.

Co-authored-by: Fabrizio Furano <fabrizio.furano@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Invalid chunk framing for HTTP Thousands of unnecessary stat calls from XrdHttp

6 participants