Skip to content

Conversation

@smithdh
Copy link
Contributor

@smithdh smithdh commented Jun 19, 2025

No description provided.

@smithdh
Copy link
Contributor Author

smithdh commented Jun 19, 2025

This PR is to change a current behavior in XrdCl concerning redirects that make use of the kXR_collapseRedir facility:

The intent of the kXR_collapseRedir feature is that a hostname like 'xrootd-prod.dom' which might resolve to e.g. xrootd1.dom and xrootd2.dom, and will cause the client to connect to one of those. e.g. xrootd2.dom. That host could indicate in a redirect message that for the duration of the client's Channel for 'xrootd-prod.dom' it should refer to a conneciton to one of the other hosts, e.g. 'xrootd1.dom'. (This might be part of the operaiton of a HA setup).

Currently the client will replace the channel for 'xrootd-prod.dom' with one definitly for 'xrootd1.dom' as expected, and the old channel will become idle, and eventually timeout with TTL, also as expected. However currently that timeout will cleanup the replacement channel (which may be in use), and infact since the idle channel is not actually destroyed the timeout will repeate for every subsequent timeout check interval.

This PR as currently written arguably adds some techincal debt; this PR makes PostMaster::ForceDisconnect change bevahior depending on &url. A better alternative appears to be adding a Channel* argument to ForceDisconnect (but to avoid breaking ABI of the public 'PostMaster' class, that would probably need to be an additional ForceDisconnect() method, there are already two. And XrdStream may need a new member to contain XrdChannel*, so that it can be passed at the point it calls ForceDisconnect). Any suggesiton to improve that is welcome. Or the approach in this PR could be adopted for now, and improved at release 6.

@amadio
Copy link
Member

amadio commented Jun 19, 2025

@smithdh Could you please add the description above to the commit message? Thank you!

@smithdh smithdh force-pushed the ds_redircollapse_idleout branch from f875d06 to c8f1024 Compare June 20, 2025 09:29
@smithdh smithdh force-pushed the ds_redircollapse_idleout branch from c8f1024 to 698b338 Compare June 23, 2025 07:03
@smithdh smithdh marked this pull request as ready for review June 23, 2025 07:16
@smithdh
Copy link
Contributor Author

smithdh commented Jun 23, 2025

hello, I added the commit message.

…ement

This commit is to change behavior in XrdCl concerning redirects that make use of the kXR_collapseRedir facility:

The intent of the kXR_collapseRedir feature is that a hostname like 'xrootd-prod.dom' might resolve to multiple addresses,
e.g. for xrootd1.dom and xrootd2.dom, and will cause the client to connect to one of those. e.g. xrootd2.dom.
That host could indicate in a redirect message using collapseRedir that for the duration of the client's Channel
for 'xrootd-prod.dom' the connection should refer to one of the other hosts, e.g. 'xrootd1.dom'.
This might be part of the working of an HA setup.

Currently the client will replace the channel for 'xrootd-prod.dom' with one for 'xrootd1.dom', as expected and
the old channel will become idle and eventually timeout with TTL, also as expected. However currently that timeout
will cleanup the replacement channel (which may be in use) and since the idle channel is not destroyed the timeout
will repeat for every subsequent timeout check interval.

This commit arguably adds some technical debt; it makes PostMaster::ForceDisconnect change behavior depending on &url.
A better alternative appears to be adding a Channel* argument to ForceDisconnect (but to avoid breaking ABI of the
public 'PostMaster' class, that would probably need to be an additional ForceDisconnect() method, there are already two.
And XrdStream may need a new member to contain XrdChannel*, so that it can be passed at the point it calls ForceDisconnect).
It may be worthwhile to revise this at a point where breaking ABI, e.g. for release 6.
@amadio amadio added this to the 5.8.4 milestone Jun 23, 2025
@amadio amadio force-pushed the ds_redircollapse_idleout branch from 698b338 to bb40de4 Compare June 23, 2025 14:22
@amadio amadio changed the base branch from devel to master June 23, 2025 14:22
@amadio
Copy link
Member

amadio commented Jun 23, 2025

ABI check was fine, so I'm merging this directly into master.

@amadio amadio merged commit bb40de4 into xrootd:master Jun 23, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants