ENH: interpolate: fix concurrency issues throughout #20671
Conversation
scipy/interpolate/_interpolate.py
Outdated
```python
with self._c_lock:
    pass
```
This is an optimistic lock; it expects multiple threads to be calling non-mutating functions most of the time, allowing them to run concurrently. It behaves like a turnstile, letting threads that arrived first pass, and then blocking if another thread calls extend, until that thread finishes updating the polynomial coefficients.
While this solution may yield dirty reads for threads that execute before extend, it allows for better concurrency by not serializing each call that depends on the coefficients, which would slow down the overall execution time when multiple threads are calling functions other than extend.
An alternative would be to state in the docs that the API is not thread-safe with respect to extend, and let the user handle the concurrency explicitly when a mixture of extend and other calls happens.
Not sure I understand this comment, and most certainly a comment is needed not only on GH but also in the code.
Also, if it has something to do with the extend method, can the lock move to extend? If not, why is it only in __call__ and not in other functions (.derivative, .integrate, etc.)?
This one still confuses me, I have to admit. Let me try rephrasing my confusion:

- This with-stanza acquires and releases the lock, so that if two threads arrive simultaneously, one arrives at this line while the other is doing something else, and that other one is stopped and waits until the lock is released --- is this correct? If it is, why is it needed, why is it useful, and how does this help anything?
- The very need for this is because of the existence of the .extend method, which mutates the class state, is this correct? If correct, cannot one lock the .extend internals instead?
- What is this pattern: is it something known/standard, or is it something you conjured out of thin air?
- Can you actually show a scenario where this is needed? And explain how it works in this scenario? Previous discussion seems to suggest something rather esoteric, with threaded use of reads and .extend --- a concrete scenario would really help.
As I mentioned in the first comment, this lock acts like a "turnstile with a block": imagine n people entering an amusement park through a turnstile; once they are in, they are independent of each other, despite having entered in a serialized fashion. Now imagine there's a service mode the turnstile can enter, in which it blocks while it undergoes maintenance; during this time no people can enter the park, until the service mode is released.
This is the same scenario posed here: what's behind the wall is the code that can be run by each thread independently of the others, assuming that the polynomial coefficients haven't changed at all. What the so-called "service mode" in the analogy does is change the coefficients while stopping everyone else from accessing them.
When the polynomial undergoes modifications (via extend), any thread that has already gotten in will get a dirty read if it tries to access the polynomial coefficients; that's why this is called optimistic: in principle, you'd expect the number of calls to extend to be much lower than the number of calls to the actual interpolation function. In contrast, a pessimistic approach would be to block and serialize every call to the interpolation function, given that at any time a call to extend could lead to dirty reads for any thread.
While the pessimistic approach guarantees clean reads and consistency, it comes at the expense of performance, since the calls performed by n threads would be effectively serialized. The optimistic approach trades potential dirty reads, which are expected to be rare, for performance, since calls to extend are expected to occur in a non-concurrent fashion or only occasionally.
This is more of a thought experiment that may be an exotic scenario in practice; that's why I raised the question of stating in the docs that calls to extend should not be interleaved with concurrent calls to any of the interpolation subclasses that inherit from _PPolyBase, since that is not thread-safe.
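As a concrete, hypothetical scenario of the kind asked about above: the sketch below hammers a shared coefficient list from several reader threads while one writer extends it. Readers only pass through the turnstile, so any reader already past it may compute from either the old or the new coefficients (the "dirty read"); readers arriving during the extend block at the turnstile. All names here are illustrative, not SciPy code:

```python
import threading

coeffs = [1.0, 2.0, 3.0]          # shared polynomial coefficients
c_lock = threading.Lock()
results = []

def evaluate(x):
    with c_lock:                  # turnstile: pass unless extend() is running
        pass
    # Reads after the turnstile may be dirty if extend() starts meanwhile,
    # so a result may reflect either the old or the extended coefficients.
    results.append(sum(c * x**k for k, c in enumerate(list(coeffs))))

def extend(extra):
    with c_lock:                  # block new readers while mutating
        coeffs.extend(extra)

readers = [threading.Thread(target=evaluate, args=(2.0,)) for _ in range(8)]
writer = threading.Thread(target=extend, args=([4.0],))
for t in readers:
    t.start()
writer.start()
for t in readers + [writer]:
    t.join()
# Once all threads have joined, the shared state is consistent again:
# every recorded result corresponds to one of the two coefficient sets.
```

With x = 2.0, each result is either 17.0 (old coefficients) or 49.0 (after the extension), never a torn value in between, which is exactly the trade-off described above.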
Okay, the turnstile analogy I get, thank you. That the implementation looks bizarre can be attributed to me not being used to this, so I can discount that :-).
What I'm not sure about is the status of this:
- this is a best-effort trick, to remedy some free-threading usages, correct?
- (if yes) is there a way to estimate how exotic a user load should be for this trick to not work?
Let's consider an example: https://docs.scipy.org/doc/scipy/tutorial/interpolate/extrapolation_examples.html#cubicspline-extend-the-boundary-conditions
- if we land this PR as is, how would the example need changing?
- if we land the PR without this turnstile, how would the example need changing?
This mutation goes against the guidelines declared at https://github.com/Quansight-Labs/free-threaded-compatibility?tab=readme-ov-file#suggested-plan-of-attack; also, SciPy should not anticipate exotic use cases like the one that initially motivated this whole "optimistic lock" approach.
The resolution in this case was to document the extend function and mark it as not thread-safe when used with interleaved interpolation methods.
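Under that resolution, a user who does need to interleave extend with evaluation would serialize access themselves. A minimal sketch using the public PPoly API (the `user_lock`, `safe_eval`, and `safe_extend` names are illustrative, not part of SciPy):

```python
import threading
import numpy as np
from scipy.interpolate import PPoly

# A single linear piece f(x) = x on [0, 1]:
# coefficients are in the local power basis, highest order first.
pp = PPoly(np.array([[1.0], [0.0]]), np.array([0.0, 1.0]))

# extend() is not thread-safe when interleaved with evaluation,
# so the user serializes all access with their own lock.
user_lock = threading.Lock()

def safe_eval(x):
    with user_lock:
        return pp(x)

def safe_extend(c, x):
    with user_lock:
        pp.extend(c, x)
```

For example, `safe_extend(np.array([[1.0], [1.0]]), np.array([2.0]))` appends the piece g(x) = (x - 1) + 1 on [1, 2], and concurrent `safe_eval` calls from other threads never observe the coefficients mid-update.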
Force-pushed 061aa6f to 7edd025
@andfoy this looks like a good initiative. Let me know when you require someone to review.

Thanks @Kai-Striega, I expect to complete the overview by next week; however, feel free to comment on and review any of the current changes.
The case of
Question: who's responsible for locking, 1) the library or 2) the user? More specifically, if there's a threading issue in user code, where is the bug: in the library or in userland? If 1), then every library in the ecosystem needs to add locking to all mutating methods, and thus any threading violation is a valid bug report which we ought to fix. ISTM there are two questions here, in fact. The first one is "who's responsible for locking", and the answer, I strongly suspect, is "it depends".

Not really - the library is responsible. Wherever we do something that isn't thread-safe, it requires a lock. That was already true before free-threaded CPython (e.g. we had a lock around

For explicitly mutating methods (which should be very rare in SciPy), then of course the user has to be careful - doing that in multiple threads is not going to happen in a predefined order. I'm not sure why you're asking, though - can you give an example of a SciPy function/method that is relevant here?

The original comment (motivated by an offline discussion with @andfoy) was about Basically, this PR adds locks all around. This guards against all sorts of contrived scenarios. And I'm talking about exactly that: what lengths do we want to go to to guard against contrived usage?
The Alpine test segfault is fixed if

That's done now at long last :)
Force-pushed faec9f3 to 4c7fb19
Force-pushed 46ef6a3 to a593b37
@andfoy is this still a draft, or ready to go?

This is missing a minor test before review.
Force-pushed a593b37 to 8a35122
@ev-br, this one is ready for review!
ev-br
left a comment
This looks very thorough.
However, as long as we're to maintain this, there are a bunch of thread-safety noob questions below.
Force-pushed 3d323b1 to 7b37d77
ev-br
left a comment
(I'm flagging this as "changes requested" partly to avoid it slipping down everybody's TODO lists.)
FWIW, I'm still confused by the "optimistic lock" stanza. Would really appreciate some explanation. And if we're to maintain it, the explanations really need to stay as code comments.
Also, it would be great to document the use of the test helper --- which looks great, just needs some more docs.
Force-pushed 98b0f70 to 9ea9c0b
This now LGTM, modulo the last small ask: could you squash the commits somewhat? Especially the debugging back-and-forth on Alpine, etc.
Force-pushed 9ea9c0b to 3140dca
Can you also remove the change to
Force-pushed 4f4c078 to 268e03a
Reference issue
See gh-20669
What does this implement/fix?
This PR adds concurrency tests for the main interpolation classes in SciPy, in order to potentially discover any race conditions or deadlocks that may appear when using multithreaded Python.
Additional information
- `BSpline`: State is initialized in the constructor; it shouldn't have any concurrency issues at first glance.
- `PPoly` (and family): State (aka the polynomial coefficients) can mutate under the `extend` method; all other calls depend on the state.
- `RbfInterpolator`: There are no functions that can mutate an instance's state, thus it is thread-safe, as long as `KDTree` is safe: ENH: spatial: ensure thread-safety for KDTree #20655
- `Rbf`: There are no functions that can mutate an instance's state.
- `NdBSpline`: State is initialized in the constructor; no concurrency issues found.
- `NdPPoly`: State is initialized in the constructor; no concurrency issues found.
- `NearestNDInterpolator`: State is initialized in the constructor; no mutation methods are available. It depends on `KDTree` being thread-safe: ENH: spatial: ensure thread-safety for KDTree #20655
- `RegularGridInterpolator`: A wrapped `NdBSpline` instance can mutate during the call, introducing unwanted side effects in a concurrent scenario. Fixed in 7eb6d59
- `LinearNDInterpolator`/`CloughTocherInterpolator`: The `NDInterpolatorBase` does not have any state-mutation methods. Since they depend on QHull access, concurrent calls will be serialized: ENH: spatial: serialize concurrent calls to QHull #20619
- `UnivariateSpline`/`_BivariateSplineBase` (and family): According to "Mark fitpack sources as `recursive`" #16053 (comment), all Fortran code declared with the `recursive` keyword is reentrant and thread-safe, as long as there are no variables declared with the `save` keyword. The port of FITPACK used by SciPy does not use `save`, and all routines are declared as recursive. Classes that use the FITPACK backend base classes (aka `UnivariateSpline` and `_BivariateSplineBase`) do not have any state-mutating public function other than the constructor, so they should be thread-safe in principle.
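As an illustration of what a concurrency test for the stateless classes above might look like (a hedged sketch, not the PR's actual test helper; `run_concurrently` is an invented name), one can hammer an interpolator from several threads released together by a barrier and compare every result against the serial answer:

```python
import threading
from concurrent.futures import ThreadPoolExecutor

import numpy as np
from scipy.interpolate import make_interp_spline

def run_concurrently(fn, n_threads=8, n_repeats=10):
    """Call ``fn`` from ``n_threads`` threads at once; return all results."""
    barrier = threading.Barrier(n_threads)

    def worker(_):
        barrier.wait()                       # start all threads together
        return [fn() for _ in range(n_repeats)]

    # max_workers must equal the barrier parties, or the workers deadlock.
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        return list(pool.map(worker, range(n_threads)))

x = np.linspace(0, 10, 50)
spl = make_interp_spline(x, np.sin(x))       # state fixed in the constructor
xi = np.linspace(0, 10, 101)
expected = spl(xi)                           # serial reference result

results = run_concurrently(lambda: spl(xi))
# Every concurrent evaluation should match the serial reference exactly,
# since no method mutates the spline's state.
```

A similar harness with a writer thread calling `extend` in the loop would exercise the `PPoly` case discussed throughout this PR.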