Spin-then-park receive, free-threaded Python compatibility. #6
Merged
matajoh merged 1 commit into microsoft:main on Apr 1, 2026
Improvements:

- Added `CownCapsule.disown()`, which abandons a cown's value without serializing it and resets ownership to `NO_OWNER`. Used during worker cleanup to safely discard orphan cowns before the owning interpreter is destroyed, preventing dangling Python object references.
- Rewrote `receive` to use a two-phase spin-then-park strategy for single-tag untimed receives. Phase 1 spins for `BOC_SPIN_COUNT` iterations; Phase 2 parks the thread on a per-queue condvar, eliminating busy-wait CPU burn. Timed receives and multi-tag receives use spin-then-backoff with exponential sleep (1 µs → 1 ms cap).
- Added platform-abstracted condvar primitives (`BOCParkMutex` / `BOCParkCond`) with implementations for Windows (`SRWLOCK` / `CONDITION_VARIABLE`), macOS (pthreads), and Linux (C11 threads).
- Each `BOCQueue` now carries a `waiters` counter, `park_mutex`, and `park_cond`. Producers signal parked receivers after enqueue; `drain` and `set_tags` broadcast to wake all parked threads.
- Replaced the fixed `thrd_sleep` in `send` with `sched_yield` / `SwitchToThread`, reducing send-side latency.
- Refactored the monolithic `_core_receive` into `receive_single_tag` and `receive_multi_tag`, each with its own backoff/parking logic.
- Moved the `BOC_QUEUE_DISABLED` check earlier in `get_queue_for_tag` so callers skip disabled queues instead of returning NULL after tag resolution.
- Added Windows-compatible `atomic_load_explicit` / `atomic_fetch_add_explicit` / `atomic_fetch_sub_explicit` macros using `InterlockedExchangeAdd64`.
- Declared `Py_mod_gil = Py_MOD_GIL_NOT_USED` in both the `_core` and `_math` C extensions so that importing bocpy on a free-threaded Python build (3.13t+) does not re-enable the GIL.
- Replaced `PyDict_GetItem` (borrowed reference) with `PyDict_GetItemRef` (strong reference) in `BOCRecycleQueue_recycle` on Python 3.13+, improving forward-compatibility with free-threaded builds.

Bug Fixes:

- Fixed a deadlock when the same cown is passed multiple times to `@when` (e.g. `@when(c, c)`). Duplicate requests for the same cown caused the MCS-queue-based two-phase locking to spin-wait on itself. Requests are now deduplicated by target cown in `Behavior.__init__`, with compensating `resolve_one` calls to maintain the behavior count invariant.

Tests:

- `TestLostWakeStress`: single-producer random delays, bursty producer, and repeated single-message wake to detect lost-wake races.
- `TestMultiTagBackoff`: multi-tag receive correctness, covering second-tag hit, delayed arrival, per-tag FIFO ordering, timeout, and interleaved producers.
- `TestTimeoutAccuracy`: lower-bound / upper-bound wall-clock checks and zero-timeout immediacy.
- Added tests for duplicate cowns in `@when`: same cown twice, thrice, non-adjacent duplicates, duplicates within a group, and mutation aliasing semantics.

CI:

- Added a free-threaded CI job that tests against Python 3.13t and 3.14t on Linux, with explicit assertions that the GIL remains disabled after import.

Signed-off-by: Matthew A Johnson <matjoh@microsoft.com>
Spin-then-park receive; free-threaded Python compatibility.
Improvements
- Added `CownCapsule.disown()`, which abandons a cown's value without serializing it and resets ownership to `NO_OWNER`. Used during worker cleanup to safely discard orphan cowns before the owning interpreter is destroyed, preventing dangling Python object references.
- Rewrote `receive` to use a two-phase spin-then-park strategy for single-tag untimed receives. Phase 1 spins for `BOC_SPIN_COUNT` iterations; Phase 2 parks the thread on a per-queue condvar, eliminating busy-wait CPU burn. Timed receives and multi-tag receives use spin-then-backoff with exponential sleep (1 µs → 1 ms cap).
- Added platform-abstracted condvar primitives (`BOCParkMutex` / `BOCParkCond`) with implementations for Windows (`SRWLOCK` / `CONDITION_VARIABLE`), macOS (pthreads), and Linux (C11 threads).
- Each `BOCQueue` now carries a `waiters` counter, `park_mutex`, and `park_cond`. Producers signal parked receivers after enqueue; `drain` and `set_tags` broadcast to wake all parked threads.
- Replaced the fixed `thrd_sleep` in `send` with `sched_yield` / `SwitchToThread`, reducing send-side latency.
- Refactored the monolithic `_core_receive` into `receive_single_tag` and `receive_multi_tag`, each with its own backoff/parking logic.
- Moved the `BOC_QUEUE_DISABLED` check earlier in `get_queue_for_tag` so callers skip disabled queues instead of returning NULL after tag resolution.
- Added Windows-compatible `atomic_load_explicit` / `atomic_fetch_add_explicit` / `atomic_fetch_sub_explicit` macros using `InterlockedExchangeAdd64`.
- Declared `Py_mod_gil = Py_MOD_GIL_NOT_USED` in both the `_core` and `_math` C extensions so that importing bocpy on a free-threaded Python build (3.13t+) does not re-enable the GIL.
- Replaced `PyDict_GetItem` (borrowed reference) with `PyDict_GetItemRef` (strong reference) in `BOCRecycleQueue_recycle` on Python 3.13+, improving forward-compatibility with free-threaded builds.
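The spin-then-park and backoff strategies described above can be illustrated with a minimal Python sketch. This is not the C implementation from the commit: `ParkingQueue`, `SPIN_COUNT`, and `backoff_delays` are hypothetical names, `threading.Condition` stands in for the `park_mutex` / `park_cond` pair, and the doubling factor in the backoff is an assumption (the commit message only states the 1 µs start and 1 ms cap).

```python
import collections
import threading

SPIN_COUNT = 64  # hypothetical stand-in for BOC_SPIN_COUNT


class ParkingQueue:
    """Toy single-tag queue with a spin-then-park receive."""

    def __init__(self):
        self._items = collections.deque()
        self._cond = threading.Condition()  # plays the role of park_mutex + park_cond
        self._waiters = 0                   # number of parked receivers

    def send(self, item):
        self._items.append(item)
        # Signal after enqueue. Simplification: the waiter count is read
        # under the lock here rather than with an atomic load.
        with self._cond:
            if self._waiters:
                self._cond.notify()

    def _try_pop(self):
        try:
            return True, self._items.popleft()
        except IndexError:
            return False, None

    def receive(self):
        # Phase 1: spin a bounded number of times without taking the lock.
        for _ in range(SPIN_COUNT):
            ok, item = self._try_pop()
            if ok:
                return item
        # Phase 2: park. The queue is re-checked under the lock before each
        # wait, so a send that lands between the spin and the park is seen.
        with self._cond:
            self._waiters += 1
            try:
                while True:
                    ok, item = self._try_pop()
                    if ok:
                        return item
                    self._cond.wait()
            finally:
                self._waiters -= 1


def backoff_delays(start=1e-6, cap=1e-3):
    """Yield sleep durations in seconds, doubling from start up to cap."""
    delay = start
    while True:
        yield delay
        delay = min(delay * 2, cap)
```

The re-check under the lock is what guards against a lost wake in this sketch: a producer either enqueues before the receiver's final check, or acquires the lock after the receiver has started waiting and therefore sees a non-zero waiter count.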
Bug Fixes
- Fixed a deadlock when the same cown is passed multiple times to `@when` (e.g. `@when(c, c)`). Duplicate requests for the same cown caused the MCS-queue-based two-phase locking to spin-wait on itself. Requests are now deduplicated by target cown in `Behavior.__init__`, with compensating `resolve_one` calls to maintain the behavior count invariant.
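The idea behind the fix can be sketched as follows. This is an illustration under assumptions, not bocpy's actual `Behavior` class: `ToyBehavior`, `dedupe_by_identity`, and the `pending` counter are hypothetical, and the sketch assumes the count starts at one pending resolution per argument as written.

```python
def dedupe_by_identity(cowns):
    """Return (unique cowns in first-seen order, number of duplicates dropped)."""
    seen = set()
    unique = []
    dropped = 0
    for cown in cowns:
        key = id(cown)  # deduplicate by object identity, not equality
        if key in seen:
            dropped += 1
        else:
            seen.add(key)
            unique.append(cown)
    return unique, dropped


class ToyBehavior:
    """Toy model: one request per distinct cown, count fixed up for duplicates."""

    def __init__(self, *cowns):
        # Assumed invariant: the behavior runs once `pending` reaches zero.
        self.pending = len(cowns)
        self.requests, dropped = dedupe_by_identity(cowns)
        # Each dropped duplicate will never be acquired, so resolve it now
        # to keep the count consistent with the remaining requests.
        for _ in range(dropped):
            self.resolve_one()

    def resolve_one(self):
        self.pending -= 1
        return self.pending == 0
```

With `ToyBehavior(c, d, c)`, two requests remain and `pending` is 2, so the behavior becomes runnable after exactly two acquisitions instead of waiting on a third request that would queue behind itself.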
Tests
- `TestLostWakeStress`: single-producer random delays, bursty producer, and repeated single-message wake to detect lost-wake races.
- `TestMultiTagBackoff`: multi-tag receive correctness, covering second-tag hit, delayed arrival, per-tag FIFO ordering, timeout, and interleaved producers.
- `TestTimeoutAccuracy`: lower-bound / upper-bound wall-clock checks and zero-timeout immediacy.
- Added tests for duplicate cowns in `@when`: same cown twice, thrice, non-adjacent duplicates, duplicates within a group, and mutation aliasing semantics.
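A lower-bound / upper-bound timeout check of the kind listed above might look like the sketch below. It uses the standard library's `queue.Queue` as a stand-in for bocpy's `receive`, and `timed_receive` is a hypothetical helper, so it shows the shape of such a test rather than the tests added in this commit.

```python
import queue
import time


def timed_receive(q, timeout):
    """Wait up to `timeout` seconds for an item; return (item or None, elapsed seconds)."""
    start = time.monotonic()
    try:
        item = q.get(timeout=timeout)
    except queue.Empty:
        item = None
    return item, time.monotonic() - start
```

A test would then assert that an empty queue returns no item, that the elapsed time is not meaningfully shorter than the requested timeout, that it stays under a generous ceiling, and that a zero timeout returns almost immediately. The tolerances are a judgment call and depend on the platform's timer granularity.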
CI
- Added a free-threaded CI job that tests against Python 3.13t and 3.14t on Linux, with explicit assertions that the GIL remains disabled after import.
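An assertion of this kind could be sketched as below. The helper names are hypothetical and the actual CI script is not shown in this commit message; `sys._is_gil_enabled()` exists on Python 3.13+, so the sketch treats older interpreters as having the GIL enabled.

```python
import importlib
import sys


def gil_is_enabled():
    """Report whether the GIL is currently enabled."""
    probe = getattr(sys, "_is_gil_enabled", None)
    # Interpreters older than 3.13 have no probe and always run with the GIL.
    return True if probe is None else probe()


def assert_import_keeps_gil_disabled(module_name):
    """Import a module and fail if doing so re-enabled the GIL."""
    before = gil_is_enabled()
    importlib.import_module(module_name)
    after = gil_is_enabled()
    if not before:
        assert not after, f"importing {module_name} re-enabled the GIL"
    return before, after
```

On a free-threaded build, calling `assert_import_keeps_gil_disabled("bocpy")` would fail if either extension module lacked the `Py_MOD_GIL_NOT_USED` declaration; on a regular build the check is a no-op because the GIL is already enabled before the import.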