impl multiprocessing SemLock #6542

youknowone · 2025-12-26T13:51:24Z

close #6506
fix #3965

Summary by CodeRabbit

New Features
- Enhanced multiprocessing on Unix/macOS with improved semaphore-based synchronization and inspection.
Bug Fixes
- Fixed buffer-size validation for struct pack/unpack to error when space is insufficient.
- More robust semaphore error handling and timeout behavior on Unix/macOS.
Chores
- Added two entries to the spell-check dictionary.
- Updated POSIX shared-memory open argument handling.

_{✏️ Tip: You can customize this high-level summary in your review settings.}

coderabbitai · 2025-12-26T13:51:29Z

Caution

Review failed

The pull request is closed.

📝 Walkthrough

Walkthrough

Adds a Unix-focused SemLock implementation and sem_unlink to _multiprocessing, refactors shm_open to accept a FromArgs struct with default mode, tightens buffer bound checking in pystruct, and adds two words to the cspell dictionary (ismine, newsemlockobject).

Changes

Cohort / File(s)	Summary
Dictionary `.cspell.dict/cpython.txt`	Adds entries: `ismine`, `newsemlockobject`.
_multiprocessing (Unix SemLock) `crates/stdlib/src/multiprocessing.rs`	Introduces Unix-specific `SemLock` and `SemHandle`: constructor, `acquire`/`release`, context-manager methods, recursive-mutex tracking, macOS vs other-Unix semaphore handling, `sem_unlink`, `flags()` exposure, and class constants (`RECURSIVE_MUTEX`, `SEMAPHORE`, `SEM_VALUE_MAX`).
Shared memory arg parsing `crates/stdlib/src/posixshmem.rs`	Replaces `shm_open(name, flags, mode, vm)` with `shm_open(args: ShmOpenArgs, vm)`, adds `ShmOpenArgs: FromArgs` and default mode handling; updates call sites and imports.
Struct buffer validation `crates/stdlib/src/pystruct.rs`	Changes buffer bounds check from `offset >= buffer_len` to validate `offset + needed > buffer_len`, ensuring sufficient remaining space for required bytes.

Sequence Diagram(s)

sequenceDiagram
    participant App as Application
    participant SemLock as _multiprocessing.SemLock
    participant Handle as SemHandle
    participant Kernel as POSIX Kernel

    App->>SemLock: py_new(kind,value,maxvalue,name,unlink)
    SemLock->>Handle: create/open semaphore
    Handle->>Kernel: sem_open / sem_unlink + sem_open
    Kernel-->>Handle: semaphore handle
    Handle-->>SemLock: initialized

    rect rgb(220,240,220)
    Note over App,SemLock: Acquire flow
    App->>SemLock: acquire(blocking, timeout)
    SemLock->>SemLock: check owner / recursive count
    alt already owner
        SemLock->>SemLock: increment count
        SemLock-->>App: True
    else not owner
        SemLock->>Handle: sem_timedwait / sem_wait
        Handle->>Kernel: wait on semaphore
        Kernel-->>Handle: success / timeout / error
        Handle-->>SemLock: result
        SemLock->>SemLock: update state (owner/count)
        SemLock-->>App: True / False / raise
    end
    end

    rect rgb(240,220,220)
    Note over App,SemLock: Release flow
    App->>SemLock: release()
    SemLock->>SemLock: decrement count
    alt count > 0
        SemLock-->>App: still held
    else count == 0
        SemLock->>Handle: sem_post()
        Handle->>Kernel: post semaphore
        Kernel-->>Handle: ack
        Handle-->>SemLock: success
        SemLock-->>App: released
    end
    end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

impl multiprocessing SemLock #6542 — Implements the same SemLock functionality, sem_unlink, flags exposure, and includes identical .cspell additions; likely a direct duplicate or closely related implementation.

Poem

🐇
I hopped through code with tiny paws,
Named semaphores and fixed the laws,
I guard the counts and find the gaps,
I nibble bugs and clap my claps —
Hooray for locks and tidy claws!

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Out of Scope Changes check	⚠️ Warning	Changes in .cspell.dict/cpython.txt, posixshmem.rs, and pystruct.rs appear tangential; the buffer validation change in pystruct.rs and ShmOpenArgs refactor are not directly tied to SemLock implementation objectives.	Remove or justify unrelated changes in .cspell.dict/cpython.txt, posixshmem.rs, and pystruct.rs, or split them into separate PRs aligned with their respective objectives.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title clearly summarizes the main change: implementing the multiprocessing SemLock component, which aligns with the core objective.
Linked Issues check	✅ Passed	The PR implements core requirements from #6506 and #3965: SemLock class with constructor and methods, sem_unlink function, and Unix-specific synchronization, fulfilling completed tasks marked in the issue.
Docstring Coverage	✅ Passed	Docstring coverage is 89.66% which is sufficient. The required threshold is 80.00%.

📜 Recent review details

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 00e4c92 and 3979249.

📒 Files selected for processing (1)

crates/stdlib/src/multiprocessing.rs

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

Co-authored-by: youknowone <69878+youknowone@users.noreply.github.com>

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (1)

crates/stdlib/src/multiprocessing.rs (1)
233-242: Verify the null check in Drop - SEM_FAILED is typically (sem_t*)-1, not null.

The Drop implementation checks !self.raw.is_null(), but libc::SEM_FAILED is usually ((sem_t*)-1) on POSIX systems, not null. This should still be safe since a valid semaphore pointer won't be null, but consider adding a comment clarifying this is guarding against default/uninitialized state rather than SEM_FAILED.
🔎 Suggested clarification
 impl Drop for SemHandle {
     fn drop(&mut self) {
+        // Guard against default/uninitialized state.
+        // Note: SEM_FAILED is (sem_t*)-1, not null, but valid handles are never null.
         if !self.raw.is_null() {
             // SEM_CLOSE(sem) sem_close(sem)
             unsafe {
                 libc::sem_close(self.raw);
             }
         }
     }
 }

📜 Review details

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 346519b and fb6418a.

⛔ Files ignored due to path filters (6)

Lib/test/_test_multiprocessing.py is excluded by !Lib/**
Lib/test/mp_fork_bomb.py is excluded by !Lib/**
Lib/test/mp_preload.py is excluded by !Lib/**
Lib/test/mp_preload_flush.py is excluded by !Lib/**
Lib/test/test_importlib/test_threaded_import.py is excluded by !Lib/**
Lib/test/test_logging.py is excluded by !Lib/**

📒 Files selected for processing (4)

.cspell.dict/cpython.txt
crates/stdlib/src/multiprocessing.rs
crates/stdlib/src/posixshmem.rs
crates/stdlib/src/pystruct.rs

🧰 Additional context used

📓 Path-based instructions (1)

**/*.rs

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

**/*.rs: Follow the default rustfmt code style by running cargo fmt to format Rust code
Always run clippy to lint code (cargo clippy) before completing tasks and fix any warnings or lints introduced by changes
Follow Rust best practices for error handling and memory management
Use the macro system (pyclass, pymodule, pyfunction, etc.) when implementing Python functionality in Rust

Files:

crates/stdlib/src/pystruct.rs
crates/stdlib/src/posixshmem.rs
crates/stdlib/src/multiprocessing.rs

🧠 Learnings (2)

📚 Learning: 2025-06-27T14:47:28.810Z

Learnt from: moreal
Repo: RustPython/RustPython PR: 5847
File: vm/src/stdlib/stat.rs:547-567
Timestamp: 2025-06-27T14:47:28.810Z
Learning: In RustPython's stat module implementation, platform-specific constants like SF_SUPPORTED and SF_SYNTHETIC should be conditionally declared only for the platforms where they're available (e.g., macOS), following CPython's approach of optional declaration using #ifdef checks rather than providing fallback values for other platforms.

Applied to files:

crates/stdlib/src/multiprocessing.rs

📚 Learning: 2025-06-27T14:47:28.810Z

Learnt from: moreal
Repo: RustPython/RustPython PR: 5847
File: vm/src/stdlib/stat.rs:547-567
Timestamp: 2025-06-27T14:47:28.810Z
Learning: In RustPython's stat module implementation, platform-specific constants like SF_SUPPORTED and SF_SYNTHETIC should be conditionally declared only for the platforms where they're available (e.g., macOS), following CPython's approach of optional declaration rather than providing fallback values for other platforms.

Applied to files:

crates/stdlib/src/multiprocessing.rs

🧬 Code graph analysis (2)

crates/stdlib/src/pystruct.rs (1)

crates/vm/src/stdlib/ctypes/base.rs (1)

offset (1654-1656)

crates/stdlib/src/multiprocessing.rs (1)

crates/stdlib/src/faulthandler.rs (2)

current_thread_id (225-227)

current_thread_id (230-232)

🔇 Additional comments (11)

crates/stdlib/src/pystruct.rs (1)

74-86: LGTM! Correct boundary check fix.

The updated condition offset + needed > buffer_len properly validates that the buffer has sufficient space for the entire operation, not just that the offset is within bounds. This fixes cases where the offset was valid but insufficient space remained for the required bytes.

crates/stdlib/src/posixshmem.rs (1)

14-31: Clean refactoring to argument struct pattern.

The ShmOpenArgs struct with FromArgs is a good improvement for argument handling. The default mode 0o600 is a secure choice for shared memory permissions.

crates/stdlib/src/multiprocessing.rs (8)

62-140: LGTM! macOS polled fallback correctly implements CPython's approach.

The exponential backoff (1ms to 20ms cap), deadline checking, and signal handling match CPython's sem_timedwait_save implementation. Good handling of the three error cases.

286-289: Relaxed ordering in recursive mutex path may be correct but deserves scrutiny.

The fetch_add(1, Ordering::Relaxed) is used after the ismine! check which uses Acquire. However, since ismine! only reads, there's no synchronization with other threads that might be checking or modifying count. In practice, since only the owning thread (verified by ismine!) modifies count in this path, Relaxed should be safe. Consider adding a brief comment explaining this reasoning.

508-527: LGTM! Context manager methods correctly delegate to acquire/release.

529-555: LGTM! The _handle parameter is intentionally unused.

The rebuild logic correctly opens the semaphore by name (the handle value is platform-specific and not portable for pickling). Initial count/last_tid of 0 is correct for a freshly rebuilt lock.

557-565: LGTM! Good fork safety handling.

Resetting both count and last_tid after fork is appropriate. The comment correctly notes the last_tid reset is an addition over CPython.

606-630: Note: macOS _is_zero has inherent TOCTOU limitation.

The trywait/post pattern for checking zero on macOS has an inherent race: between the successful sem_trywait and the sem_post to restore the value, another thread could be unblocked by the post. This is a known limitation of the workaround and matches CPython's approach. The method's result should be considered advisory.

657-685: LGTM! Constructor with appropriate validation.

The kind and value validation is correct. The value as u32 cast is safe due to the preceding value >= 0 check.

729-754: LGTM! Helper functions are well-implemented.

The semaphore_name correctly handles POSIX naming conventions, os_error properly maps to specific exception types (FileExistsError, FileNotFoundError), and current_thread_id matches the pattern used elsewhere in the codebase.

.cspell.dict/cpython.txt (1)

33-44: LGTM! Dictionary entries support the new SemLock implementation.

Both ismine (matching CPython's ISMINE macro) and newsemlockobject (matching CPython's internal function) are correctly added and alphabetically ordered.

crates/stdlib/src/multiprocessing.rs

coderabbitai

Actionable comments posted: 2

📜 Review details

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between fb6418a and 00e4c92.

⛔ Files ignored due to path filters (6)

Lib/test/_test_multiprocessing.py is excluded by !Lib/**
Lib/test/mp_fork_bomb.py is excluded by !Lib/**
Lib/test/mp_preload.py is excluded by !Lib/**
Lib/test/mp_preload_flush.py is excluded by !Lib/**
Lib/test/test_importlib/test_threaded_import.py is excluded by !Lib/**
Lib/test/test_logging.py is excluded by !Lib/**

📒 Files selected for processing (4)

.cspell.dict/cpython.txt
crates/stdlib/src/multiprocessing.rs
crates/stdlib/src/posixshmem.rs
crates/stdlib/src/pystruct.rs

🚧 Files skipped from review as they are similar to previous changes (3)

crates/stdlib/src/posixshmem.rs
crates/stdlib/src/pystruct.rs
.cspell.dict/cpython.txt

🧰 Additional context used

📓 Path-based instructions (1)

**/*.rs

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

**/*.rs: Follow the default rustfmt code style by running cargo fmt to format Rust code
Always run clippy to lint code (cargo clippy) before completing tasks and fix any warnings or lints introduced by changes
Follow Rust best practices for error handling and memory management
Use the macro system (pyclass, pymodule, pyfunction, etc.) when implementing Python functionality in Rust

Files:

crates/stdlib/src/multiprocessing.rs

🧠 Learnings (2)

📚 Learning: 2025-06-27T14:47:28.810Z

Learnt from: moreal
Repo: RustPython/RustPython PR: 5847
File: vm/src/stdlib/stat.rs:547-567
Timestamp: 2025-06-27T14:47:28.810Z
Learning: In RustPython's stat module implementation, platform-specific constants like SF_SUPPORTED and SF_SYNTHETIC should be conditionally declared only for the platforms where they're available (e.g., macOS), following CPython's approach of optional declaration using #ifdef checks rather than providing fallback values for other platforms.

Applied to files:

crates/stdlib/src/multiprocessing.rs

📚 Learning: 2025-06-27T14:47:28.810Z

Learnt from: moreal
Repo: RustPython/RustPython PR: 5847
File: vm/src/stdlib/stat.rs:547-567
Timestamp: 2025-06-27T14:47:28.810Z
Learning: In RustPython's stat module implementation, platform-specific constants like SF_SUPPORTED and SF_SYNTHETIC should be conditionally declared only for the platforms where they're available (e.g., macOS), following CPython's approach of optional declaration rather than providing fallback values for other platforms.

Applied to files:

crates/stdlib/src/multiprocessing.rs

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)

GitHub Check: Run rust tests (windows-latest)
GitHub Check: Run snippets and cpython tests (macos-latest)
GitHub Check: Run snippets and cpython tests (windows-latest)
GitHub Check: Check Rust code with clippy
GitHub Check: Run snippets and cpython tests on wasm-wasi
GitHub Check: Run snippets and cpython tests (ubuntu-latest)
GitHub Check: Run rust tests (ubuntu-latest)
GitHub Check: Run rust tests (macos-latest)
GitHub Check: Ensure compilation on various targets
GitHub Check: Run tests under miri
GitHub Check: Check the WASM package and demo

🔇 Additional comments (19)

crates/stdlib/src/multiprocessing.rs (19)

44-61: LGTM!

The module setup and imports are appropriate for a POSIX semaphore implementation. The use of atomic types for shared state is correct.

62-140: LGTM! macOS workaround for missing sem_timedwait.

The polled fallback implementation correctly handles the macOS limitation where sem_timedwait is unavailable. The exponential backoff (capped at 20ms), signal checking, and timeout handling align with CPython's approach.

142-152: LGTM!

Constants match CPython's multiprocessing module. The ismine! macro correctly uses Acquire ordering to read shared state.

154-178: LGTM!

Data structures are well-designed. The unsafe impl Send/Sync for SemHandle is correct since POSIX semaphores are thread-safe.

231-243: Drop implementation is correct.

The null check is sufficient since SEM_FAILED is never stored (errors are returned immediately at lines 200-202 and 218-221). The comment accurately explains this invariant.

245-266: LGTM!

Property getters are straightforward and correct.

330-424: Acquire logic correctly handles signals and timeouts.

The implementation properly:

Attempts fast acquisition with sem_trywait

Falls back to blocking wait with appropriate platform handling

Checks signals after EINTR

Uses correct memory ordering (Release) when updating count and last_tid

438-442: LGTM! Race condition resolved.

The code now correctly uses fetch_sub (line 440) for atomic decrement, addressing the race condition flagged in the previous review. The Release ordering ensures proper synchronization.

460-481: macOS maxvalue validation is limited to maxvalue==1.

The HAVE_BROKEN_SEM_GETVALUE workaround on macOS only validates the maxvalue==1 case. For higher maxvalue, releases are not validated. This is a known CPython limitation and is acceptable.

484-490: LGTM!

The sem_post call and atomic count decrement correctly release the semaphore.

493-512: LGTM!

Context manager implementation correctly delegates to acquire() and release().

514-539: LGTM!

The _rebuild classmethod correctly reconstructs a SemLock from pickled state, matching CPython's pattern. Initializing count and last_tid to zero is correct for a deserialized object.

541-611: LGTM! Introspection methods handle platform differences correctly.

The implementations appropriately handle macOS's HAVE_BROKEN_SEM_GETVALUE limitation:

_get_value() raises NotImplementedError on macOS

_is_zero() uses the sem_trywait + sem_post workaround on macOS

The _after_fork() method correctly resets acquisition state for child processes.

613-635: LGTM!

The extend_class method correctly exposes constants and computes SEM_VALUE_MAX using sysconf, with appropriate fallback to i32::MAX.

637-664: LGTM!

Constructor properly validates inputs and initializes the SemLock with correct initial state.

666-676: LGTM!

The sem_unlink function correctly wraps the POSIX sem_unlink call with appropriate error handling.

678-704: LGTM!

The flags function correctly exposes platform capabilities matching CPython's feature detection.

706-731: LGTM! Helper functions are correctly implemented.

The implementations match CPython's patterns:

semaphore_name ensures POSIX-compliant naming

os_error maps errno values to appropriate exception types

current_thread_id matches the pattern used in faulthandler.rs

Based on learnings, the platform-specific implementation approach is appropriate for RustPython.

44-732: Ensure code passes cargo fmt and cargo clippy checks before merging.

Per the repository's coding guidelines for Rust files, run cargo fmt to format the code and cargo clippy to lint it, addressing any warnings or violations introduced by these changes.

crates/stdlib/src/multiprocessing.rs

coderabbitai · 2026-01-02T14:24:13Z

crates/stdlib/src/multiprocessing.rs

+                // deadline calculation:
+                // long sec = (long) timeout;
+                // long nsec = (long) (1e9 * (timeout - sec) + 0.5);
+                // deadline.tv_sec = now.tv_sec + sec;
+                // deadline.tv_nsec = now.tv_usec * 1000 + nsec;
+                // deadline.tv_sec += (deadline.tv_nsec / 1000000000);
+                // deadline.tv_nsec %= 1000000000;
+                let sec = timeout as libc::c_long;
+                let nsec = (1e9 * (timeout - sec as f64) + 0.5) as libc::c_long;
+                let mut deadline = libc::timespec {
+                    tv_sec: tv.tv_sec + sec as libc::time_t,
+                    tv_nsec: (tv.tv_usec as libc::c_long * 1000 + nsec) as _,
+                };
+                deadline.tv_sec += (deadline.tv_nsec / 1_000_000_000) as libc::time_t;
+                deadline.tv_nsec %= 1_000_000_000;
+                Some(deadline)
+            } else {
+                None
+            };


⚠️ Potential issue | 🟡 Minor

Consider adding overflow checks for extremely large timeout values in deadline calculation.

The deadline arithmetic (lines 317-324) could silently overflow with extreme timeout values. The cast timeout as libc::c_long and the addition tv.tv_sec + sec lack bounds checking. If timeout approaches or exceeds c_long::MAX, the cast will wrap, producing incorrect deadline values. While this behavior mirrors CPython's implementation, RustPython should either validate input ranges before the cast, use saturating arithmetic (saturating_add, saturating_cast), or document this known limitation explicitly.

Consider adding a check like: if timeout > (c_long::MAX as f64) { ... handle error ... }

youknowone force-pushed the sem branch 7 times, most recently from 37aa27b to b354fba Compare January 2, 2026 04:15

Copilot AI and others added 5 commits January 2, 2026 19:44

Initial plan

f11e2e5

Changes before error encountered

cca9889

Co-authored-by: youknowone <69878+youknowone@users.noreply.github.com>

Implement Unix SemLock and expose sem_unlink

4372866

Co-authored-by: youknowone <69878+youknowone@users.noreply.github.com>

Fix missing sem_* symbols on libc targets

99d1350

Co-authored-by: youknowone <69878+youknowone@users.noreply.github.com>

Fix SemLock constant interning to avoid panic

e5e3427

Co-authored-by: youknowone <69878+youknowone@users.noreply.github.com>

youknowone force-pushed the sem branch from b354fba to fb6418a Compare January 2, 2026 10:44

youknowone marked this pull request as ready for review January 2, 2026 14:05

coderabbitai bot reviewed Jan 2, 2026

View reviewed changes

crates/stdlib/src/multiprocessing.rs Outdated Show resolved Hide resolved

youknowone added 3 commits January 2, 2026 23:12

impl SemLock

56a7f07

Update multiprocessing tests from 3.13.11

70ff68a

mark tests

efdf8a7

youknowone force-pushed the sem branch from fb6418a to 00e4c92 Compare January 2, 2026 14:17

coderabbitai bot reviewed Jan 2, 2026

View reviewed changes

apply review

3979249

youknowone force-pushed the sem branch from 00e4c92 to 3979249 Compare January 2, 2026 15:14

youknowone merged commit 546d35b into RustPython:main Jan 2, 2026
12 of 13 checks passed

youknowone deleted the sem branch January 2, 2026 15:14

coderabbitai bot mentioned this pull request Jan 16, 2026

reject SemLock reduce #6738

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

impl multiprocessing SemLock #6542

impl multiprocessing SemLock #6542

Uh oh!

youknowone commented Dec 26, 2025 •

edited by coderabbitai bot

Loading

Uh oh!

coderabbitai bot commented Dec 26, 2025 •

edited

Loading

Review failed

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Possibly related PRs

Poem

Uh oh!

coderabbitai bot left a comment

Uh oh!

Uh oh!

coderabbitai bot left a comment

Uh oh!

Uh oh!

coderabbitai bot Jan 2, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

impl multiprocessing SemLock #6542

impl multiprocessing SemLock #6542

Uh oh!

Conversation

youknowone commented Dec 26, 2025 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Uh oh!

coderabbitai bot commented Dec 26, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Review failed

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Possibly related PRs

Poem

Pre-merge checks and finishing touches

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai bot Jan 2, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

youknowone commented Dec 26, 2025 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Dec 26, 2025 •

edited

Loading