Skip to content

Conversation

@youknowone
Copy link
Member

@youknowone youknowone commented Dec 26, 2025

close #6506
fix #3965

Summary by CodeRabbit

  • New Features

    • Enhanced multiprocessing on Unix/macOS with improved semaphore-based synchronization and inspection.
  • Bug Fixes

    • Fixed buffer-size validation for struct pack/unpack to error when space is insufficient.
    • More robust semaphore error handling and timeout behavior on Unix/macOS.
  • Chores

    • Added two entries to the spell-check dictionary.
    • Updated POSIX shared-memory open argument handling.

✏️ Tip: You can customize this high-level summary in your review settings.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Dec 26, 2025

Caution

Review failed

The pull request is closed.

📝 Walkthrough

Walkthrough

Adds a Unix-focused SemLock implementation and sem_unlink to _multiprocessing, refactors shm_open to accept a FromArgs struct with default mode, tightens buffer bound checking in pystruct, and adds two words to the cspell dictionary (ismine, newsemlockobject).

Changes

Cohort / File(s) Summary
Dictionary
.cspell.dict/cpython.txt
Adds entries: ismine, newsemlockobject.
_multiprocessing (Unix SemLock)
crates/stdlib/src/multiprocessing.rs
Introduces Unix-specific SemLock and SemHandle: constructor, acquire/release, context-manager methods, recursive-mutex tracking, macOS vs other-Unix semaphore handling, sem_unlink, flags() exposure, and class constants (RECURSIVE_MUTEX, SEMAPHORE, SEM_VALUE_MAX).
Shared memory arg parsing
crates/stdlib/src/posixshmem.rs
Replaces shm_open(name, flags, mode, vm) with shm_open(args: ShmOpenArgs, vm), adds ShmOpenArgs: FromArgs and default mode handling; updates call sites and imports.
Struct buffer validation
crates/stdlib/src/pystruct.rs
Changes buffer bounds check from offset >= buffer_len to validate offset + needed > buffer_len, ensuring sufficient remaining space for required bytes.

Sequence Diagram(s)

sequenceDiagram
    participant App as Application
    participant SemLock as _multiprocessing.SemLock
    participant Handle as SemHandle
    participant Kernel as POSIX Kernel

    App->>SemLock: py_new(kind,value,maxvalue,name,unlink)
    SemLock->>Handle: create/open semaphore
    Handle->>Kernel: sem_open / sem_unlink + sem_open
    Kernel-->>Handle: semaphore handle
    Handle-->>SemLock: initialized

    rect rgb(220,240,220)
    Note over App,SemLock: Acquire flow
    App->>SemLock: acquire(blocking, timeout)
    SemLock->>SemLock: check owner / recursive count
    alt already owner
        SemLock->>SemLock: increment count
        SemLock-->>App: True
    else not owner
        SemLock->>Handle: sem_timedwait / sem_wait
        Handle->>Kernel: wait on semaphore
        Kernel-->>Handle: success / timeout / error
        Handle-->>SemLock: result
        SemLock->>SemLock: update state (owner/count)
        SemLock-->>App: True / False / raise
    end
    end

    rect rgb(240,220,220)
    Note over App,SemLock: Release flow
    App->>SemLock: release()
    SemLock->>SemLock: decrement count
    alt count > 0
        SemLock-->>App: still held
    else count == 0
        SemLock->>Handle: sem_post()
        Handle->>Kernel: post semaphore
        Kernel-->>Handle: ack
        Handle-->>SemLock: success
        SemLock-->>App: released
    end
    end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

  • impl multiprocessing SemLock #6542 — Implements the same SemLock functionality, sem_unlink, flags exposure, and includes identical .cspell additions; likely a direct duplicate or closely related implementation.

Poem

🐇
I hopped through code with tiny paws,
Named semaphores and fixed the laws,
I guard the counts and find the gaps,
I nibble bugs and clap my claps —
Hooray for locks and tidy claws!

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Out of Scope Changes check ⚠️ Warning Changes in .cspell.dict/cpython.txt, posixshmem.rs, and pystruct.rs appear tangential; the buffer validation change in pystruct.rs and ShmOpenArgs refactor are not directly tied to SemLock implementation objectives. Remove or justify unrelated changes in .cspell.dict/cpython.txt, posixshmem.rs, and pystruct.rs, or split them into separate PRs aligned with their respective objectives.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly summarizes the main change: implementing the multiprocessing SemLock component, which aligns with the core objective.
Linked Issues check ✅ Passed The PR implements core requirements from #6506 and #3965: SemLock class with constructor and methods, sem_unlink function, and Unix-specific synchronization, fulfilling completed tasks marked in the issue.
Docstring Coverage ✅ Passed Docstring coverage is 89.66% which is sufficient. The required threshold is 80.00%.

📜 Recent review details

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 00e4c92 and 3979249.

📒 Files selected for processing (1)
  • crates/stdlib/src/multiprocessing.rs

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@youknowone youknowone force-pushed the sem branch 7 times, most recently from 37aa27b to b354fba Compare January 2, 2026 04:15
Copilot AI and others added 5 commits January 2, 2026 19:44
Co-authored-by: youknowone <69878+youknowone@users.noreply.github.com>
Co-authored-by: youknowone <69878+youknowone@users.noreply.github.com>
Co-authored-by: youknowone <69878+youknowone@users.noreply.github.com>
Co-authored-by: youknowone <69878+youknowone@users.noreply.github.com>
@youknowone youknowone marked this pull request as ready for review January 2, 2026 14:05
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
crates/stdlib/src/multiprocessing.rs (1)

233-242: Verify the null check in Drop - SEM_FAILED is typically (sem_t*)-1, not null.

The Drop implementation checks !self.raw.is_null(), but libc::SEM_FAILED is usually ((sem_t*)-1) on POSIX systems, not null. This should still be safe since a valid semaphore pointer won't be null, but consider adding a comment clarifying this is guarding against default/uninitialized state rather than SEM_FAILED.

🔎 Suggested clarification
 impl Drop for SemHandle {
     fn drop(&mut self) {
+        // Guard against default/uninitialized state.
+        // Note: SEM_FAILED is (sem_t*)-1, not null, but valid handles are never null.
         if !self.raw.is_null() {
             // SEM_CLOSE(sem) sem_close(sem)
             unsafe {
                 libc::sem_close(self.raw);
             }
         }
     }
 }
📜 Review details

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 346519b and fb6418a.

⛔ Files ignored due to path filters (6)
  • Lib/test/_test_multiprocessing.py is excluded by !Lib/**
  • Lib/test/mp_fork_bomb.py is excluded by !Lib/**
  • Lib/test/mp_preload.py is excluded by !Lib/**
  • Lib/test/mp_preload_flush.py is excluded by !Lib/**
  • Lib/test/test_importlib/test_threaded_import.py is excluded by !Lib/**
  • Lib/test/test_logging.py is excluded by !Lib/**
📒 Files selected for processing (4)
  • .cspell.dict/cpython.txt
  • crates/stdlib/src/multiprocessing.rs
  • crates/stdlib/src/posixshmem.rs
  • crates/stdlib/src/pystruct.rs
🧰 Additional context used
📓 Path-based instructions (1)
**/*.rs

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

**/*.rs: Follow the default rustfmt code style by running cargo fmt to format Rust code
Always run clippy to lint code (cargo clippy) before completing tasks and fix any warnings or lints introduced by changes
Follow Rust best practices for error handling and memory management
Use the macro system (pyclass, pymodule, pyfunction, etc.) when implementing Python functionality in Rust

Files:

  • crates/stdlib/src/pystruct.rs
  • crates/stdlib/src/posixshmem.rs
  • crates/stdlib/src/multiprocessing.rs
🧠 Learnings (2)
📚 Learning: 2025-06-27T14:47:28.810Z
Learnt from: moreal
Repo: RustPython/RustPython PR: 5847
File: vm/src/stdlib/stat.rs:547-567
Timestamp: 2025-06-27T14:47:28.810Z
Learning: In RustPython's stat module implementation, platform-specific constants like SF_SUPPORTED and SF_SYNTHETIC should be conditionally declared only for the platforms where they're available (e.g., macOS), following CPython's approach of optional declaration using #ifdef checks rather than providing fallback values for other platforms.

Applied to files:

  • crates/stdlib/src/multiprocessing.rs
📚 Learning: 2025-06-27T14:47:28.810Z
Learnt from: moreal
Repo: RustPython/RustPython PR: 5847
File: vm/src/stdlib/stat.rs:547-567
Timestamp: 2025-06-27T14:47:28.810Z
Learning: In RustPython's stat module implementation, platform-specific constants like SF_SUPPORTED and SF_SYNTHETIC should be conditionally declared only for the platforms where they're available (e.g., macOS), following CPython's approach of optional declaration rather than providing fallback values for other platforms.

Applied to files:

  • crates/stdlib/src/multiprocessing.rs
🧬 Code graph analysis (2)
crates/stdlib/src/pystruct.rs (1)
crates/vm/src/stdlib/ctypes/base.rs (1)
  • offset (1654-1656)
crates/stdlib/src/multiprocessing.rs (1)
crates/stdlib/src/faulthandler.rs (2)
  • current_thread_id (225-227)
  • current_thread_id (230-232)
🔇 Additional comments (11)
crates/stdlib/src/pystruct.rs (1)

74-86: LGTM! Correct boundary check fix.

The updated condition offset + needed > buffer_len properly validates that the buffer has sufficient space for the entire operation, not just that the offset is within bounds. This fixes cases where the offset was valid but insufficient space remained for the required bytes.

crates/stdlib/src/posixshmem.rs (1)

14-31: Clean refactoring to argument struct pattern.

The ShmOpenArgs struct with FromArgs is a good improvement for argument handling. The default mode 0o600 is a secure choice for shared memory permissions.

crates/stdlib/src/multiprocessing.rs (8)

62-140: LGTM! macOS polled fallback correctly implements CPython's approach.

The exponential backoff (1ms to 20ms cap), deadline checking, and signal handling match CPython's sem_timedwait_save implementation. Good handling of the three error cases.


286-289: Relaxed ordering in recursive mutex path may be correct but deserves scrutiny.

The fetch_add(1, Ordering::Relaxed) is used after the ismine! check which uses Acquire. However, since ismine! only reads, there's no synchronization with other threads that might be checking or modifying count. In practice, since only the owning thread (verified by ismine!) modifies count in this path, Relaxed should be safe. Consider adding a brief comment explaining this reasoning.


508-527: LGTM! Context manager methods correctly delegate to acquire/release.


529-555: LGTM! The _handle parameter is intentionally unused.

The rebuild logic correctly opens the semaphore by name (the handle value is platform-specific and not portable for pickling). Initial count/last_tid of 0 is correct for a freshly rebuilt lock.


557-565: LGTM! Good fork safety handling.

Resetting both count and last_tid after fork is appropriate. The comment correctly notes the last_tid reset is an addition over CPython.


606-630: Note: macOS _is_zero has inherent TOCTOU limitation.

The trywait/post pattern for checking zero on macOS has an inherent race: between the successful sem_trywait and the sem_post to restore the value, another thread could be unblocked by the post. This is a known limitation of the workaround and matches CPython's approach. The method's result should be considered advisory.


657-685: LGTM! Constructor with appropriate validation.

The kind and value validation is correct. The value as u32 cast is safe due to the preceding value >= 0 check.


729-754: LGTM! Helper functions are well-implemented.

The semaphore_name correctly handles POSIX naming conventions, os_error properly maps to specific exception types (FileExistsError, FileNotFoundError), and current_thread_id matches the pattern used elsewhere in the codebase.

.cspell.dict/cpython.txt (1)

33-44: LGTM! Dictionary entries support the new SemLock implementation.

Both ismine (matching CPython's ISMINE macro) and newsemlockobject (matching CPython's internal function) are correctly added and alphabetically ordered.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

📜 Review details

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between fb6418a and 00e4c92.

⛔ Files ignored due to path filters (6)
  • Lib/test/_test_multiprocessing.py is excluded by !Lib/**
  • Lib/test/mp_fork_bomb.py is excluded by !Lib/**
  • Lib/test/mp_preload.py is excluded by !Lib/**
  • Lib/test/mp_preload_flush.py is excluded by !Lib/**
  • Lib/test/test_importlib/test_threaded_import.py is excluded by !Lib/**
  • Lib/test/test_logging.py is excluded by !Lib/**
📒 Files selected for processing (4)
  • .cspell.dict/cpython.txt
  • crates/stdlib/src/multiprocessing.rs
  • crates/stdlib/src/posixshmem.rs
  • crates/stdlib/src/pystruct.rs
🚧 Files skipped from review as they are similar to previous changes (3)
  • crates/stdlib/src/posixshmem.rs
  • crates/stdlib/src/pystruct.rs
  • .cspell.dict/cpython.txt
🧰 Additional context used
📓 Path-based instructions (1)
**/*.rs

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

**/*.rs: Follow the default rustfmt code style by running cargo fmt to format Rust code
Always run clippy to lint code (cargo clippy) before completing tasks and fix any warnings or lints introduced by changes
Follow Rust best practices for error handling and memory management
Use the macro system (pyclass, pymodule, pyfunction, etc.) when implementing Python functionality in Rust

Files:

  • crates/stdlib/src/multiprocessing.rs
🧠 Learnings (2)
📚 Learning: 2025-06-27T14:47:28.810Z
Learnt from: moreal
Repo: RustPython/RustPython PR: 5847
File: vm/src/stdlib/stat.rs:547-567
Timestamp: 2025-06-27T14:47:28.810Z
Learning: In RustPython's stat module implementation, platform-specific constants like SF_SUPPORTED and SF_SYNTHETIC should be conditionally declared only for the platforms where they're available (e.g., macOS), following CPython's approach of optional declaration using #ifdef checks rather than providing fallback values for other platforms.

Applied to files:

  • crates/stdlib/src/multiprocessing.rs
📚 Learning: 2025-06-27T14:47:28.810Z
Learnt from: moreal
Repo: RustPython/RustPython PR: 5847
File: vm/src/stdlib/stat.rs:547-567
Timestamp: 2025-06-27T14:47:28.810Z
Learning: In RustPython's stat module implementation, platform-specific constants like SF_SUPPORTED and SF_SYNTHETIC should be conditionally declared only for the platforms where they're available (e.g., macOS), following CPython's approach of optional declaration rather than providing fallback values for other platforms.

Applied to files:

  • crates/stdlib/src/multiprocessing.rs
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)
  • GitHub Check: Run rust tests (windows-latest)
  • GitHub Check: Run snippets and cpython tests (macos-latest)
  • GitHub Check: Run snippets and cpython tests (windows-latest)
  • GitHub Check: Check Rust code with clippy
  • GitHub Check: Run snippets and cpython tests on wasm-wasi
  • GitHub Check: Run snippets and cpython tests (ubuntu-latest)
  • GitHub Check: Run rust tests (ubuntu-latest)
  • GitHub Check: Run rust tests (macos-latest)
  • GitHub Check: Ensure compilation on various targets
  • GitHub Check: Run tests under miri
  • GitHub Check: Check the WASM package and demo
🔇 Additional comments (19)
crates/stdlib/src/multiprocessing.rs (19)

44-61: LGTM!

The module setup and imports are appropriate for a POSIX semaphore implementation. The use of atomic types for shared state is correct.


62-140: LGTM! macOS workaround for missing sem_timedwait.

The polled fallback implementation correctly handles the macOS limitation where sem_timedwait is unavailable. The exponential backoff (capped at 20ms), signal checking, and timeout handling align with CPython's approach.


142-152: LGTM!

Constants match CPython's multiprocessing module. The ismine! macro correctly uses Acquire ordering to read shared state.


154-178: LGTM!

Data structures are well-designed. The unsafe impl Send/Sync for SemHandle is correct since POSIX semaphores are thread-safe.


231-243: Drop implementation is correct.

The null check is sufficient since SEM_FAILED is never stored (errors are returned immediately at lines 200-202 and 218-221). The comment accurately explains this invariant.


245-266: LGTM!

Property getters are straightforward and correct.


330-424: Acquire logic correctly handles signals and timeouts.

The implementation properly:

  • Attempts fast acquisition with sem_trywait
  • Falls back to blocking wait with appropriate platform handling
  • Checks signals after EINTR
  • Uses correct memory ordering (Release) when updating count and last_tid

438-442: LGTM! Race condition resolved.

The code now correctly uses fetch_sub (line 440) for atomic decrement, addressing the race condition flagged in the previous review. The Release ordering ensures proper synchronization.


460-481: macOS maxvalue validation is limited to maxvalue==1.

The HAVE_BROKEN_SEM_GETVALUE workaround on macOS only validates the maxvalue==1 case. For higher maxvalue, releases are not validated. This is a known CPython limitation and is acceptable.


484-490: LGTM!

The sem_post call and atomic count decrement correctly release the semaphore.


493-512: LGTM!

Context manager implementation correctly delegates to acquire() and release().


514-539: LGTM!

The _rebuild classmethod correctly reconstructs a SemLock from pickled state, matching CPython's pattern. Initializing count and last_tid to zero is correct for a deserialized object.


541-611: LGTM! Introspection methods handle platform differences correctly.

The implementations appropriately handle macOS's HAVE_BROKEN_SEM_GETVALUE limitation:

  • _get_value() raises NotImplementedError on macOS
  • _is_zero() uses the sem_trywait + sem_post workaround on macOS

The _after_fork() method correctly resets acquisition state for child processes.


613-635: LGTM!

The extend_class method correctly exposes constants and computes SEM_VALUE_MAX using sysconf, with appropriate fallback to i32::MAX.


637-664: LGTM!

Constructor properly validates inputs and initializes the SemLock with correct initial state.


666-676: LGTM!

The sem_unlink function correctly wraps the POSIX sem_unlink call with appropriate error handling.


678-704: LGTM!

The flags function correctly exposes platform capabilities matching CPython's feature detection.


706-731: LGTM! Helper functions are correctly implemented.

The implementations match CPython's patterns:

  • semaphore_name ensures POSIX-compliant naming
  • os_error maps errno values to appropriate exception types
  • current_thread_id matches the pattern used in faulthandler.rs

Based on learnings, the platform-specific implementation approach is appropriate for RustPython.


44-732: Ensure code passes cargo fmt and cargo clippy checks before merging.

Per the repository's coding guidelines for Rust files, run cargo fmt to format the code and cargo clippy to lint it, addressing any warnings or violations introduced by these changes.

Comment on lines +310 to +328
// deadline calculation:
// long sec = (long) timeout;
// long nsec = (long) (1e9 * (timeout - sec) + 0.5);
// deadline.tv_sec = now.tv_sec + sec;
// deadline.tv_nsec = now.tv_usec * 1000 + nsec;
// deadline.tv_sec += (deadline.tv_nsec / 1000000000);
// deadline.tv_nsec %= 1000000000;
let sec = timeout as libc::c_long;
let nsec = (1e9 * (timeout - sec as f64) + 0.5) as libc::c_long;
let mut deadline = libc::timespec {
tv_sec: tv.tv_sec + sec as libc::time_t,
tv_nsec: (tv.tv_usec as libc::c_long * 1000 + nsec) as _,
};
deadline.tv_sec += (deadline.tv_nsec / 1_000_000_000) as libc::time_t;
deadline.tv_nsec %= 1_000_000_000;
Some(deadline)
} else {
None
};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Consider adding overflow checks for extremely large timeout values in deadline calculation.

The deadline arithmetic (lines 317-324) could silently overflow with extreme timeout values. The cast timeout as libc::c_long and the addition tv.tv_sec + sec lack bounds checking. If timeout approaches or exceeds c_long::MAX, the cast will wrap, producing incorrect deadline values. While this behavior mirrors CPython's implementation, RustPython should either validate input ranges before the cast, use saturating arithmetic (saturating_add, saturating_cast), or document this known limitation explicitly.

Consider adding a check like: if timeout > (c_long::MAX as f64) { ... handle error ... }

@youknowone youknowone merged commit 546d35b into RustPython:main Jan 2, 2026
12 of 13 checks passed
@youknowone youknowone deleted the sem branch January 2, 2026 15:14
@coderabbitai coderabbitai bot mentioned this pull request Jan 16, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

_multiprocessing_SemLock

1 participant