[FEA] distributed_helper used to allow mem_order="acquire" in spin_lock_wait, but in 4.3.1 pip package only "relaxed" is exposed via #2845

@aleozlx

Which component requires the feature?

CuTe DSL

Feature Request

Is your feature request related to a problem? Please describe.

distributed_helper used to allow mem_order="acquire" in spin_lock_wait, but in the 4.3.1 pip package only the "relaxed" variant is exposed, via spin_lock_atom_cas_relaxed_wait. It would be helpful if an "acquire" version were exposed as well.

Describe the solution you'd like
One possibility: expose spin_lock_atom_cas_acquire_wait alongside the relaxed variant.

Describe alternatives you've considered
Keep the mem_order string argument.
Maybe some of our source code was ported from the examples? I'm not sure and need to check.
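
To make the mem_order alternative concrete, here is a rough sketch of a caller-side shim that keeps the string argument and dispatches to per-memory-order functions. Everything in it is an assumption for illustration: the two stub functions only stand in for spin_lock_atom_cas_relaxed_wait and the proposed spin_lock_atom_cas_acquire_wait (which does not exist in 4.3.1), the name spin_lock_wait_compat is hypothetical, and arguments are forwarded untouched because the real signatures are not reproduced here.

```python
# Hypothetical compatibility shim: NOT the cutlass API.
# The stubs below only stand in for the real wait routines so the
# dispatch logic can be run on its own.

def _relaxed_wait_stub(*args, **kwargs):
    # Stand-in for spin_lock_atom_cas_relaxed_wait (exposed in 4.3.1).
    return ("relaxed", args, kwargs)


def _acquire_wait_stub(*args, **kwargs):
    # Stand-in for the proposed spin_lock_atom_cas_acquire_wait.
    return ("acquire", args, kwargs)


# Map each supported mem_order string to the function implementing it.
_WAIT_BY_MEM_ORDER = {
    "relaxed": _relaxed_wait_stub,
    "acquire": _acquire_wait_stub,
}


def spin_lock_wait_compat(*args, mem_order="relaxed", **kwargs):
    """Dispatch on mem_order and forward all other arguments unchanged."""
    try:
        wait_fn = _WAIT_BY_MEM_ORDER[mem_order]
    except KeyError:
        supported = ", ".join(sorted(_WAIT_BY_MEM_ORDER))
        raise ValueError(
            f"unsupported mem_order {mem_order!r}; expected one of: {supported}"
        ) from None
    return wait_fn(*args, **kwargs)
```

With something like this, existing call sites that pass mem_order="acquire" would not need to change; only the table entry would have to point at a real acquire implementation once one is exposed.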

Additional context
Used by https://github.com/aleozlx/flashinfer/blob/442dec9bea569f53e01b799a2e0328c2ea30bbca/flashinfer/cute_dsl/gemm_allreduce_two_shot.py#L1399-L1403
https://github.com/NVIDIA/cutlass/blob/v4.3.1/python/CuTeDSL/cutlass/utils/distributed_helpers.py#L136