Semaphore gets released too often #4147

@lfdversluis

To avoid generating too much NFS traffic, I am using a semaphore to limit how many workers copy data from NFS to local file storage at once, so the data can be read locally later:

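# copy_tree is assumed to be distutils.dir_util.copy_tree (it accepts update=True)
from distutils.dir_util import copy_tree

from dask_jobqueue import SLURMCluster
from distributed import Client, Semaphore

# root_dataset (NFS path) and local_dataset (node-local path) are defined elsewhere.

cluster = SLURMCluster(cores=16, memory="64 GB",  # processes=1,
                       local_directory="/var/scratch/me/dask_scheduler_spill",
                       interface='ib0', walltime='24:00:00')

# Create a client to submit to.
client = Client(cluster)

# Allocate 10 nodes in the cluster
cluster.scale(10)

print("Waiting for workers")
# Wait until they are ready
client.wait_for_workers(10)
print("Workers are ready!")

# Allow at most 4 concurrent copies
sem = Semaphore(max_leases=4, name='data_copy')

def copy_data(location, target, sem):
    # Hold a lease for the duration of the copy
    with sem:
        copy_tree(location, target, update=True)

# Run the copy once on every worker
client.run(copy_data, root_dataset, local_dataset, sem)
sem.close()  # Clean up the semaphore at the scheduler side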

When I run it, I get the following error:

site-packages/distributed/semaphore.py", line 472, in release
    raise RuntimeError("Released too often")
RuntimeError: Released too often

While debugging, I found that setting processes=1 when constructing the SLURMCluster does **not** fix this, so it looks like a threading issue to me. #4057 should have fixed that, but apparently it did not. Creating the semaphore with sem = Semaphore(max_leases=4, name='data_copy', client=client, register=True) did not work either.
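For readers unfamiliar with this class of error, here is a minimal standard-library sketch of what "released too often" means for a bounded semaphore. It uses `threading.BoundedSemaphore` purely as an analogy; it is not the `distributed.Semaphore` implementation and does not diagnose the cause of the error above. The helper name `release_count_until_error` is hypothetical.

```python
import threading


def release_count_until_error(max_leases: int) -> int:
    """Acquire every lease once, then release until the semaphore refuses.

    Returns how many releases succeeded before the over-release was detected.
    """
    sem = threading.BoundedSemaphore(max_leases)

    # Take all available leases.
    for _ in range(max_leases):
        sem.acquire()

    releases = 0
    try:
        while True:
            sem.release()
            releases += 1
    except ValueError:
        # BoundedSemaphore raises ValueError when a release would push the
        # counter above its initial value, i.e. more releases than acquires.
        return releases


if __name__ == "__main__":
    print(release_count_until_error(4))
```

Each lease can be given back exactly once; one extra release, for example if the same lease is released from two code paths, is what triggers the error in this analogy.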
