Closing a client sets dask.config 'shuffle' and 'scheduler' to None #2473

@gsakkis

import dask
import dask.dataframe as dd
import distributed
import pandas as pd
import numpy as np


def set_index(ddf, msg=''):
    # Report the two config values at this point, then run a shuffle-based operation.
    print("{} shuffle: {}".format(msg, dask.config.get('shuffle', 'MISSING')))
    print("{} scheduler: {}".format(msg, dask.config.get('scheduler', 'MISSING')))
    ddf.set_index('A').compute()


if __name__ == '__main__':
    df = pd.DataFrame({'A': np.random.rand(100)})
    ddf = dd.from_pandas(df, npartitions=2)
    set_index(ddf, "before client")
    with distributed.Client():
        set_index(ddf, "inside client")
    # Fails: after the client closes, both keys are None instead of missing.
    set_index(ddf, "after client")

$ python tmp/dask/shuffle_none.py
before client shuffle: MISSING
before client scheduler: MISSING
inside client shuffle: tasks
inside client scheduler: dask.distributed
after client shuffle: None
after client scheduler: None
Traceback (most recent call last):
...
NotImplementedError: Unknown shuffle method None
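
Until this is fixed, one possible stopgap is to delete the keys that were left as None so that lookups fall back to their defaults again. The sketch below is only an illustration: `drop_none_keys` is a hypothetical helper, not part of dask or distributed, and it runs against a plain dict standing in for the config state printed above.

```python
from typing import Iterable, List, MutableMapping


def drop_none_keys(config: MutableMapping, keys: Iterable[str]) -> List[str]:
    """Delete each key in `keys` whose value is None; return the keys removed.

    Hypothetical helper for illustration; not a dask or distributed API.
    """
    removed = []
    for key in keys:
        if key in config and config[key] is None:
            del config[key]
            removed.append(key)
    return removed


# Plain dict standing in for the state reported after the client closed.
# 'other-key' is a made-up entry to show that unrelated keys are untouched.
config = {'shuffle': None, 'scheduler': None, 'other-key': 'kept'}

removed = drop_none_keys(config, ['shuffle', 'scheduler'])
print("removed:", removed)
print("shuffle:", config.get('shuffle', 'MISSING'))
print("scheduler:", config.get('scheduler', 'MISSING'))
```

In a real session the call would presumably be `drop_none_keys(dask.config.config, ['shuffle', 'scheduler'])` placed right after the `with distributed.Client():` block, assuming `dask.config.config` is the global dict that `dask.config.get` reads. This only hides the symptom; the client should not leave these keys set to None on close.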

Labels: bug, good first issue
