[WIP] Use of new dask CLI: job submission #6738
Conversation
Initial commit for structure and basic implementation of `dask job submit`
Can one of the admins verify this patch?
Unit Test Results — see the test report for an extended history of previous test failures (useful for diagnosing flaky tests). 15 files ±0, 15 suites ±0, 6h 7m 17s ⏱️ (−18m 37s). For more details on these failures, see this check. Results for commit b854dc9; comparison against base commit 930d3dc.
```python
def runscript():
    # TODO: use a tempfile
    filepath = '/tmp/tmpfile'
    with open(filepath, 'w') as f:
        f.write(script)

    st = os.stat(filepath)
    os.chmod(filepath, st.st_mode | stat.S_IEXEC | stat.S_IREAD)

    # TODO: need to set DASK_SCHEDULER_ADDRESS env variable in call to execute
    # for our script; should be the address of the scheduler
    # might need to use another approach in subprocess for this
    return subprocess.check_output(filepath)
```
I recommend moving this out to a top-level private function so that serialization is easier: `client.run_on_scheduler(_run_script, script)`
```python
@job.command()
def gather():
    ...
```
I think that we should drop this function. I'm not sure what it would do.
I think that we need to think a little bit about how this would be used. When submitting a script, how does that script get access to the scheduler? Naively I might think that the following should work:

```python
from dask.distributed import Client

client = Client()  # connects to the current Dask scheduler

df = dask.dataframe.read_parquet(...)
df....
```

Great. If this is the kind of workflow that we're thinking of then we'll need to address a few issues. If this isn't the kind of workflow that we're thinking of then we should figure out what that workflow is and make sure that it works well.
Like @mrocklin said, the cluster discovery seems like a major question here (and something that I believe has been out of scope for the core dask project up until now). This seems closely related to dask-ctl (https://github.com/dask-contrib/dask-ctl), cc @jacobtomlinson. Maybe a
@gjoseph92 this work was done at the SciPy sprints, where @mrocklin, @jsignell, @jcrist, @charlesbluca and I made some longer-term plans about the CLI in Dask generally. I think the plan is to migrate some of the core functionality. @douglasdavis started some of this work to move us towards an extensible CLI. I have an action from the sprints to write this topic up in design-docs.
Haven't had time to circle back to this, folks. Apologies for the delay, and thank you for the feedback!
Implementation of `dask job submit`. This is intended to consume a python script. There are potentially two different usage patterns we are anticipating:

- A script that uses a `dask` collection or builds `delayed` objects and calls `<object>.compute()`; this should use the `dask` cluster to execute all `.compute()` calls.
- A script that doesn't use `dask`-isms at all, but is intended to be a single worker job, as if the script were just a function submitted via `client.submit`.

It may make sense for these two patterns to be served by two different subcommands. Exploring that here to settle on an approach.
`pre-commit run --all-files`