Allow actor exceptions to propagate#4232
Conversation
|
This seems to take more code than I would have thought, to wrap and unwrap the results dict in various places. |
e2483ad to
a3d0cff
Compare
distributed/worker.py
Outdated
| executing={ | ||
| key: now - self.tasks[key].start_time | ||
| for key in self.active_threads.values() | ||
| if key in self.tasks |
There was a problem hiding this comment.
This line is a little worrysome. Without it, you get KeyError here during heartbeat after a failed actor action, but I don't really know why. Should the "active thread" be cleaned up somewhere?
|
CI fails on apparently unrelated timeout in test_broken_worker_during_computation on py36 only |
|
If this still passes, I would really like to see it merged! I may offer a "please object" timeframe. |
|
@jrbourbeau would you be able to take a look or do you know who would be able to? |
|
OK, all passed except some unrelated and presumably flaky steal test in one run. |
| def prop(self): | ||
| raise MyException | ||
|
|
||
| with cluster(nworkers=2) as (cl, w): |
There was a problem hiding this comment.
Is there a reason to use cluster here instead of @gen_cluster like most of the other test in this module?
There was a problem hiding this comment.
It was to test the sync API, which is more typical for actors. I added an async version immediately below (could remove this one, if you like).
There was a problem hiding this comment.
It was to test the sync API, which is more typical for actors. I added an async version immediately below (could remove this one, if you like).
To be clear here, the sync api is more common for everything. We prefer the async tests because they are faster/easier on CI and allow for greater debuggability. Adding sync tests too is fine if we want to be extra careful, but in general we prefer async tests. In general if async tests I usually have confidence that sync works just as well, unless I'm explicitly writing code to handle synchronization.
What's here is great though.
There was a problem hiding this comment.
What's here is great though.
I don't mind keeping it or not. Writing the sync version first probably reflects how I initially tested by hand. It was some time ago, so I can't remember if there was any other reason, given that the async version is effectively identical and works just fine.
jrbourbeau
left a comment
There was a problem hiding this comment.
Thanks for the updates @martindurant! Just a few small comments on test_actor.py, otherwise I think this is good to merge
distributed/tests/test_actor.py
Outdated
| from distributed.utils_test import ( # noqa: F401 | ||
| async_wait_for, | ||
| cluster, | ||
| cluster_fixture, | ||
| gen_cluster, | ||
| loop, | ||
| ) |
There was a problem hiding this comment.
Following #4888, all pytest fixtures in distributed.utils_test are globally available. So there's no need for the cluster_fixture or loop imports (and hopefully we can drop # noqa: F401 too).
|
Do we have a tracking issue for ongoing CI failures? The two here are not actor related, as far as I can tell. |
|
There are a variety of issues, generally one per failure
…On Mon, Jun 14, 2021 at 8:05 AM Martin Durant ***@***.***> wrote:
Do we have a tracking issue for ongoing CI failures? The two here are not
actor related, as far as I can tell.
—
You are receiving this because your review was requested.
Reply to this email directly, view it on GitHub
<#4232 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AACKZTBDC2GSCM3V3PIUFRTTSX5AFANCNFSM4TQXARUA>
.
|
Yeah, we keep track of these issues with a |
Co-authored-by: James Bourbeau <jrbourbeau@users.noreply.github.com>
jrbourbeau
left a comment
There was a problem hiding this comment.
Thanks @martindurant! Will merge once CI finishes
Follows on from #4225
Closes dask/dask#7626