Conversation
7c9b3e3 to
4865250
Compare
fjetter
left a comment
There was a problem hiding this comment.
If this helps, I'm +1. I think the complexity here is not too bad, even if the patch is not accepted in CPython.
Regarding the OSX slowness. Under which conditions do we experience any slowness? I'm worried about further slowing down our CI since OSX jobs are already the slowest.
Is this "slowness" significant? Would a user working on a mac book notice this when connecting to a cluster?
Unit Test ResultsSee test report for an extended history of previous test failures. This is useful for diagnosing flaky tests. 15 files ±0 15 suites ±0 6h 46m 1s ⏱️ + 11m 11s Results for commit 09c75e9. ± Comparison against base commit 3f91ad3. ♻️ This comment has been updated with latest results. |
The numeric path should never hit the kernel or a lock, it's supposed to be implemented with just plain bytes parsing however our choice of what to do when this is slow is limited - we don't know if it's slow until we call it, and we can only know it was too slow after the damage has already been done and we wakeup eg something like this: def _getaddrinfo_debug(loop, host, port, family, type, proto, flags):
# based on https://github.com/python/cpython/blob/63140b445e4a303df430b3d60c1cd4ef34f27c03/Lib/asyncio/base_events.py#L833-L856
t0 = loop.time()
addrinfo = socket.getaddrinfo(host, port, family, type, proto, flags)
dt = loop.time() - t0
if dt >= loop.slow_callback_duration:
msg = [f"{host}:{port!r}"]
if family:
msg.append(f'family={family!r}')
if type:
msg.append(f'type={type!r}')
if proto:
msg.append(f'proto={proto!r}')
if flags:
msg.append(f'flags={flags!r}')
msg = ', '.join(msg)
raise RuntimeError(f'Getting address info {msg} took {dt * 1e3:.3f}ms: {addrinfo!r}')
return addrinfo |
def _getaddrinfo_debug(loop, host, port, family, type, proto, flags):
# based on https://github.com/python/cpython/blob/63140b445e4a303df430b3d60c1cd4ef34f27c03/Lib/asyncio/base_events.py#L833-L856
t0 = loop.time()
addrinfo = socket.getaddrinfo(host, port, family, type, proto, flags)
dt = loop.time() - t0
if dt >= loop.slow_callback_duration:
msg = [f"{host}:{port!r}"]
if family:
msg.append(f'family={family!r}')
if type:
msg.append(f'type={type!r}')
if proto:
msg.append(f'proto={proto!r}')
if flags:
msg.append(f'flags={flags!r}')
msg = ', '.join(msg)
raise RuntimeError(f'Getting address info {msg} took {dt * 1e3:.3f}ms: {addrinfo!r}')
return addrinfo |
gjoseph92
left a comment
There was a problem hiding this comment.
we could time the blocking call and throw an error to let the user know
This is vaguely tempting to me mostly because it costs very little to check, but could be nice to have running in CI. If the error never shows up then it can make us more confident this was the right thing to do.
fjetter
left a comment
There was a problem hiding this comment.
Given that this only affects OSX machines, if the problem even exists, I'm OK with this without any additional debugging logs.
The only use case for OSX are LocalClusters and Clients and I don't think slow DNS should be a topic for local clusters.
We do have the "tick warning" implemented and should already get warnings if the loop is blocked for a prolonged period of time on a cluster.
Clients would not be impacted by this since the event loop is most of the time idle anyhow on clients.
should help with #6846
pre-commit run --all-files