Traces should include SQL executed by subtasks created with asyncio.gather #1576

@simonw

I tried running some parallel SQL queries using asyncio.gather() but the SQL that was executed didn't show up in the trace rendered by https://datasette.io/plugins/datasette-pretty-traces

I realized that was because traces are keyed against the current task ID, which changes when a sub-task is run using asyncio.gather or similar.

The faceting and suggest faceting queries are missing from this trace:

[Screenshot of the trace]

The reason they aren't showing up in the traces is that traces are stored just for the currently executing asyncio task ID:

    # asyncio.current_task was introduced in Python 3.7:
    for obj in (asyncio, asyncio.Task):
        current_task = getattr(obj, "current_task", None)
        if current_task is not None:
            break


    def get_task_id():
        try:
            loop = asyncio.get_event_loop()
        except RuntimeError:
            return None
        return id(current_task(loop=loop))

This is so that traces for other incoming requests don't end up mixed together. But there is currently no mechanism for tracking async tasks that are effectively "child tasks" of the current request, and whose SQL should therefore be recorded in the same trace.
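To illustrate the problem, here is a minimal standalone sketch (not Datasette code; the function names are made up for this example) showing that coroutines run through asyncio.gather() each get their own task, so an ID derived from the current task differs from the parent's:

```python
import asyncio


async def child_task_id():
    # asyncio.gather() wraps each coroutine in its own Task, so the
    # "current task" here is not the task that called gather().
    return id(asyncio.current_task())


async def compare_task_ids():
    parent_id = id(asyncio.current_task())
    child_ids = await asyncio.gather(child_task_id(), child_task_id())
    return parent_id, list(child_ids)


parent_id, child_ids = asyncio.run(compare_task_ids())
# The parent's ID is not among the children's IDs
print(parent_id in child_ids)  # False
```

Anything the children record against their own task ID is therefore invisible to code that looks up traces by the parent's task ID.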

https://stackoverflow.com/a/69349501/6083 suggests that you pass the task ID as an argument to the child tasks that are executed using asyncio.gather() to work around this kind of problem.
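A rough sketch of that workaround, assuming a simple dictionary-based trace store: `traces`, `record`, `run_query` and `handle_request` are hypothetical names for illustration and are not part of Datasette's tracer.

```python
import asyncio
from collections import defaultdict

# Hypothetical trace store keyed by task ID (a stand-in for a real tracer)
traces = defaultdict(list)


def record(trace_id, sql):
    traces[trace_id].append(sql)


async def run_query(sql, trace_id):
    # trace_id is the parent's task ID, passed in explicitly rather than
    # looked up from asyncio.current_task() inside the child task
    await asyncio.sleep(0)  # stand-in for actually executing the SQL
    record(trace_id, sql)


async def handle_request():
    trace_id = id(asyncio.current_task())
    record(trace_id, "select * from t")
    # Child tasks receive the parent's ID, so their SQL is filed with it
    await asyncio.gather(
        run_query("select count(*) from t", trace_id),
        run_query("select a, count(*) from t group by a", trace_id),
    )
    return traces[trace_id]


recorded = asyncio.run(handle_request())
print(len(recorded))  # 3
```

The cost of this approach is that every function that might execute SQL inside a child task has to accept and forward the extra argument.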

Originally posted by @simonw in #1518 (comment)
