Untangled Development

Django & FastAPI — Python async in web applications

In a previous post I asked: do you need a task queue in your web app?

That was in 2020. Since then, async programming in both Python and Django has come a long way.

Does the concept of async replace a task queue? It depends.

What is async?

I think the simplest way to explain it is to compare it to synchronous, or “sync”, programming.

Let’s assume the sleep calls below represent an I/O bound task:

# demo_sync.py

import time


def say_hello(x: int):
    time.sleep(x)  # simulate I/O blocking call
    print(f"{time.strftime('%X')} - Hello world {x}")


print(f"{time.strftime('%X')} - Started")
for i in range(1, 4):
    say_hello(i)

In the above snippet, each call to say_hello:

  1. waits for x seconds using sleep to simulate an I/O blocking call
  2. prints the current time.

Here’s the above snippet’s output:

22:47:00 - Started
22:47:01 - Hello world 1
22:47:03 - Hello world 2
22:47:06 - Hello world 3

Each call blocks until its sleep finishes, so the prints land 1, 3 and 6 seconds after the start, as shown in the below diagram [1]:

Figure 2. Sync snippet: task execution over time.

Let’s now see the same snippet, reworked the async way (using Python 3.12):

# demo_async.py

import asyncio
import time


async def say_hello(x: int):
    await asyncio.sleep(x)  # simulate an I/O wait without blocking the event loop
    print(f"{time.strftime('%X')} - Hello world {x}")


async def main():
    print(f"{time.strftime('%X')} - Started")
    # asyncio.TaskGroup runs multiple instances of say_hello concurrently
    async with asyncio.TaskGroup() as task_group:
        for i in range(1, 4):
            task_group.create_task(say_hello(i))


# To run the async function in an event loop
asyncio.run(main())

The above snippet runs the “say hello” task asynchronously. Let’s look at its output below:

22:48:00 - Started
22:48:01 - Hello world 1
22:48:02 - Hello world 2
22:48:03 - Hello world 3

Since say_hello is now a coroutine, the thread is not blocked waiting for the I/O operation to complete. Rather, it goes on processing the other tasks, so all three finish within 3 seconds instead of 6:

Figure 4. Async snippet: task execution over time.

The below diagram shows the execution of I/O blocking tasks within the same Python process:

Figure 5. Sync vs async: task execution over time.

It appears that the tasks are running in parallel, but they are not.

There is still only one thread doing all the work. But, unlike sync, in async the thread (running the event loop) is not blocked by I/O operations. Instead, it can continue doing other work while a task is waiting for I/O to complete.

This saves us from having to deal with many of the concurrency issues we would have with real threads, such as atomic variable updates, semaphores/locking, etc.
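
A quick way to check the single-thread claim is to record which thread each task resumes on. The snippet below is a minimal sketch: the worker function and the seen_threads set are names made up for this illustration, and it uses asyncio.gather rather than asyncio.TaskGroup so that it also runs on Python versions older than 3.11.

```python
# demo_single_thread.py (hypothetical file name, for illustration only)

import asyncio
import threading


async def worker(delay: float, seen_threads: set) -> None:
    await asyncio.sleep(delay)  # hand control back to the event loop while "waiting"
    seen_threads.add(threading.get_ident())  # record the thread this task resumed on


async def main() -> set:
    seen_threads = set()
    # Run three workers concurrently on the same event loop
    await asyncio.gather(*(worker(0.01 * i, seen_threads) for i in range(1, 4)))
    return seen_threads


thread_ids = asyncio.run(main())
print(f"Tasks ran on {len(thread_ids)} thread(s)")
```

All three tasks report the same thread id, which is why no locks are needed around the shared set here.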

Great!

So can we replace our task queues with async Python? After all, task queues are all about adding async behaviour to our HTTP request/response cycle.

When to async?

The rate at which a process progresses can be limited in one of these ways (source):

  • CPU bound
  • I/O bound
  • memory bound
  • cache bound

But our decision usually depends on whether our bottleneck is caused by CPU or I/O. To recap:

CPU Bound means the rate at which a process progresses is limited by the speed of the CPU. A task that performs calculations on a small set of numbers, for example multiplying small matrices, is likely to be CPU bound.

I/O Bound means the rate at which a process progresses is limited by the speed of the I/O subsystem. A task that processes data from disk, for example counting the number of lines in a file, is likely to be I/O bound.

If our bottleneck is I/O, such as an HTTP API call over the network, then it makes sense to use async.
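
As a rough sketch of that I/O-bound case, the snippet below fires three simulated API calls at once and collects their results. The fetch function and the endpoint paths are hypothetical, and asyncio.sleep stands in for the network wait. In a real application an async HTTP client would take its place.

```python
# demo_io_bound.py (hypothetical file name, for illustration only)

import asyncio
import time


async def fetch(endpoint: str, latency: float) -> str:
    # Hypothetical stand-in for an HTTP call: asyncio.sleep plays the network wait
    await asyncio.sleep(latency)
    return f"response from {endpoint}"


async def main() -> list:
    # gather starts all three "requests" and returns their results in call order
    return await asyncio.gather(
        fetch("/users", 0.1),
        fetch("/orders", 0.1),
        fetch("/invoices", 0.1),
    )


started = time.perf_counter()
responses = asyncio.run(main())
elapsed = time.perf_counter() - started

print(responses)
print(f"3 calls of 0.1s each took about {elapsed:.2f}s")
```

Because the waits overlap, the total time should be close to the slowest single call rather than the sum of all three.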

If the bottleneck is CPU, then offloading to a task queue is the better option. When we execute a task via a task queue, we are triggering a task in a separate Python process.

This separate process can be on the same host, or even on another host on the network. In any case it’s executed by a separate process. In Django this implies a task queue such as the (in)famous Celery or Huey.
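
One way to see why async alone does not help with CPU-bound work: a coroutine that only computes never awaits, so the event loop cannot switch to anything else until it is done. The sketch below records the order of events. The names cpu_task, io_task and events are made up for this illustration, and it shows only the blocking problem, not a task queue setup.

```python
# demo_cpu_bound.py (hypothetical file name, for illustration only)

import asyncio

events = []


def cpu_heavy(n: int) -> int:
    # Pure computation: no await, so the event loop never regains control here
    return sum(i * i for i in range(n))


async def cpu_task() -> None:
    events.append("cpu start")
    cpu_heavy(200_000)
    events.append("cpu end")


async def io_task() -> None:
    events.append("io start")
    await asyncio.sleep(0)  # yields to the event loop, like a real I/O wait would
    events.append("io end")


async def main() -> None:
    await asyncio.gather(io_task(), cpu_task())


asyncio.run(main())
print(events)
```

The I/O task yields almost immediately, yet it can only resume after the CPU-bound coroutine has run to completion. Running that computation in a separate process keeps the event loop free.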

To conclude

The above aims to show the most basic difference between Python async and task queues for web application purposes.

Python async and asynchronous task queues are different tools, with different features beyond the basics described above.

While both have “async” behaviour, they serve different purposes and solve different classes of problem, as shown above.

Credits

Massive thanks to Ronald Moesbergen, seasoned DevOps/Cloud Engineer, for explaining Python async concepts and reviewing this article.


  1. Diagrams above drawn using the excellent excalidraw.com
