82

I have a Python script that I suspect has a deadlock. I tried to debug it with pdb, but when I step through it line by line the deadlock doesn't occur, and from the output I can see that it doesn't hang on the same iteration each time. I would like to attach a debugger to my script only once it is stuck. Is that possible? I'm open to using other debuggers if necessary.


7 Answers

81

At the time of writing, pdb could not halt and begin debugging an already running program (CPython 3.14 later added this; see the answer about python -m pdb -p further down). You have a few other options:

GDB

You can use GDB to debug at the C level. This is a bit more abstract because you're poking around Python's C source code rather than your actual Python script, but it can be useful for some cases. The instructions are here: https://wiki.python.org/moin/DebuggingWithGdb. They are too involved to summarise here.

Third-Party Extensions & Modules

Just googling for "pdb attach process" reveals a couple of projects that give PDB this ability:
Pyringe: https://github.com/google/pyringe
Pycharm: https://blog.jetbrains.com/pycharm/2015/02/feature-spotlight-python-debugger-and-attach-to-process/
This page of the Python wiki has several alternatives: https://wiki.python.org/moin/PythonDebuggingTools


For your specific use case, I have some ideas for workarounds:

Signals

If you're on unix, you can use signals like in this blog post to try to halt and attach to a running script.

This quote block is copied directly from the linked blog post:

Of course pdb has already got functions to start a debugger in the middle of your program, most notably pdb.set_trace(). This however requires you to know where you want to start debugging, it also means you can't leave it in for production code.

But I've always been envious of what I can do with GDB: just interrupt a running program and start to poke around with a debugger. This can be handy in some situations, e.g. you're stuck in a loop and want to investigate. And today it suddenly occurred to me: just register a signal handler that sets the trace function! Here the proof of concept code:

import os
import signal
import sys
import time    

def handle_pdb(sig, frame):
    import pdb
    pdb.Pdb().set_trace(frame)    

def loop():
    while True:
        x = 'foo'
        time.sleep(0.2)

if __name__ == '__main__':
    signal.signal(signal.SIGUSR1, handle_pdb)
    print(os.getpid())
    loop()

Now I can send SIGUSR1 to the running application and get a debugger. Lovely!

I imagine you could spice this up by using Winpdb to allow remote debugging in case your application is no longer attached to a terminal. And the other problem the above code has is that it can't seem to resume the program after pdb got invoked, after exiting pdb you just get a traceback and are done (but since this is only bdb raising the bdb.BdbQuit exception I guess this could be solved in a few ways). The last immediate issue is running this on Windows, I don't know much about Windows but I know they don't have signals so I'm not sure how you could do this there.
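
A related sketch, not taken from the blog post: Python runs signal handlers in the main thread, so the pdb session above starts from whatever the main thread was doing. For a suspected deadlock between threads it can be more useful to dump every thread's stack when the signal arrives. The function names below (format_all_stacks, handle_dump, install_dump_handler) are hypothetical, and sys._current_frames() is a CPython implementation detail, so treat this as an illustration rather than a finished tool:

```python
import io
import signal
import sys
import threading
import traceback


def format_all_stacks():
    """Return a text report with the current stack of every running thread."""
    # Map thread ids to names so the report is readable.
    names = {t.ident: t.name for t in threading.enumerate()}
    out = io.StringIO()
    # sys._current_frames() maps each thread id to its topmost frame.
    for ident, frame in sys._current_frames().items():
        out.write(f"\n--- Thread {names.get(ident, '?')} ({ident}) ---\n")
        out.write("".join(traceback.format_stack(frame)))
    return out.getvalue()


def handle_dump(sig, frame):
    # Write to stderr so the dump is visible even if stdout is redirected.
    sys.stderr.write(format_all_stacks())


def install_dump_handler():
    """Register the handler on SIGUSR1; returns False where the signal is missing."""
    # SIGUSR1 only exists on Unix-like systems.
    if hasattr(signal, "SIGUSR1"):
        signal.signal(signal.SIGUSR1, handle_dump)
        return True
    return False
```

Call install_dump_handler() once at startup (from the main thread), print os.getpid(), and send kill -USR1 <pid> when the script hangs. Threads blocked on a lock should show up sitting inside their acquire call.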

Conditional Breakpoints and Loops

If signals aren't available, you may still be able to use PDB: wrap your lock or semaphore acquisitions in a loop that increments a counter, and halt only when the count reaches a ridiculously large number. For example, say you have a lock that you suspect is part of your deadlock:

lock.acquire() # some lock or semaphore from threading or multiprocessing

Rewrite it this way:

count = 0
while not lock.acquire(False): # non-blocking attempt; loops forever if deadlocked
    count += 1
    continue # now set a conditional breakpoint here in PDB that will only trigger when
             # count is a ridiculously large number:
             # pdb> break <filename>:<linenumber>, count >= 9999999999

The breakpoint should trigger when count is very large, (hopefully) indicating that a deadlock has occurred there. If you find that it triggers when the locking objects don't seem to indicate a deadlock, you may need to insert a short time delay in the loop so the counter doesn't increment quite so fast. You may also have to play around with the breakpoint's threshold to get it to trigger at the right time. The number in my example was arbitrary.

Another variant on this would be to not use PDB, and intentionally raise an exception when the counter gets huge, instead of triggering a breakpoint. If you write your own exception class, you can use it to bundle up all of the local semaphore/lock states in the exception, then catch it at the top-level of your script to print out right before exiting.
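
A minimal sketch of that exception variant, assuming plain threading.Lock objects (which expose locked()). The names SuspectedDeadlock and acquire_or_raise, and the default limits, are made up for illustration:

```python
import threading
import time


class SuspectedDeadlock(Exception):
    """Raised when a lock could not be acquired after many attempts (hypothetical class)."""

    def __init__(self, lock_name, attempts, lock_states):
        super().__init__(f"gave up on {lock_name!r} after {attempts} attempts")
        self.lock_name = lock_name
        self.attempts = attempts
        self.lock_states = lock_states  # snapshot: lock name -> locked() result


def acquire_or_raise(lock, lock_name, all_locks, max_attempts=1000, delay=0.01):
    """Try a non-blocking acquire up to max_attempts times, then raise."""
    for _ in range(max_attempts):
        if lock.acquire(False):  # non-blocking attempt
            return
        time.sleep(delay)  # slow the counter down a little
    # Bundle the state of every known lock into the exception.
    states = {name: other.locked() for name, other in all_locks.items()}
    raise SuspectedDeadlock(lock_name, max_attempts, states)
```

At the top level of the script you would catch SuspectedDeadlock and print exc.lock_name and exc.lock_states right before exiting. Note that the exception only propagates in the thread that raised it, so worker threads need their own way to report it.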

File Indicators

A different way you can use your deadlocked loop without relying on getting counters right would be to write to files instead:

import time

while not lock.acquire(False): # Start a loop that will be infinite if deadlocked
    with open('checkpoint_a.txt', 'a') as fo: # open a unique filename
        fo.write("\nHit") # write indicator to file
    time.sleep(3)         # pause (after the file is closed and flushed) so the file size doesn't explode

Now let your program run for a minute or two. Kill the program and go through those "checkpoint" files. If a deadlock is responsible for your stalled program, the files that have the word "Hit" written in them a bunch of times indicate which lock acquisitions are involved.

You can expand the usefulness of this by having the loop write variables or other state information instead of just a constant. For example, you said you suspect the deadlock is happening in a loop but don't know what iteration it's on. Have this lock loop dump your loop's controlling variables or other state information to identify the iteration the deadlock occurred on.
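
Here is one way the state-dumping version might look. The helper name acquire_with_checkpoint, the get_state callback and the max_waits cut-off are all assumptions of this sketch. It also uses acquire(timeout=...) in place of a non-blocking acquire followed by a sleep:

```python
import time


def acquire_with_checkpoint(lock, path, get_state, interval=3.0, max_waits=None):
    """Acquire lock, appending a line to path for every interval spent waiting.

    get_state is a zero-argument callable returning whatever you want logged,
    e.g. the current loop index. Returns True once the lock is held, or False
    if max_waits is set and that many intervals passed without success.
    """
    waits = 0
    # acquire(timeout=...) blocks for up to interval seconds, then returns False.
    while not lock.acquire(timeout=interval):
        waits += 1
        with open(path, "a") as fo:
            stamp = time.strftime("%H:%M:%S")
            fo.write(f"{stamp} still waiting, state={get_state()!r}\n")
        if max_waits is not None and waits >= max_waits:
            return False
    return True
```

For example, inside a loop over i you could call acquire_with_checkpoint(lock, 'checkpoint_a.txt', lambda: {'iteration': i}). A file that keeps growing then tells you both which acquisition is stuck and on which iteration.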


4 Comments

This requires I rewrite my application. But I just searched for this when one of my python program is running and I want to debug it right now without restarting.
@shiplu.mokadd.im Well, yeah... You will need to add some code to use the workarounds I've suggested. But see my edit for some suggestions of third-party tools you may be able to use instead.
already had PyCharm installed, just wanted to change the log-level for the Python logging module.. took a minute or two for the process to show as 'paused', but then I saw some variables in the debugger... opened the code interpreter, pasted, executed, and bam... after re-starting the process my debug logging was showing up!
fail in Linux/Ubuntu
16

There is a clone of pdb, imaginatively called pdb-clone, which can attach to a running process.

You simply add from pdb_clone import pdbhandler; pdbhandler.register() to the code for the main process, and then you can start pdb with pdb-attach --kill --pid PID.

1 Comment

ModuleNotFoundError: No module named 'readline' -> python -m pip install readline -> error: this module is not meant to work on Windows -> python -m pip install pyreadline -> AttributeError: module 'signal' has no attribute 'SIGUSR1'
16

You can use my project madbg. It is a python debugger that allows you to attach to a running python program and debug it in your current terminal. It is similar to pyrasite and pyringe, but supports python3, doesn't require gdb, and uses IPython for the debugger (which means pdb with colors and autocomplete).

For example, to see where your script is stuck, you could run:

madbg attach <pid>

And in the debugger shell, enter: bt

1 Comment

Thanks, it works great. I do have a question: when I set a breakpoint and c for the program to continue, I can't use Ctrl-C to go back to the debugger, instead I just get ^C typed into the terminal (I used kill -INT to solve it). I know it's not a problem with madbg, but do you know how to make the terminal send the Ctrl-C to the process?
9

VSCode supports debugging a locally running python process.

If you don't have a launch.json file, you just need to start debugging (F5) and then you'll see the following options.


Selecting "Attach using Process ID" will add the following to your launch.json

{
  "version": "0.2.0",
  "configurations": [
    {
      "name": "Python: Attach using Process Id",
      "type": "python",
      "request": "attach",
      "processId": "${command:pickProcess}"
    }
  ]
}

Now when you debug using that configuration you can select the local python process you want to debug.


1 Comment

It looks like the process needs to be started with debugpy for this to work. So, not very usable.
7

Use pyrasite:

$ pyrasite 172483 dump_stacks.py

...where 172483 is the PID of the running python process. The python process will then print a stack trace for each thread. One can send arbitrary Python code to be executed or open a shell.

This is great for debugging deadlocks. One can even install pyrasite after the hanging process has been started. But be aware that you should install it in the same environment to make it work.

This is not the only tool available, but for some reason it seems to be really hard to stumble over it accidentally. It is old, but works like a charm for Python 2 and 3.

Like most injectors that rely on Unix headers for native C functions, this tool may not support win32; see e.g. this open issue.

2 Comments

Worked beautifully for me on Python 3.11, even when installed in a different virtual environment as the running process's.
didn't work for me on RPi with python 3.9.2 (process didn't print anything)
5

With the release of CPython 3.14, pdb can attach to a process ID (PID) with:

python -m pdb -p 1234

Then you can enter w for a stack trace, n to continue to the next line, etc.


0

If your only goal is to get the stack trace of a program that's stuck, and you don't have the ability to add code to your program (or the bug is hard to reproduce and you want to investigate now instead of later):

Python 3.14 and newer: See Vinayak's answer: https://stackoverflow.com/a/79737770/1207791

Python 3.13 and older:

Use py-spy.
I've successfully used it to attach to a completely unprepared process that was stuck on Windows, on a machine that had only Python 3.13 and uv installed:

uv run --with py-spy py-spy dump -p $pid
