Skip to content

Use critical section to protect pgconn ptr#1233

Merged
dvarrazzo merged 4 commits intopsycopg:masterfrom
lysnikolaou:pgconn-critical-section
Jan 31, 2026
Merged

Use critical section to protect pgconn ptr#1233
dvarrazzo merged 4 commits intopsycopg:masterfrom
lysnikolaou:pgconn-critical-section

Conversation

@lysnikolaou
Copy link
Copy Markdown
Contributor

This PR adds critical sections to the PGconn class so that race condition between calling into the libpq bindings and calling close which sets the _pgconn_ptr to NULL are eliminated.

Also add one high-level and one low-level tests that trigger TSAN warnings without this PR.

@lysnikolaou lysnikolaou force-pushed the pgconn-critical-section branch 2 times, most recently from d15da3c to 71c9fd7 Compare December 8, 2025 15:02
Copy link
Copy Markdown

@ngoldbaum ngoldbaum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caught a couple minor issues. I was looking at the HTML annotated report from setting annotate=True in the cythonize call and making sure all the critical sections have no Python overhead:

diff --git a/psycopg_c/build_backend/psycopg_build_ext.py b/psycopg_c/build_backend/psycopg_build_ext.py
index b4293fe7..8d137304 100644
--- a/psycopg_c/build_backend/psycopg_build_ext.py
+++ b/psycopg_c/build_backend/psycopg_build_ext.py
@@ -63,5 +63,5 @@ class psycopg_build_ext(build_ext):
                     "always_allow_keywords": False,
                     "freethreading_compatible": True,
                 },
-                annotate=False,  # enable to get an html view of the C module
+                annotate=True,  # enable to get an html view of the C module
             )

My other big comment is that taking a critical section and then immediately doing with nogil is possibly an issue. Are the per-object locks getting released when the thread detaches from the interpreter? The critical section docs talk about "possibly blocking calls" and I'm not sure if that matters for the thread state APIs in the free-threaded build.

If the critical sections are getting released immediately, there's no reason to add them here and they'll give the reader a false sense of security. It may be the case that new locking is needed to make these APIs safe. Also given that the GIL is being released, if these APIs aren't safe to use from multiple threads, then it may be possible to cause data races on the GIL-enabled build right now.

Comment thread psycopg_c/psycopg_c/pq/pgconn.pyx Outdated
Comment thread psycopg_c/psycopg_c/pq/pgconn.pyx Outdated
Comment thread psycopg_c/psycopg_c/pq/pgconn.pyx Outdated
Copy link
Copy Markdown
Member

@dvarrazzo dvarrazzo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The changes look ok, not invasive and they do the right thing w.r.t. managing the connection pointer.

Just asking to check if it's possible to use a couple of more modern idioms, for the rest the code seems good to go.

Comment thread psycopg_c/psycopg_c/pq/pgconn.pyx Outdated
Comment thread psycopg_c/psycopg_c/pq/pgconn.pyx Outdated
Comment thread psycopg_c/psycopg_c/pq/pgconn.pyx
@dvarrazzo
Copy link
Copy Markdown
Member

You might want to rebase this MR on master to run the CI successfully (macos 13 runners were dropped).

@lysnikolaou
Copy link
Copy Markdown
Contributor Author

Are the per-object locks getting released when the thread detaches from the interpreter?

Yes, the Cython docs also mention that and suggest not doing with nogil inside critical sections. I think there's two possibilities here:

  1. Remove the nogil block and let the gil-enabled build run these with the GIL held.
  2. Document that calling PGconn.finish while there's queries currently executing is unsupported. @dvarrazzo will have a better idea if there's ever a usecase that needs this.

@ngoldbaum
Copy link
Copy Markdown

Document that calling PGconn.finish while there's queries currently executing is unsupported. @dvarrazzo will have a better idea if there's ever a usecase that needs this.

Not just finish - all the spots where you're using with.cython.critical_section(...), nogil or spots where you're detacting from the runtime inside critical sections will release the locks in the critical section.

If it really isn't safe to call into the libpq API without any external synchronization, then releasing the GIL there is incorrect without some other synchronization in the extension. So the fix would be to add a per-connection lock (for example) and acquire that.

If it's safe to call into libpq with the GIL released in the sections where it's released right now, then I don't think adding critical sections will help. It adds overhead and doesn't add any safety plus it makes the code more complicated.

If the desire is only to prevent new issues that aren't possible on the GIL-enabled build, then I think also the correct thing is to remove the new critical sections you've added here that do with nogil inside.

This PR adds critical sections to the PGconn class so that
race condition between calling into the libpq bindings and
calling `close` which sets the `_pgconn_ptr` to NULL are
eliminated.

Also add one high-level and one low-level tests that trigger TSAN
warnings without this PR.
@lysnikolaou
Copy link
Copy Markdown
Contributor Author

@dvarrazzo Friendly ping on this one.

@dvarrazzo dvarrazzo force-pushed the pgconn-critical-section branch from 5f37ec8 to d7dc6c7 Compare January 31, 2026 00:08
@dvarrazzo dvarrazzo merged commit f5d90fa into psycopg:master Jan 31, 2026
49 checks passed
@dvarrazzo
Copy link
Copy Markdown
Member

Hello @lysnikolaou! I sincerely apologise for the delay, it had very busy weeks.

Merged now, thank you very much!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants