Description
When using the Django-Celery fixup to run background tasks for a Django web service, the background tasks do not respect Django's settings for PostgreSQL (and possibly other) database connections: every task always creates a new connection regardless of the Django settings. Although it is possible to bypass this with the environment variable CELERY_DB_REUSE_MAX, it would be preferable for the fixup to follow the settings given in Django.
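For reference, the behavior hinges on Django's CONN_MAX_AGE connection setting; a minimal sketch of the relevant configuration (values here are illustrative, not from our app):

# settings.py (illustrative values)
DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.postgresql',
        'NAME': 'mydb',
        # 0 closes the connection after every request/task,
        # None keeps connections open indefinitely (persistent),
        # any other value is a maximum age in seconds.
        'CONN_MAX_AGE': None,
    },
}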
Checklist
- I have included the output of celery -A proj report in the issue (if you are not able to do this, then at least specify the Celery version affected): Celery 4.0.2 with potentially all versions of Django (tested on 1.10.3 and 1.11.2).
- I have verified that the issue exists against the master branch of Celery.
This line causes this "issue":
https://github.com/celery/celery/blob/master/celery/fixups/django.py#L186
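For context, the fixup's post-task cleanup looks roughly like this (paraphrased from the 4.x source from memory; the exact code may differ between revisions):

# Paraphrase of celery/fixups/django.py (DjangoWorkerFixup), not verbatim.
def close_database(self, **kwargs):
    if not self.db_reuse_max:
        # No CELERY_DB_REUSE_MAX set: tear down after every task.
        return self._close_database()
    if self._db_recycles >= self.db_reuse_max * 2:
        self._db_recycles = 0
        self._close_database()
    self._db_recycles += 1

def _close_database(self):
    # Closes every connection unconditionally, ignoring CONN_MAX_AGE.
    for conn in self._db.connections.all():
        conn.close()

Either way, _close_database never consults CONN_MAX_AGE, which is why the Django settings are ignored.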
Steps to reproduce
Note that these steps require some monitoring service; we use New Relic.
Note also that we run the app in question on Heroku.
- Have a web facing process with Django that connects to your PostgreSQL database for ORM purposes
- Have a worker process that also connects to the PostgreSQL for ORM purposes
- Have the DATABASES['default']['CONN_MAX_AGE'] setting set to anything that isn't 0 (easiest to see with None for persistent connections)
- Make multiple requests to the web portion of Django to cause some ORM activity (easiest to see if it happens on every request)
- Get multiple tasks to execute on the worker that will cause some ORM activity (easiest to see if it happens on every task)
- Use your monitoring service (New Relic in our case) to view a breakdown of all of the requests and worker activity. In New Relic you can check this using the transaction tracing; select the endpoint/task that made the db queries and check the breakdown.
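If a monitoring service isn't available, a rough alternative is to log the PostgreSQL backend PID from inside a task; a different PID on every run means a fresh connection each time. A minimal sketch (the task and module names are hypothetical):

# tasks.py (hypothetical helper task for observing connection reuse)
import logging

from celery import shared_task
from django.db import connection

logger = logging.getLogger(__name__)

@shared_task
def log_backend_pid():
    with connection.cursor() as cursor:
        # pg_backend_pid() identifies the server-side session, so a
        # changing PID across task runs indicates a new connection.
        cursor.execute('SELECT pg_backend_pid()')
        logger.info('backend pid: %s', cursor.fetchone()[0])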
Expected behavior
psycopg2:connect would occur rarely with an average calls per transaction <<< 1
Actual behavior
psycopg2:connect occurs very rarely with an average calls per transaction of <<< 1 for the web processes.
psycopg2:connect occurs every time with an average calls per transaction of 1 for the worker processes.
Potential Resolution
With my limited knowledge of Celery's inner workings, this looks like a fairly simple fix that I could submit as a PR myself, but I wanted some input before spending the time to set that all up.
This fix seems to work when monkey-patched into the DjangoWorkerFixup class.
import celery.fixups.django


def _close_database(self):
    try:
        # Use Django's built-in method of closing old connections.
        # This ensures that the database settings (e.g. CONN_MAX_AGE)
        # are respected.
        self._db.close_old_connections()
    except AttributeError:
        # Legacy behaviour if close_old_connections is unavailable
        # for whatever reason: close every connection ourselves.
        for conn in self._db.connections.all():
            try:
                conn.close()
            except self.interface_errors:
                pass
            except self.DatabaseError as exc:
                str_exc = str(exc)
                if 'closed' not in str_exc and 'not connected' not in str_exc:
                    raise


celery.fixups.django.DjangoWorkerFixup._close_database = _close_database
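For context, django.db.close_old_connections() (available since Django 1.6) only closes connections that are unusable or have outlived their CONN_MAX_AGE, which is why the patch preserves persistent connections between tasks. Roughly, it does the following (paraphrased; see Django's source for the exact implementation):

# Paraphrase of django.db.close_old_connections, not verbatim.
from django.db import connections

def close_old_connections(**kwargs):
    for conn in connections.all():
        # Close only if the connection errored out or its
        # CONN_MAX_AGE has expired.
        conn.close_if_unusable_or_obsolete()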