Closed
Description
Checklist
- I have verified that the issue exists against the main branch of Celery.
- This has already been asked to the discussions forum first.
- I have read the relevant section in the contribution guide on reporting bugs.
- I have checked the issues list for similar or identical bug reports.
- I have checked the pull requests list for existing proposed fixes.
- I have checked the commit log to find out if the bug was already fixed in the main branch.
- I have included all related issues and possible duplicate issues in this issue (If there are none, check this box anyway).
- I have tried to reproduce the issue with pytest-celery and added the reproduction script below.
Mandatory Debugging Information
- I have included the output of celery -A proj report in the issue.
- I have verified that the issue exists against the main branch of Celery.
- I have included the contents of pip freeze in the issue.
- I have included all the versions of all the external dependencies required to reproduce this bug.
Optional Debugging Information
- I have tried reproducing the issue on more than one Python version and/or implementation.
- I have tried reproducing the issue on more than one message broker and/or result backend.
- I have tried reproducing the issue on more than one version of the message broker and/or result backend.
- I have tried reproducing the issue on more than one operating system.
- I have tried reproducing the issue on more than one workers pool.
- I have tried reproducing the issue with autoscaling, retries, ETA/Countdown & rate limits disabled.
- I have tried reproducing the issue after downgrading and/or upgrading Celery and its dependencies.
Related Issues and Possible Duplicates
N/A
Environment & Settings
software -> celery:5.5.0rc4 (immunity) kombu:5.5.0rc2 py:3.9.20
billiard:4.2.1 py-amqp:5.3.1
platform -> system:Linux arch:64bit
kernel version:6.10.14-linuxkit imp:CPython
loader -> celery.loaders.app.AppLoader
settings -> transport:amqp results:django-db
Steps to Reproduce
Required Dependencies
- Minimal Python Version: 3.9.20
- Minimal Celery Version: 5.5.0rc4
- Minimal Kombu Version: 5.5.0rc2
- Minimal Broker Version: RabbitMQ 3.13.7
- Minimal Result Backend Version: PG 17.0
- Minimal OS and/or Kernel Version: Debian Bookworm
- Minimal Broker Client Version: py-amqp 5.3.1
- Minimal Result Backend Client Version: django-celery-results 2.5.
Python Packages
celery==5.5.0rc4
django==3.2.25
django-celery-results==2.5.1
Other Dependencies
N/A
Minimally Reproducible Test Case
- Start celery with REMAP_SIGTERM=SIGQUIT
- Start a long running task
- Stop celery with kill -s TERM
- Observe the task completing but celery not exiting
- Verify that status is FAILURE even though it should be SUCCESS
Example in this repo:
https://github.com/daveisfera/celery_cold_shutdown
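For readers who do not want to open the repo, the timeouts implied by the logs in this report could be expressed roughly as the Django settings fragment below. This is only a sketch: it assumes the project loads Celery settings with the CELERY namespace, and the two values are inferred from the log messages ("Hard time limit (15s)" and "terminating in 16 seconds"), not copied from the linked repo.

```python
# Hypothetical Django settings fragment (assumes namespace="CELERY" is used
# when the Celery app loads its configuration from Django settings).

# Hard time limit per task, in seconds. Inferred from the log line
# "Hard time limit (15s) exceeded".
CELERY_TASK_TIME_LIMIT = 15

# Soft shutdown window, in seconds. Inferred from the log line
# "Initiating Soft Shutdown, terminating in 16 seconds".
CELERY_WORKER_SOFT_SHUTDOWN_TIMEOUT = 16
```

With values like these, the soft shutdown window is longer than the hard time limit, and the long running task in the logs needs only about 8 seconds to finish.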
Logs:
[2025-01-21 02:29:20,211: WARNING/ForkPoolWorker-1] Value: 8
worker: Hitting Ctrl+C again will terminate all running tasks!
[2025-01-21 02:29:21,132: WARNING/MainProcess] Initiating Soft Shutdown, terminating in 16 seconds
[2025-01-21 02:29:28,220: WARNING/ForkPoolWorker-1] Done: 8
[2025-01-21 02:29:28,228: INFO/ForkPoolWorker-1] Task mysite.celery.long_task[97636c13-9c46-4880-8f31-76b277a37bf5] succeeded in 8.017255892998946s: None
[2025-01-21 02:29:37,167: ERROR/MainProcess] Task handler raised error: TimeLimitExceeded(15)
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/billiard/pool.py", line 684, in on_hard_timeout
    raise TimeLimitExceeded(job._timeout)
billiard.einfo.ExceptionWithTraceback:
"""
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/billiard/pool.py", line 684, in on_hard_timeout
    raise TimeLimitExceeded(job._timeout)
billiard.exceptions.TimeLimitExceeded: TimeLimitExceeded(15,)
"""
[2025-01-21 02:29:37,167: ERROR/MainProcess] Hard time limit (15s) exceeded for mysite.celery.long_task[97636c13-9c46-4880-8f31-76b277a37bf5]
worker: Cold shutdown (MainProcess)
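To make the timing in the logs above easier to read, the following sketch computes the gaps between the key events. The timestamps are copied from the log lines; the event names and the helper function are illustrative only, and treating the "Value: 8" line as the approximate task start is an assumption.

```python
from datetime import datetime

# Timestamp format used by the worker log lines above.
LOG_TIME_FORMAT = "%Y-%m-%d %H:%M:%S,%f"


def parse_log_time(text: str) -> datetime:
    """Parse a timestamp such as '2025-01-21 02:29:20,211'."""
    return datetime.strptime(text, LOG_TIME_FORMAT)


# Event names are hypothetical labels; the timestamps come from the logs.
task_first_log = parse_log_time("2025-01-21 02:29:20,211")    # "Value: 8"
soft_shutdown = parse_log_time("2025-01-21 02:29:21,132")     # "Initiating Soft Shutdown"
task_succeeded = parse_log_time("2025-01-21 02:29:28,228")    # "succeeded in 8.017...s"
hard_limit_error = parse_log_time("2025-01-21 02:29:37,167")  # "TimeLimitExceeded(15)"

# Seconds between each earlier event and the hard time limit error.
since_first_log = (hard_limit_error - task_first_log).total_seconds()
since_shutdown = (hard_limit_error - soft_shutdown).total_seconds()
since_success = (hard_limit_error - task_succeeded).total_seconds()

print(f"first task log -> hard limit error: {since_first_log:.3f}s")
print(f"soft shutdown  -> hard limit error: {since_shutdown:.3f}s")
print(f"task succeeded -> hard limit error: {since_success:.3f}s")
```

Read this way, the hard time limit error is logged roughly 16 seconds after the soft shutdown started and about 9 seconds after the task had already reported success.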
NOTE: I'm not familiar enough with the testing setup to add this as a test, but it's pretty straightforward to reproduce, and the example repo linked above demonstrates it.
Expected Behavior
Tasks that complete while a cold shutdown is happening should be reported as SUCCESS.
Actual Behavior
Tasks that complete while a cold shutdown is happening time out and are reported as FAILURE.