Invalid format of 'date_done' field in celery.task_results with backend mongodb #8431

@asukero

Checklist

  • I have verified that the issue exists against the main branch of Celery.
  • This has already been asked to the discussions forum first.
  • I have read the relevant section in the
    contribution guide
    on reporting bugs.
  • I have checked the issues list
    for similar or identical bug reports.
  • I have checked the pull requests list
    for existing proposed fixes.
  • I have checked the commit log
    to find out if the bug was already fixed in the main branch.
  • I have included all related issues and possible duplicate issues
    in this issue (If there are none, check this box anyway).

Mandatory Debugging Information

  • I have included the output of celery -A proj report in the issue.
    (if you are not able to do this, then at least specify the Celery
    version affected).
  • I have verified that the issue exists against the main branch of Celery.
  • I have included the contents of pip freeze in the issue.
  • I have included all the versions of all the external dependencies required
    to reproduce this bug.

Environment & Settings

Celery version: 5.3.1

celery -A test_celery_result report Output:

software -> celery:5.3.1 (emerald-rush) kombu:5.3.1 py:3.8.16
            billiard:4.1.0 redis:4.6.0
platform -> system:Linux arch:64bit, ELF
            kernel version:5.19.0-46-generic imp:CPython
loader   -> celery.loaders.app.AppLoader
settings -> transport:redis results:mongodb
[...]
CELERY_BROKER_TRANSPORT_OPTIONS: {
 'socket_keepalive': True, 'socket_keepalive_options': {4: 600, 5: 60, 6: 5}}
CELERY_BROKER_URL: 'redis://redis.local:7000/0'
CELERY_INCLUDE: ['test_celery_result.tasks']
CELERY_QUEUE_NAME: 'test_celery_result'
CELERY_REDIS: {
 'host': 'redis.local', 'port': 7000}
[...]
is_overridden: <bound method Settings.is_overridden of <Settings "test_celery_result.settings">>
deprecated_settings: None
task_default_queue: 'test_celery_result'
enable_utc: False
result_backend: 'mongodb'
result_expires: datetime.timedelta(seconds=15)
mongodb_backend_settings: {
    'database': '********',
    'host': ['mongo-replica'],
    'port': 27017,
    'taskmeta_collection': 'celery_task_result'}
beat_schedule: {
    'celery.backend_cleanup': {   'schedule': 60,
                                  'task': 'celery.backend_cleanup'},
    'dummy_task': {'schedule': 15, 'task': 'dummy_task'}}

Steps to Reproduce

Required Dependencies

  • Minimal Python Version: 3.6 or higher
  • Minimal Celery Version: 4.3.0 or higher
  • Minimal Kombu Version: Unknown
  • Minimal Broker Version: Unknown
  • Minimal Result Backend Version: Mongo 4.4 or higher
  • Minimal OS and/or Kernel Version: Unknown
  • Minimal Broker Client Version: Unknown
  • Minimal Result Backend Client Version: pymongo 3.14 or higher

Python Packages

pip freeze Output:

amqp==5.1.1
asgiref==3.7.2
async-timeout==4.0.2
backports.zoneinfo==0.2.1
billiard==4.1.0
bleach==6.0.0
celery==5.3.1
certifi==2023.7.22
cffi==1.15.1
charset-normalizer==3.2.0
click==8.1.6
click-didyoumean==0.3.0
click-plugins==1.1.1
click-repl==0.3.0
cryptography==41.0.3
Django==3.2.20
django-cors-headers==4.2.0
django-environ==0.10.0
django-formset-js==0.5.0
django-jquery-js==3.1.1
django-redis-sessions==0.6.2
django-test-addons-adv==1.1.1
dnspython==2.4.1
idna==3.4
Jinja2==3.1.2
kombu==5.3.1
packaging==23.1
prompt-toolkit==3.0.39
pyasn1==0.5.0
pycparser==2.21
pyhcl==0.4.4
pymongo==4.4.1
python-dateutil==2.8.2
pytz==2022.1
PyYAML==6.0.1
redis==4.6.0
requests==2.31.0
sentinels==1.0.0
single-beat==0.6.3
six==1.16.0
sqlparse==0.4.4
types-PyYAML==6.0.12.11
typing_extensions==4.7.1
tzdata==2023.3
urllib3==1.26.16
vine==5.0.0
wcwidth==0.2.6
webencodings==0.5.1

Minimally Reproducible Test Case

Details

  1. Set up a Celery project with MongoDB as the result backend.
  2. Set
     app.conf.result_expires = timedelta(seconds=60)
  3. Set up a scheduled task:
     app.conf.beat_schedule = {
         "dummy_task": {
             "task": "dummy_task",
             "schedule": 15
         },
     }
  4. Start the Celery worker and Celery beat.
  5. Open a shell on the MongoDB backend and observe that db.task_result.count() never resets to 0.

Expected Behavior

The task_result collection in the MongoDB database should be cleaned every 60 seconds, according to the result_expires configuration.

Actual Behavior

There is an issue with the format of the date_done field in the task_result collection. Task result metadata is built by the _get_result_meta method in base.py, whose format_date argument defaults to True. As a result, date_done is converted from a datetime object to a str and is inserted into MongoDB as a string.

Consequently, when the cleanup() method is called on MongoBackend, it compares the date_done field against a datetime object derived from self.app.now(), and the filter never matches:

self.collection.delete_many(
    {'date_done': {'$lt': self.app.now() - self.expires_delta}},
)
# self.app.now() returns a datetime object, while date_done is stored as a string

> db.task_result.findOne()
{
        "_id" : "f16bd459-b858-4ae8-afb5-1ceab0e50326",
        "status" : "SUCCESS",
        "result" : "\"SUCCESS\"",
        "traceback" : null,
        "children" : [ ],
        "date_done" : "2023-08-08T09:03:34.974924" // should be ISODate("2023-08-08T09:03:34.9749Z")
}
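
The mismatch can be illustrated without a MongoDB server. The sketch below is a pure-Python simulation, assuming that `$lt` only matches stored values of the same BSON type as the query operand (MongoDB's type bracketing). `matches_lt` is a hypothetical helper written for this illustration; it is not part of pymongo or Celery.

```python
from datetime import datetime, timedelta


def matches_lt(stored_value, operand):
    """Hypothetical stand-in for a `$lt` filter with type bracketing:
    a stored value whose type differs from the operand never matches."""
    if type(stored_value) is not type(operand):
        return False
    return stored_value < operand


# Stand-ins for self.app.now() and self.expires_delta (assumed values).
now = datetime(2023, 8, 8, 9, 10, 0)
cutoff = now - timedelta(seconds=60)

done = datetime(2023, 8, 8, 9, 3, 34, 974924)
as_string = done.isoformat()  # how date_done is currently stored
as_datetime = done            # how date_done would need to be stored

string_doc_expired = matches_lt(as_string, cutoff)
datetime_doc_expired = matches_lt(as_datetime, cutoff)

print(string_doc_expired, datetime_doc_expired)
```

Under that assumption, the string-typed document is never selected for deletion even though it is well past the cutoff, while the datetime-typed one is.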

A simple fix would be to pass format_date=False when calling self._get_result_meta in MongoBackend._store_result.
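
To show what the flag changes, here is a minimal sketch. `build_result_meta` is a simplified, hypothetical stand-in for `_get_result_meta`, reduced to the date handling described above; it is not Celery's actual implementation.

```python
from datetime import datetime


def build_result_meta(result, state, date_done, format_date=True):
    """Simplified, hypothetical stand-in for the backend's _get_result_meta."""
    if format_date and isinstance(date_done, datetime):
        # Default behaviour: the datetime is serialized to an ISO 8601 string.
        date_done = date_done.isoformat()
    return {"status": state, "result": result, "date_done": date_done}


finished = datetime(2023, 8, 8, 9, 3, 34, 974924)

# Default call: date_done ends up as a str.
default_meta = build_result_meta('"SUCCESS"', "SUCCESS", finished)

# Proposed call: date_done stays a datetime.
fixed_meta = build_result_meta('"SUCCESS"', "SUCCESS", finished, format_date=False)

print(type(default_meta["date_done"]).__name__)
print(type(fixed_meta["date_done"]).__name__)
```

With the datetime preserved, pymongo would encode the field as a BSON date, which is the type the cleanup() filter compares against.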
