Description
Checklist
- I have checked the issues list for similar or identical bug reports.
- I have checked the pull requests list for existing proposed fixes.
- I have checked the commit log to find out if the bug was already fixed in the master branch.
- I have included all related issues and possible duplicate issues in this issue (If there are none, check this box anyway).
Related Issues and Possible Duplicates
Related Issues
- PR: Use DynamoDB TTL to auto-expire old task results #5275 (status is open, but feature is already present since Celery v4.4.0)
Possible Duplicates
- None
Description
First off, the documentation on using DynamoDB as a results backend is not wrong. However, it is incomplete in a significant way. Because I manage the DynamoDB table through IaC, I did not realize that defining a sort key would break compatibility with Celery.
A DynamoDB table with a sort key cannot be used with Celery v4.4 versions >= v4.4.3. The reason is commit c711047, which introduces a query for the status of an existing Celery result. When a sort key is defined on the table, the primary key is no longer just the partition key ('id') but the combination of the partition and sort keys ('id' and 'timestamp' in my case), so a lookup that supplies only 'id' no longer matches the table's key schema.
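To illustrate the mismatch, here is a minimal sketch in plain Python. It is not Celery code and it does not call AWS. It only models the DynamoDB rule that a single-item lookup must supply every attribute of the table's primary key. The helper `missing_key_attributes` and the task id value are hypothetical, and the 'timestamp' sort key mirrors my table rather than anything Celery requires.

```python
def missing_key_attributes(key_schema, key):
    """Return the primary key attributes that the lookup key does not supply."""
    return [
        element["AttributeName"]
        for element in key_schema
        if element["AttributeName"] not in key
    ]


# Key schema with a partition key only.
partition_only = [{"AttributeName": "id", "KeyType": "HASH"}]

# Key schema like my table: partition key plus a sort key.
with_sort_key = partition_only + [{"AttributeName": "timestamp", "KeyType": "RANGE"}]

# A lookup that only knows the task id (hypothetical value).
lookup_key = {"id": {"S": "some-task-id"}}

print(missing_key_attributes(partition_only, lookup_key))
print(missing_key_attributes(with_sort_key, lookup_key))
```

With the partition-only schema nothing is missing. With the sort key in place, 'timestamp' is reported as missing, which is the value the caller would also have to know.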
Suggestions
It would be fun to add sort key support to the DynamoDB results backend, but I fear it will not be possible. For example, in my case I will need to know the value of the 'timestamp' in addition to the 'id' of the result item we need to retrieve. I don't believe Celery has knowledge of the 'timestamp' we are looking for at the time the query from commit c711047 is performed.
So my suggestion is simple: let's add a line to the documentation on using DynamoDB as a results backend that warns users that the table cannot have a sort key. Luckily, you can still add a global secondary index (GSI) if you want to perform queries based on something other than the partition key. A GSI does not change the table's primary key schema, so it will not break the queries from Celery.
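As a rough sketch of the suggested setup, the snippet below builds the keyword arguments for boto3's `create_table` call: 'id' is the only primary key attribute, and a GSI covers the extra query pattern. The function name `celery_table_definition`, the table name, the index name, and the choice of 'timestamp' as the index key are all assumptions for illustration; pick the index key that suits your own queries.

```python
def celery_table_definition(table_name="celery", index_attribute="timestamp",
                            index_type="N"):
    """Build create_table arguments: partition key 'id' only, plus one GSI."""
    return {
        "TableName": table_name,
        # Only attributes used in a key schema are declared here.
        "AttributeDefinitions": [
            {"AttributeName": "id", "AttributeType": "S"},
            {"AttributeName": index_attribute, "AttributeType": index_type},
        ],
        # No RANGE element: the primary key stays the partition key alone.
        "KeySchema": [{"AttributeName": "id", "KeyType": "HASH"}],
        "GlobalSecondaryIndexes": [
            {
                "IndexName": f"{index_attribute}-index",  # hypothetical name
                "KeySchema": [
                    {"AttributeName": index_attribute, "KeyType": "HASH"}
                ],
                "Projection": {"ProjectionType": "ALL"},
            }
        ],
        # On-demand billing, so no provisioned throughput values are needed.
        "BillingMode": "PAY_PER_REQUEST",
    }


definition = celery_table_definition()
# To create the table (requires AWS credentials and the boto3 package):
#   import boto3
#   boto3.client("dynamodb").create_table(**definition)
```

The same shape can be expressed in whichever IaC tool manages the table; the important part is that the table's own key schema contains no sort key.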