Skip to content

Arithmetic Overflow Error with MSSQL Backend due to Integer 'id' Field Limitation #8634

@dxdc

Description

@dxdc

Checklist

  • I have verified that the issue exists against the main branch of Celery.
  • This has already been asked to the discussions forum first.
  • I have read the relevant section in the
    contribution guide
    on reporting bugs.
  • I have checked the issues list
    for similar or identical bug reports.
  • I have checked the pull requests list
    for existing proposed fixes.
  • I have checked the commit log
    to find out if the bug was already fixed in the main branch.
  • I have included all related issues and possible duplicate issues
    in this issue (If there are none, check this box anyway).

Mandatory Debugging Information

  • I have included the output of celery -A proj report in the issue.
    (if you are not able to do this, then at least specify the Celery
    version affected).
  • I have verified that the issue exists against the main branch of Celery.
  • I have included the contents of pip freeze in the issue.
  • I have included all the versions of all the external dependencies required
    to reproduce this bug.

Optional Debugging Information

  • I have tried reproducing the issue on more than one Python version
    and/or implementation.
  • I have tried reproducing the issue on more than one message broker and/or
    result backend.
  • I have tried reproducing the issue on more than one version of the message
    broker and/or result backend.
  • I have tried reproducing the issue on more than one operating system.
  • I have tried reproducing the issue on more than one workers pool.
  • I have tried reproducing the issue with autoscaling, retries,
    ETA/Countdown & rate limits disabled.
  • I have tried reproducing the issue after downgrading
    and/or upgrading Celery and its dependencies.

Related Issues and Possible Duplicates

Related Issues

Possible Duplicates

  • None

Environment & Settings

Celery version:

celery report Output:

software -> celery:5.3.5 (emerald-rush) kombu:5.3.3 py:3.11.6
            billiard:4.2.0 redis:5.0.1
platform -> system:Linux arch:64bit
            kernel version:6.4.16-linuxkit imp:CPython
loader   -> celery.loaders.app.AppLoader
settings -> transport:redis results:db+mssql+pyodbc://sa:**@mssql:1433/docker

broker_url: 'redis://redis:6379/0'
result_backend: 'db+mssql+pyodbc://sa:********@mssql:1433/docker'
deprecated_settings: None
result_extended: True
result_expires: None
result_backend_always_retry: True
result_backend_thread_safe: True
database_table_names: 
 'group': 'results_groupmeta', 'task': 'results_taskmeta'}

Steps to Reproduce

  • Use MSSQL Server as database backend

Required Dependencies

  • Minimal Python Version: N/A or Unknown
  • Minimal Celery Version: N/A or Unknown
  • Minimal Kombu Version: N/A or Unknown
  • Minimal Broker Version: N/A or Unknown
  • Minimal Result Backend Version: N/A or Unknown
  • Minimal OS and/or Kernel Version: N/A or Unknown
  • Minimal Broker Client Version: N/A or Unknown
  • Minimal Result Backend Client Version: N/A or Unknown

Python Packages

pip freeze Output:

SQLAlchemy==2.0.23
kombu==5.3.3
celery==5.3.5

Other Dependencies

Details

N/A

Minimally Reproducible Test Case

Details

celery = Celery(
    "tasks",
    broker=broker_url,
    result_backend=backend_url, # this must be type mssql server
)

Expected Behavior

Celery crashes when creating the schema for the results backend table when using sqlalchemy. The problem appears to be a mismatch in type between sa.Sequence and sa.Integer for the id column on Task and TaskSet.

You can see a mismatch between the id column type (int) and the task_id_sequence (bigint).

This simple monkey patch solves the issue, and while this could be a bug in sqlalchemy, I suspect it could be an issue with the schema definition itself so I wanted to ask here first. I've linked (above) what appears to be a related bug in SQLalchemy that was recently fixed.

from celery.backends.database import models
from sqlalchemy import BigInteger

# Monkey Patch for MSSQL Compatibility in Celery's Result Backend
# ---------------------------------------------------------------
# This code addresses a compatibility issue when using Microsoft SQL Server (MSSQL)
# as the backend for Celery's task results. In MSSQL, the 'id' field in the results
# tables can exceed the maximum value for an Integer data type, leading to overflow errors.
# To prevent this, the data type of the 'id' column in both the Task and TaskSet models
# is changed from Integer to BigInteger. This alteration ensures that the 'id' field can
# accommodate larger values, aligning with MSSQL's capacity for handling big integers.
# This patch modifies the column type directly in the SQLAlchemy table definition
# for Celery's backend models before any tables are created in the database.
# It is a simple yet effective solution to avoid overflow issues in MSSQL
# while using Celery's database backend for storing task results.

models.Task.__table__.c.id.type = BigInteger()
models.TaskSet.__table__.c.id.type = BigInteger()

Actual Behavior

sqlalchemy.exc.DataError: (pyodbc.DataError) ('22003', '[22003] [Microsoft][ODBC Driver 18 for SQL Server][SQL Server]Arithmetic overflow error converting expression to data type int. (8115) (SQLExecDirectW)')
[SQL: INSERT INTO results_taskmeta (id, task_id, status, result, date_done, traceback, name, args, kwargs, worker, retries, queue) OUTPUT inserted.id VALUES (NEXT VALUE FOR task_id_sequence, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)]
[parameters: ('c8c7432b-255f-49c4-b546-c7954d935e40', 'PENDING', <pyodbc.NullParam object at 0x7fea2ec0e610>, datetime.datetime(2023, 11, 11, 22, 32, 27, 680391), None, None, <pyodbc.NullParam object at 0x7fea2ec0e610>, <pyodbc.NullParam object at 0x7fea2ec0e610>, None, None, None)]

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions