A Django 4.2 on Python 3.10 application misbehaves sometimes, in a place that should be stateless. If I access the database (postgresql 14 via pcygopg 3.1.18) while working with pypdf 3.17.4, the output document is broken.. in roughly 1 out of 8 attempts. How can I determine the cause of the erratic behavior?
The resulting PDF has missing content, so comparing output size is sufficient to determine the bug was triggered:
# call as: python3 manage.py minrepro input.pdf
import argparse, io
from django.core.management.base import BaseCommand
from pypdf import PdfReader, PdfWriter
from djangoapp.models import DjangoModel
def main(fin):
pdfout = PdfWriter()
pageout = pdfout.add_blank_page(width=200, height=200)
for i in range(8):
# Note: accessing the database *during* PDF merging is relevant!
# without the next line, the problem cannot be reproduced
for c in range(31): a = DjangoModel.objects.first()
fin.seek(0)
for pagein in PdfReader(fin, strict=True).pages:
pageout.merge_page(pagein)
with io.BytesIO() as fout:
pdfout.write(fout)
return fout.tell()
class Command(BaseCommand):
def add_arguments(self, parser):
parser.add_argument(dest="pdf", type=argparse.FileType("rb") )
def handle(self, *args, **options):
for i in range(30):
if i == 0: first_size = main(options["pdf"])
current_size = main(options["pdf"])
if not first_size == current_size:
print(f"presumed stateless call was not {i=}, {first_size=} != {current_size=}")
- My fist theory was pypdf failing to generate truly unique names for (post-merge) duplicate identifiers, and the DB access merely giving the opportunity (delay) for that to become noticeable. I looked for any
timeorrandomimports in the relevant libraries, no obvious bugs there. Also, nothing liketime.sleep(0.02)could not make my reproducer work, only database access. - This is CPython with insert-ordered dicts, most code should behave the same on identical input. Also called python with
-W module::ResourceWarning -W module::DeprecationWarning -W module::PendingDeprecationWarningto be more likely notified of any mistakes on my part, nothing. - Have reproduced the behavior in a single-threaded management command, it is therefore not related to the any http/web server issue.
time.sleep(0.002)?