Skip to content
This repository was archived by the owner on Apr 26, 2024. It is now read-only.

synapse explodes with a stack overflow and logging errors #4240

@richvdh

Description

@richvdh

We've had a bunch of reports that synapse goes into a meltdown with logging errors. For example (from #4086):

    2018-10-25 15:58:33,973 - twisted - 243 - ERROR - POST-299240- Traceback (most recent call last): 
    2018-10-25 15:58:33,973 - twisted - 243 - ERROR - POST-299240-   File "/usr/lib/python2.7/logging/handlers.py", line 76, in emit
    2018-10-25 15:58:33,974 - twisted - 243 - ERROR - POST-299240-     if self.shouldRollover(record):
    2018-10-25 15:58:33,974 - twisted - 243 - ERROR - POST-299240-   File "/usr/lib/python2.7/logging/handlers.py", line 156, in shouldRollover
    2018-10-25 15:58:33,974 - twisted - 243 - ERROR - POST-299240-     msg = "%s\n" % self.format(record)
    2018-10-25 15:58:33,974 - twisted - 243 - ERROR - POST-299240-   File "/usr/lib/python2.7/logging/__init__.py", line 734, in format 
    2018-10-25 15:58:33,975 - twisted - 243 - ERROR - POST-299240-     return fmt.format(record)
    2018-10-25 15:58:33,975 - twisted - 243 - ERROR - POST-299240-   File "/usr/lib/python2.7/logging/__init__.py", line 476, in format 
    2018-10-25 15:58:33,975 - twisted - 243 - ERROR - POST-299240-     raise e
    2018-10-25 15:58:33,977 - twisted - 243 - ERROR - POST-299240- UnicodeDecodeError: 'ascii' codec can't decode byte 0xd0 in position 46: ordinal not in range(128)
    2018-10-25 15:58:33,978 - twisted - 243 - ERROR - POST-299240- Logged from file _base.py, line 254
    2018-10-25 15:58:33,978 - twisted - 243 - ERROR - POST-299240- Traceback (most recent call last): 
    2018-10-25 15:58:33,978 - twisted - 243 - ERROR - POST-299240-   File "/usr/lib/python2.7/logging/__init__.py", line 861, in emit
    2018-10-25 15:58:33,978 - twisted - 243 - ERROR - POST-299240-     msg = self.format(record)
    2018-10-25 15:58:33,979 - twisted - 243 - ERROR - POST-299240-   File "/usr/lib/python2.7/logging/__init__.py", line 734, in format 
    2018-10-25 15:58:33,979 - twisted - 243 - ERROR - POST-299240-     return fmt.format(record)
    2018-10-25 15:58:33,979 - twisted - 243 - ERROR - POST-299240-   File "/usr/lib/python2.7/logging/__init__.py", line 476, in format 
    2018-10-25 15:58:33,979 - twisted - 243 - ERROR - POST-299240-     raise e
    2018-10-25 15:58:33,980 - twisted - 243 - ERROR - POST-299240- UnicodeDecodeError: 'ascii' codec can't decode byte 0xd0 in position 46: ordinal not in range(128)
    2018-10-25 15:58:33,980 - twisted - 243 - ERROR - POST-299240- Logged from file _base.py, line 254
    (etc)

And from #4191:

2018-11-15 12:24:45,358 - twisted - 243 - ERROR - POST-117962- Traceback (most recent call last): 
2018-11-15 12:24:45,359 - twisted - 243 - ERROR - POST-117962-   File "/usr/lib/python2.7/logging/handlers.py", line 76, in emit 
2018-11-15 12:24:45,360 - twisted - 243 - ERROR - POST-117962-     if self.shouldRollover(record):
2018-11-15 12:24:45,360 - twisted - 243 - ERROR - POST-117962-   File "/usr/lib/python2.7/logging/handlers.py", line 156, in shouldRollover
2018-11-15 12:24:45,360 - twisted - 243 - ERROR - POST-117962-     msg = "%s\n" % self.format(record)
2018-11-15 12:24:45,360 - twisted - 243 - ERROR - POST-117962-   File "/usr/lib/python2.7/logging/__init__.py", line 732, in format
2018-11-15 12:24:45,361 - twisted - 243 - ERROR - POST-117962-     return fmt.format(record)
2018-11-15 12:24:45,361 - twisted - 243 - ERROR - POST-117962-   File "/usr/lib/python2.7/logging/__init__.py", line 474, in format
2018-11-15 12:24:45,361 - twisted - 243 - ERROR - POST-117962-     s = self._fmt % record.__dict__
2018-11-15 12:24:45,361 - twisted - 243 - ERROR - POST-117962- UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 68: ordinal not in range(128)

(Note that both of the above reports are from python 2.7 instances. I don't know if it also applies to python 3.)

I think what's happening here is that we are attempting to log a non-ascii character, which then causes a logging error; the error is written to stderr, which is directed back into the logger, so we end up with a stack-overflow and high CPU usage.

However, I've been unable to reproduce the problem, and certainly non-ascii characters are normally logged without any problem. I don't really understand why self._fmt % record.__dict__ throws a UnicodeDecodeError.

Metadata

Metadata

Assignees

No one assigned

    Labels

    z-bug(Deprecated Label)

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions