fix: repetitive tracebacks after "socket.send() raised exception"#11284
Merged
dmitryduev approved these changes on Feb 6, 2026
timoffex added a commit that referenced this pull request on Feb 7, 2026
) To reraise a saved exception, it's important to reset its traceback. I haven't heard of anyone ever hitting this particular exception, but it's good to use the correct patterns. See PR #11284.

Fixes huge traceback output after a `wandb-core` crash.

Fixes WB-31072.

The tracebacks have long repeated sections because asyncio's `StreamReader`/`StreamWriter` store an exception and re-raise it. When `wandb-core` crashes, the next communication attempt raises a `ConnectionResetError`. This usually triggers error-handling logic that attempts to send even more data to `wandb-core`, because it cannot distinguish between a `wandb-core` crash and other types of errors. Each time the exception is raised, the traceback is lengthened.

This can easily result in extremely long tracebacks (megabytes of text). Since each `raise` statement during stack unwinding mutates an exception's traceback, the traceback is lengthened by the entire call stack leading up to the `raise saved_exception` statement each time it's triggered.

This problem may be fixed in Python >= 3.11, but I haven't tested. See https://bugs.python.org/issue45924.
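The traceback growth described above can be reproduced without asyncio or `wandb-core`. The following is a minimal sketch, not the code from this PR: `reraise` and `traceback_depth_after` are hypothetical helper names, and `with_traceback(None)` is just one way to reset a saved exception's traceback before raising it again.

```python
import traceback


def reraise(exc: BaseException, reset: bool) -> None:
    """Re-raise a stored exception, optionally clearing its traceback first."""
    if reset:
        # Drop the frames accumulated by earlier raises of the same object.
        raise exc.with_traceback(None)
    raise exc


def traceback_depth_after(attempts: int, reset: bool) -> int:
    """Raise the same saved exception `attempts` times and count its frames."""
    # Stand-in for the exception that a stream object stores and re-raises.
    saved = ConnectionResetError("connection lost")
    for _ in range(attempts):
        try:
            reraise(saved, reset)
        except ConnectionResetError:
            pass  # real error handling would attempt more communication here
    return len(traceback.extract_tb(saved.__traceback__))


if __name__ == "__main__":
    print("no reset:", traceback_depth_after(5, reset=False))
    print("reset:   ", traceback_depth_after(5, reset=True))
```

Without the reset, the frame count grows with every attempt because each `raise` appends the current call stack to the same exception object. With the reset, it stays constant no matter how many times the exception is raised.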
Testing
To reproduce the issue, print and call `run.log()` in a loop, and then `pkill wandb-core`. Without the fix, this outputs many `socket.send() raised exception` lines and exception tracebacks with this repeated section: