[parser] Drop deep expression chains iteratively to avoid stack overflow#25480
Closed
samuelcolvin wants to merge 1 commit into
Closed
[parser] Drop deep expression chains iteratively to avoid stack overflow#25480samuelcolvin wants to merge 1 commit into
samuelcolvin wants to merge 1 commit into
Conversation
The parser builds left-associative binary chains (`a + b + c + ...`) and postfix chains (`f()()...`, `a[0][0]...`, `a.b.c...`) with iterative loops, so their depth is bounded only by the input length rather than by the recursion guard. Such a tree parses fine, but the compiler-derived drop glue recurses once per level (~150 bytes of stack each) and overflows the native stack when the tree is dropped — a denial-of-service vector for any consumer that parses untrusted source on a normally-sized stack. Give the four node types that form these spines a manual `Drop` that walks the recursive edge (`left`/`func`/`value`), detaching each node and replacing the edge with a trivial placeholder before the node drops, so no drop ever recurses and a spine of any depth is torn down in O(1) stack. A single `iter_expr_drop` helper handles all four edges in one `match`, so a chain that mixes kinds is torn down in one flat loop. Unlike a parse-time depth cap, this rejects no valid Python and keeps ty's support for large flat expressions intact. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Memory usage reportMemory usage unchanged ✅ |
|
|
This was referenced May 30, 2026
Member
|
Thank you. We can consider that if we end up deciding against #25464. For now, I'd prefer focusing the conversation on that PR (I plan to open it for review soon). |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Alternative solution to #25462 using a custom drop implementation, and thereby doesn't limit left-associative binary chain limits.
AI Summary
The parser builds left-associative binary chains (
a + b + c + ...) and postfix chains (f()()...,a[0][0]...,a.b.c...) with iterative loops, so their depth is bounded only by the input length rather than by the recursion guard. Such a tree parses fine, but the compiler-derived drop glue recurses once per level (~150 bytes of stack each) and overflows the native stack when the tree is dropped — a denial-of-service vector for any consumer that parses untrusted source on a normally-sized stack.Give the four node types that form these spines a manual
Dropthat walks the recursive edge (left/func/value), detaching each node and replacing the edge with a trivial placeholder before the node drops, so no drop ever recurses and a spine of any depth is torn down in O(1) stack. A singleiter_expr_drophelper handles all four edges in onematch, so a chain that mixes kinds is torn down in one flat loop.Unlike a parse-time depth cap, this rejects no valid Python and keeps ty's support for large flat expressions intact.